Dask Array in Python
By Bestha Radha Krishna Last updated : December 21, 2024
Python Dask Array
Dask is a parallel computing library for Python. It can spread a computation across a cluster of machines, and on a single machine it can use all of the available CPU cores.
When a data set is too large to fit in memory, Dask can keep it on disk and load it in chunks for processing. It analyzes large data sets through collections modeled on the pandas DataFrame and the NumPy array.
A dask array is essentially a chunked NumPy array that can also be distributed: one large array is divided into many smaller NumPy arrays (chunks), and together these chunks form the dask array.
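As a rough sketch of the chunking described above, the snippet below wraps a NumPy array with `dask.array.from_array` and inspects how it was split. The array length (12) and chunk size (4) are arbitrary values chosen for illustration, and the `da` alias is only a common convention.

```python
import numpy as np
import dask.array as da

# A small NumPy array; the size is an illustrative assumption
data = np.arange(12)

# Wrap it in a dask array made of chunks of 4 elements each
x = da.from_array(data, chunks=4)

# Chunk sizes along each axis, and the number of chunks per axis
print(x.chunks)     # ((4, 4, 4),)
print(x.numblocks)  # (3,)

# The sum is evaluated chunk by chunk, then combined
total = x.sum().compute()
print(total)        # 66
```

Each chunk here is an ordinary NumPy array, so the chunk size determines how much data is held in memory for a single task.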
Installing Python dask library
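Install Dask with pip. This command installs only the core package; dask arrays also require NumPy, which pip install "dask[array]" pulls in automatically.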
pip install dask
Creating a dask array
The dask.array.asarray function converts its input into a dask array. It accepts lists, tuples, and NumPy arrays.
Example: Create a dask array
import dask.array as p
rk = [1, 2, 3, 4, 5]
# converts the list into dask array
d = p.asarray(rk)
print(d.compute())
# print type of d
print(type(d))
r = (1, 2, 3)
# converts the tuple into dask array
k = p.asarray(r)
print(k.compute())
# print type of k
print(type(k))
Output
[1 2 3 4 5]
<class 'dask.array.core.Array'>
[1 2 3]
<class 'dask.array.core.Array'>
Example: Convert a NumPy array to a dask array
import dask.array as p
import numpy as np
# create a numpy array
r = np.arange(5)
print(r)
# print type of numpy array
print(type(r))
# converting numpy array to dask array
d = p.asarray(r)
print(d.compute())
# print type of dask array
print(type(d))
t = np.array([1, 2, 3])
print(t)
# print type of numpy array
print(type(t))
# converting numpy array to dask array
f = p.asarray(t)
print(f.compute())
# print type of dask array
print(type(f))
Output
[0 1 2 3 4]
<class 'numpy.ndarray'>
[0 1 2 3 4]
<class 'dask.array.core.Array'>
[1 2 3]
<class 'numpy.ndarray'>
[1 2 3]
<class 'dask.array.core.Array'>
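The examples above call compute() before printing the values. A minimal sketch of why, assuming default Dask settings: operations on a dask array only describe the work to be done, and compute() is what runs it and returns a concrete NumPy result. The variable names below are illustrative.

```python
import dask.array as da

# Convert a list into a dask array
d = da.asarray([1, 2, 3, 4, 5])

# This builds a lazy expression; nothing is calculated yet
doubled_sum = (d * 2).sum()
print(isinstance(doubled_sum, da.Array))  # True

# compute() evaluates the expression and returns the value
result = doubled_sum.compute()
print(result)  # 30
```

Chaining several operations before a single compute() call lets Dask plan the whole calculation at once instead of evaluating each step separately.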