
Compute list of dask delayed object

I have gone through all the similar questions and the solutions provided, but I am not getting the desired output.

I have a list of dask delayed objects.

var = []
for i in my_list:
    projection = Projection(self.expression, i)
    fi = projection.decode()  # returns a Delayed object; nothing is computed yet
    var.append(fi)

where Projection is a class whose decode method is decorated with @dask.delayed. Inside this method, we fit a random forest.

var is:

[Delayed('decode-82afe417-9d1e-48ff-95a3-02ddc90c6970'),
 Delayed('decode-0a872626-996a-4a19-8b45-b39acb44257f'),
 Delayed('decode-cfa53fd4-cf5b-47f1-a672-440dc5f5ca35'),
 Delayed('decode-29cf7f51-2e7a-4c9d-8ac0-bc2259d50b6f'),
 Delayed('decode-2edc8324-f9df-4402-a1ed-44a6a9067f1d'),
 Delayed('decode-05de7417-49a5-40b7-8098-f2aad50bd934'),
 Delayed('decode-80916f08-2d28-4811-9ab4-e526af978aac'),
 Delayed('decode-da4a8874-77b5-4d75-aede-c96b5e73e888'),
 Delayed('decode-1c1fe7f0-a32b-4a0a-9d13-bb45710a3738')]

Now I want to compute var and get the results as a list, array, or dataframe. To do that, I tried several options:

Option 1

dask.compute(*var)

Option 2

v = dask.array.from_array(np.array(var), chunks=(100,))
dask.array.compute(*v)

Option 3

v = dask.array.from_delayed(np.array(var))
dask.array.compute(*v)

Option 4

v = dask.array.from_delayed(np.array(var))
v.compute()

But in all cases, I either get the list of delayed objects back or the computation times out.

Thanks in advance.

Option 1 appears to be the most appropriate one. Options 3 and 4 will result in a list of delayed objects, because in those options v contains nested delayed objects.

It would help to know more details about the setup (local/distributed), data magnitude, computation intensity, and the activity on the dask dashboard.
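One possible way to gather some of that information on a local machine is sketched below. It assumes a local (non-distributed) setup and uses a hypothetical slow_task function in place of the real model fit. The synchronous scheduler runs everything in the main thread, so errors and slow steps surface like in ordinary Python code, and ProgressBar from dask.diagnostics reports progress for the local schedulers.

```python
import time

import dask
from dask.diagnostics import ProgressBar


@dask.delayed
def slow_task(i):
    # Hypothetical stand-in for an expensive model fit.
    time.sleep(0.01)
    return i * i


tasks = [slow_task(i) for i in range(5)]

# Single-threaded run: useful for debugging, since tracebacks and pdb
# behave as they would without dask.
debug_results = dask.compute(*tasks, scheduler="synchronous")

# Threaded run with a text progress bar, to see whether tasks are advancing.
with ProgressBar():
    threaded_results = dask.compute(*tasks, scheduler="threads")
```

If the synchronous run completes but is very slow, the timeout is more likely caused by the cost of each fit than by how the results are collected.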


 