
dask.compute all values of Dask DataFrame type that are stored as values in a dictionary

I understand that if I store many Dask dataframes in a list, I can compute all of them in parallel with:

result = dask.compute(*container_list)

But how would I do something similar if I store the Dask dataframe results as values in a dictionary? (If container_dict is a dictionary,

result = dask.compute(*container_dict) 

would not work.)
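A quick illustration of why the starred call fails, in plain Python: unpacking a dictionary with * iterates over its keys, so the function never sees the values. The helper show_args and the sample dictionary below are hypothetical and used only for this sketch.

```python
def show_args(*args):
    """Return the positional arguments exactly as the callee receives them."""
    return args


# Hypothetical stand-in: the values would be Dask dataframes in the real case.
container_dict = {"a": "ddf_a", "b": "ddf_b"}

# Star-unpacking a dict passes its keys, not its values.
received_from_dict = show_args(*container_dict)

# Unpacking .values() passes the objects that actually need computing.
received_from_values = show_args(*container_dict.values())
```

The same applies to dask.compute(*container_dict): it would receive only the keys, which are not Dask collections.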

The best I could do was loop over the dictionary and compute each value in turn, but this is not ideal since we are now running compute multiple times rather than once.

container_dict = {}
for index, value in enumerate(comb_dict_stock):
    container_dict[index] = ddf.loc[index] # index ddf to get the row for index and value in dict

# compute all the dask dataframes in container_dict
for key, value in container_dict.items():
    container_dict[key] = value.compute()

dask.compute can accept a dictionary and evaluate only the dask objects inside:

from dask import compute
from dask.datasets import timeseries
test = {'a': timeseries(freq='1h'), 'b': 123}
result, = compute(test)
print(type(result))
# <class 'dict'>

Note that compute returns a tuple of results, one per argument, so to store just the dictionary of interest use tuple unpacking (the trailing comma in "result, = compute(test)").
