
How to correctly pass a Dask DataFrame as a parameter to a function?

When I pass a Dask DataFrame as a parameter, it is converted to a pandas DataFrame.

print(type(sellout_df))
simulate_sku_predictions(sellout_df.loc[(sellout_df['sku'] == sku) & (sellout_df['store_id'] == store)].compute(), store, sku)

Prints => <class 'dask.dataframe.core.DataFrame'>

Inside the defined function:

def simulate_sku_predictions(sellout_sku_df, store, sku):
    print(type(sellout_sku_df))

Prints => <class 'pandas.core.frame.DataFrame'>

Inside the function, I can't use compute() or other Dask functions.

I'm new to Dask, but I don't think it's appropriate to convert in the middle of the code if I don't have to.

dask.DataFrame.compute() returns a pandas DataFrame. Thus, the code is not passing a Dask DataFrame to simulate_sku_predictions. The argument,

sellout_df.loc[(sellout_df['sku'] == sku) & (sellout_df['store_id'] == store)].compute()

is evaluated to a pandas DataFrame before it is passed as an argument to simulate_sku_predictions.

If you remove the call to compute(), then sellout_df.loc[...] will be a Dask DataFrame, and you can pass that to simulate_sku_predictions.

