How to convert MultiIndex Pandas dataframe to Dask Dataframe

I am trying to convert a pandas dataframe that is MultiIndexed on two variables (an ID and a DateTime variable) to a Dask dataframe, but I get the following error:

"NotImplementedError: Dask does not support MultiIndex Dataframes" 

I am using the following code:

import pandas as pd
import dask.dataframe as dd

dask_df = dd.from_pandas(pandas_df)

Actually, I have over 700 pandas dataframes (each over 100 MB). I am planning to convert each one to a Dask dataframe and then append them all into one big Dask dataframe to analyze the whole dataset. I think the MultiIndex is the only issue here. Please let me know if I am going about this the wrong way.

Currently, Dask DataFrame does not support dataframes with a MultiIndex.

You might consider converting all but one of your index levels into normal columns with reset_index, so that each dataframe has a single-level index before you pass it to Dask.

