
How to split the index of a pivot table in Python?

I know pivot_table takes three parameters that matter here: index, columns and fill_value.

df = pd.pivot_table(df,index='userID',columns='days',fill_value=0)    # Fill 0

I can't pivot my whole dataframe in one go because I run out of memory.

So is it possible to split the index into smaller parts, pivot each part, and then merge those pivot tables together to work around this?

For example, if userID is in range(0, 1000000), I want to cut it into 3 parts: [0, 333333), [333333, 666666) and [666666, 1000000), and then combine the 3 results into one pivot table.

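Yes. Filter each half-open userID range, pivot it, then concatenate the pieces. The final fillna(0) covers any days value that is missing from one chunk:

edges = [0, 333333, 666666, 1000000]
df_out = pd.concat([df.query('@lo <= userID < @hi').pivot_table(index='userID',
                   columns='days', fill_value=0)
                   for lo, hi in zip(edges[:-1], edges[1:])]).fillna(0)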
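
To make the range-based idea concrete, here is a minimal sketch on toy data. The `clicks` value column, the helper name `pivot_by_ranges` and the small `edges` list are all assumptions made for illustration; they are not from the question. The sketch uses boolean masks instead of query, and it assumes the ranges are half-open so that no user falls into two chunks.

```python
import pandas as pd

# Toy data; the `clicks` column is a made-up value column for illustration.
df = pd.DataFrame({
    "userID": [0, 0, 1, 4, 5, 5, 8],
    "days":   [1, 2, 1, 3, 1, 1, 2],
    "clicks": [3, 1, 4, 1, 5, 9, 2],
})

def pivot_by_ranges(frame, edges):
    """Pivot each half-open range [lo, hi) of userID, then stack the pieces."""
    pieces = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        part = frame[(frame["userID"] >= lo) & (frame["userID"] < hi)]
        if part.empty:
            continue  # nothing to pivot for this range
        pieces.append(
            part.pivot_table(index="userID", columns="days",
                             values="clicks", fill_value=0)
        )
    # A chunk that never saw some `days` value gets NaN there after concat,
    # so fill those gaps and put the columns in a stable order.
    return pd.concat(pieces).fillna(0).sort_index(axis=1)

df_out = pivot_by_ranges(df, edges=[0, 3, 6, 9])
print(df_out)
```

Note that the concatenated result still has to fit in memory; chunking only lowers the peak cost of the individual pivot steps.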

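Alternatively, use np.array_split on the unique user IDs, so that all rows of a given user end up in the same chunk:

pd.concat([df[df['userID'].isin(ids)].pivot_table(index='userID',
        columns='days', fill_value=0)
        for ids in np.array_split(df['userID'].unique(), 3)]).fillna(0)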
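
One way to sanity-check a chunked pivot is to compare it with the one-shot pivot on data small enough to fit in memory. The following is a sketch under assumptions: the synthetic data, the `clicks` value column and the helper name `pivot_in_chunks` are hypothetical. It splits the unique user IDs (not the rows) with np.array_split, so every user is pivoted exactly once.

```python
import numpy as np
import pandas as pd

# Synthetic data; `clicks` is a hypothetical value column.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "userID": rng.integers(0, 50, size=500),
    "days":   rng.integers(1, 8, size=500),
    "clicks": rng.integers(0, 10, size=500),
})

def pivot_in_chunks(frame, n_chunks):
    """Pivot groups of whole users so no userID is split across chunks."""
    id_groups = np.array_split(np.sort(frame["userID"].unique()), n_chunks)
    pieces = [
        frame[frame["userID"].isin(ids)].pivot_table(
            index="userID", columns="days", values="clicks", fill_value=0)
        for ids in id_groups if len(ids) > 0
    ]
    # Fill cells for `days` values a chunk never saw, then order the columns.
    return pd.concat(pieces).fillna(0).sort_index(axis=1)

chunked = pivot_in_chunks(df, n_chunks=3)
direct = df.pivot_table(index="userID", columns="days",
                        values="clicks", fill_value=0)

# The two results should agree cell for cell (dtypes may differ).
same = np.allclose(chunked.to_numpy(dtype=float), direct.to_numpy(dtype=float))
print(same)
```

Splitting the rows directly, without looking at userID, could put one user's rows into two chunks and produce duplicate index entries after concatenation, which is why the sketch splits the IDs instead.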

