How to separate index in pivot table in python?

Question

I know pivot_table has three main parameters: index, columns and fill_value.

df = pd.pivot_table(df, index='userID', columns='days', fill_value=0)    # fill missing cells with 0

I can't pivot my dataframe because it runs out of memory.

So is it possible to split the index into smaller parts, pivot each part, and then merge those pivot tables together to solve this problem?

For example, if userID is in range(0,1000000), I want to cut it into 3 parts: (0,333333), (333333,666666) and (666666,1000000). Then combine these 3 into one pivot table.
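Splitting the index lowers the peak memory of each individual pivot call, but the combined table has the same shape as the one a single call would produce, so it helps to check first whether the final result can fit at all. A rough sketch of that estimate follows; the helper name `estimate_pivot_bytes`, the `value` column, and the 8 bytes per cell (float64) are assumptions made for illustration, not something stated in the question.

```python
import pandas as pd

def estimate_pivot_bytes(df, index="userID", columns="days", itemsize=8):
    """Rough dense size: rows x column labels x value columns x bytes per cell."""
    n_rows = df[index].nunique()        # one row per distinct user
    n_cols = df[columns].nunique()      # one column per distinct day
    # Every remaining column is treated as a value column (an approximation).
    n_values = len(df.columns.difference([index, columns]))
    return n_rows * n_cols * n_values * itemsize

# Hypothetical toy data: 3 users, 3 distinct days, 1 value column.
toy = pd.DataFrame({
    "userID": [0, 0, 1, 2],
    "days":   [1, 2, 1, 3],
    "value":  [1.0, 2.0, 3.0, 4.0],
})
print(estimate_pivot_bytes(toy))  # 3 * 3 * 1 * 8 = 72
```

If the estimate for the real data is already larger than the available memory, chunking alone will not be enough, because the concatenated result is still dense.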

Answer 1

Yes, you can do something like this:

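bounds = [0, 333333, 666666, 1000000]  # each chunk covers lo <= userID < hi, so chunks do not overlap
df_out = pd.concat([df.query('@lo <= userID < @hi').pivot_table(index='userID',
                   columns='days', fill_value=0) for lo, hi in zip(bounds, bounds[1:])])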
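For reference, here is a small self-contained sketch of the range-based idea, checked against a single full pivot on toy data. The function name `pivot_in_ranges`, the `value` column and the bounds are hypothetical. It assumes each range is half-open, `[lo, hi)`, so no user falls into two chunks. It also fills again after concatenating, since a chunk that never saw a particular day has no column for it and `pd.concat` leaves NaN there.

```python
import pandas as pd

def pivot_in_ranges(df, bounds, index="userID", columns="days"):
    """Pivot each half-open range [lo, hi) of `index` separately, then stack the pieces."""
    pieces = []
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        chunk = df[(df[index] >= lo) & (df[index] < hi)]
        if chunk.empty:
            continue  # nothing to pivot in this range
        pieces.append(chunk.pivot_table(index=index, columns=columns, fill_value=0))
    # Chunks can have different column sets; fill the gaps and order the columns.
    return pd.concat(pieces).fillna(0).sort_index(axis=1)

# Hypothetical toy data standing in for the real frame.
toy = pd.DataFrame({
    "userID": [0, 0, 1, 5, 5, 9],
    "days":   [1, 2, 1, 2, 3, 1],
    "value":  [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
})

chunked = pivot_in_ranges(toy, bounds=[0, 4, 8, 10])
full = toy.pivot_table(index="userID", columns="days", fill_value=0)
print(chunked)
```

On this toy frame the chunked result has the same rows, columns and values as `full`; only the peak memory of each pivot call differs.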

Answer 2

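By using np.array_split on the unique userID values, so that every row for a given user lands in the same chunk:

pd.concat([df[df['userID'].isin(ids)].pivot_table(index='userID',
        columns='days', fill_value=0) for ids in np.array_split(df['userID'].unique(), 3)])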
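One thing to watch when chunking with np.array_split is that a split over rows can put rows of the same user into different chunks, which leaves that user on more than one row after concatenation. A minimal sketch that splits the distinct IDs rather than the rows is below. The names `pivot_in_id_chunks` and `value` are hypothetical, and the toy rows are deliberately shuffled so that a row-wise split would separate them.

```python
import numpy as np
import pandas as pd

def pivot_in_id_chunks(df, n_chunks, index="userID", columns="days"):
    """Split the distinct IDs into n_chunks groups and pivot each group on its own."""
    ids = np.sort(df[index].unique())
    pieces = []
    for id_chunk in np.array_split(ids, n_chunks):
        if len(id_chunk) == 0:
            continue  # more chunks requested than distinct IDs
        part = df[df[index].isin(id_chunk)]
        pieces.append(part.pivot_table(index=index, columns=columns, fill_value=0))
    # Fill cells for days a chunk never saw, then put columns in a stable order.
    return pd.concat(pieces).fillna(0).sort_index(axis=1)

# Hypothetical toy data; rows for one user are not adjacent.
toy = pd.DataFrame({
    "userID": [3, 1, 2, 1, 3, 2],
    "days":   [1, 1, 2, 2, 3, 1],
    "value":  [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
})

result = pivot_in_id_chunks(toy, n_chunks=3)
full = toy.pivot_table(index="userID", columns="days", fill_value=0)
print(result)
```

Because each ID belongs to exactly one chunk, the concatenated index stays unique and matches the index of a single full pivot.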

solution1
0 ACCEPTED 2017-11-07 18:32:54

solution2
0 2017-11-07 18:36:07
