
pandas pivot dataframe with multiple groupby

I have a pandas dataframe with data like this:

df:

     item   day         time      data  
0   item_0  2012-12-02  00:00:01  0.81  
1   item_0  2012-12-02  00:00:02  0.07
2   item_0  2012-12-03  00:00:00  0.84  
3   item_1  2012-12-02  00:00:01  0.47

Each combination of item + day + time is unique.

I am trying to transform it to:

     item   day         time_0    time_1   time_2  
0   item_0  2012-12-02  NaN       0.81     0.07
1   item_0  2012-12-03  0.84      NaN      NaN  
2   item_1  2012-12-02  NaN       0.47     ... 

I have tried:

df_stage_1 = df.groupby(['item','day']).apply(lambda x: x['time'].tolist()).reset_index()

The code above produces a list of times for each group, but the times are not aligned starting from 00:00:00. I could inspect each list and track the indexes of the missing times (so that I can insert NaN into the value list at those indexes).

df_stage_1 = pd.DataFrame(df_stage_1[0].tolist())

The code above gives me a dataframe of (unaligned) time values, which I could align (see above) and append to the dataframe created in the previous step, but I can't work out how to get the values into the correct time-aligned columns.

You can use df.pivot_table:

res = df.pivot_table(index=['item', 'day'], columns='time',
                     values='data', aggfunc='first').reset_index()

print(res)

time    item         day  00:00:00  00:00:01  00:00:02
0     item_0  2012-12-02       NaN      0.81      0.07
1     item_0  2012-12-03      0.84       NaN       NaN
2     item_1  2012-12-02       NaN      0.47       NaN
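The question asks for columns named time_0, time_1, time_2 rather than the raw time strings. A minimal sketch of one way to get there is below. It rebuilds the sample frame from the question and renames the pivoted columns by their sorted position. This assumes that position in the sorted list of distinct times is the numbering you want; if some second never occurs in the data, the positions will not match the seconds offset. The names time_cols and renamed are made up for this illustration.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "item": ["item_0", "item_0", "item_0", "item_1"],
    "day": ["2012-12-02", "2012-12-02", "2012-12-03", "2012-12-02"],
    "time": ["00:00:01", "00:00:02", "00:00:00", "00:00:01"],
    "data": [0.81, 0.07, 0.84, 0.47],
})

res = df.pivot_table(index=["item", "day"], columns="time",
                     values="data", aggfunc="first").reset_index()

# Assumption: number the time columns by their position in sorted order.
time_cols = [c for c in res.columns if c not in ("item", "day")]
renamed = res.rename(columns={c: f"time_{i}" for i, c in enumerate(time_cols)})

# Drop the leftover "time" label on the column axis.
renamed.columns.name = None

print(renamed)
```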

Another solution is to chain set_index, unstack and reset_index:

df.set_index(['item', 'day', 'time'])['data'].unstack().reset_index()

time    item         day  00:00:00  00:00:01  00:00:02
0     item_0  2012-12-02       NaN      0.81      0.07
1     item_0  2012-12-03      0.84       NaN       NaN
2     item_1  2012-12-02       NaN      0.47       NaN
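The two approaches give the same table here because item + day + time is unique. As a hedged sketch of where they differ: if a duplicate combination sneaks in, unstack raises a ValueError, while pivot_table with aggfunc='first' silently keeps the first value. The duplicated frame dup and the flag unstack_failed are invented for this illustration.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "item": ["item_0", "item_0", "item_0", "item_1"],
    "day": ["2012-12-02", "2012-12-02", "2012-12-03", "2012-12-02"],
    "time": ["00:00:01", "00:00:02", "00:00:00", "00:00:01"],
    "data": [0.81, 0.07, 0.84, 0.47],
})

via_pivot = df.pivot_table(index=["item", "day"], columns="time",
                           values="data", aggfunc="first").reset_index()
via_unstack = df.set_index(["item", "day", "time"])["data"].unstack().reset_index()

# Hypothetical frame with the first row repeated, so one key is duplicated.
dup = pd.concat([df, df.iloc[[0]]], ignore_index=True)

# pivot_table aggregates the duplicate away.
dup_pivot = dup.pivot_table(index=["item", "day"], columns="time",
                            values="data", aggfunc="first").reset_index()

# unstack cannot reshape an index that contains duplicate entries.
try:
    dup.set_index(["item", "day", "time"])["data"].unstack()
    unstack_failed = False
except ValueError:
    unstack_failed = True
```

So if uniqueness is guaranteed, either form works; if it is not, the choice depends on whether you would rather get an error or an aggregated value.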

Remember that unstack in pandas operates on the index: by default it takes the innermost index level (here, time) and pivots it into the columns.
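To make the "innermost level" behaviour concrete, here is a small sketch on the question's data. The default call moves time into the columns, and passing level moves a different index level instead. The variable names by_time and by_day are only for illustration.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "item": ["item_0", "item_0", "item_0", "item_1"],
    "day": ["2012-12-02", "2012-12-02", "2012-12-03", "2012-12-02"],
    "time": ["00:00:01", "00:00:02", "00:00:00", "00:00:01"],
    "data": [0.81, 0.07, 0.84, 0.47],
})

# A Series indexed by three levels: item, day, time.
s = df.set_index(["item", "day", "time"])["data"]

# Default (level=-1): the innermost level, time, becomes the columns.
by_time = s.unstack()

# Naming a level explicitly pivots that level instead.
by_day = s.unstack(level="day")

print(by_time)
print(by_day)
```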

