
Groupby, pivot and return all columns in a pandas dataframe

I have a pandas dataframe which looks like this:

    col1  col2  col3  col4  col5     status  count
0   AA    PP    X     Y     13.1764  0       1.0
1   AA    PP    X     Y     12.145   0       1.0
2   AA    PP    X     Y     13.17    0       2.0
3   AA    PP    X     Y     23.5     0       2.0
4   AA    PP    X     Y     1100.4   0       2.0
5   AA    PP    X     Y     20.5     0       3.0
6   AA    PP    X     Y     1300.0   0       3.0
...

What I am trying to do:

  1. Group by col1
  2. Then group by count
  3. Flatten the col5 values and append to everything else

The final dataframe should look like this:

AA          col2  col3  col4  status  count1   count2  count3
count  1.0  PP    X     Y     0       13.1764  12.145  NA
       2.0  PP    X     Y     0       13.17    23.5    1100.4
       3.0  PP    X     Y     0       20.5     1300.0  NA

I have seen a lot of groupby and pivot questions and, trust me, I have tried a lot and wasted an hour, but I couldn't get it right.

If all the other columns hold the same values within each group, as in the sample data, use GroupBy.cumcount with pivot_table:

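# Number the rows within each 'count' group: 0, 1, 2, ...
g = df.groupby('count').cumcount()
# Spread col5 across one column per position; the remaining columns form the index
df1 = (df.pivot_table(index=['col1','count','col2','col3','col4','status'],
                      columns=g,
                      values='col5')
         .add_prefix('count')
         .reset_index())
print(df1)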
  col1  count col2 col3 col4  status   count0    count1  count2
0   AA    1.0   PP    X    Y       0  13.1764    12.145     NaN
1   AA    2.0   PP    X    Y       0  13.1700    23.500  1100.4
2   AA    3.0   PP    X    Y       0  20.5000  1300.000     NaN
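
The output above numbers the new columns from count0, whereas the question asked for count1 to count3, and the counter is computed per count value only. The following is a minimal, self-contained sketch of one way to adjust that, assuming the seven sample rows are representative. The names id_cols, pos and wide are illustrative, and aggfunc="first" is a choice made here (the snippet above relies on pivot_table's default mean, which gives the same result when each cell holds a single value).

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "col1": ["AA"] * 7,
    "col2": ["PP"] * 7,
    "col3": ["X"] * 7,
    "col4": ["Y"] * 7,
    "col5": [13.1764, 12.145, 13.17, 23.5, 1100.4, 20.5, 1300.0],
    "status": [0] * 7,
    "count": [1.0, 1.0, 2.0, 2.0, 2.0, 3.0, 3.0],
})

# Columns that identify one output row (illustrative name)
id_cols = ["col1", "count", "col2", "col3", "col4", "status"]

# Position of each row within its (col1, count) group, starting at 1
# so that the new columns are named count1, count2, ...
pos = df.groupby(["col1", "count"]).cumcount() + 1

wide = (
    df.assign(pos=pos)
      .pivot_table(index=id_cols, columns="pos", values="col5", aggfunc="first")
      .add_prefix("count")          # 1, 2, 3 -> count1, count2, count3
      .rename_axis(columns=None)    # drop the leftover "pos" columns label
      .reset_index()
)
print(wide)
```

Grouping the counter by both col1 and count keeps the numbering separate for each col1 value, which only matters if the real data contains more than one col1 value.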

