How to pivot a table in pandas on multiple columns?

+----+----------------+----------+---------------------+
| id | characteristic | location | value               |
+----+----------------+----------+---------------------+
| 1  | start          | loc1     | 01/01/2020 00:00:00 |
+----+----------------+----------+---------------------+
| 1  | end            | loc1     | 01/01/2020 00:00:20 |
+----+----------------+----------+---------------------+
| 1  | start          | loc2     | 01/01/2020 00:00:20 |
+----+----------------+----------+---------------------+
| 1  | end            | loc2     | 01/01/2020 00:00:40 |
+----+----------------+----------+---------------------+
| 2  | start          | loc1     | 01/01/2020 00:00:40 |
+----+----------------+----------+---------------------+
| 2  | end            | loc1     | 01/01/2020 00:01:00 |
+----+----------------+----------+---------------------+

I have the table above and would like to convert it to something like the one below:

+----+---------------------+---------------------+----------+
| id | start               | end                 | location |
+----+---------------------+---------------------+----------+
| 1  | 01/01/2020 00:00:00 | 01/01/2020 00:00:20 | loc1     |
+----+---------------------+---------------------+----------+
| 1  | 01/01/2020 00:00:20 | 01/01/2020 00:00:40 | loc2     |
+----+---------------------+---------------------+----------+
| 2  | 01/01/2020 00:00:40 | 01/01/2020 00:01:00 | loc1     |
+----+---------------------+---------------------+----------+

Please advise on how you would solve this. Thank you!

You can use the pivot_table function.

pd.pivot_table(df, values='value', index=['id', 'location'], columns=['characteristic'], aggfunc='first')
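As a minimal sketch of the pivot_table approach, the snippet below rebuilds the question's sample table by hand and reshapes it into the requested layout. The names `df` and `wide`, the `reset_index` / `rename_axis` cleanup, and the final column reordering are illustrative additions, not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question; the timestamps are kept as plain strings.
df = pd.DataFrame({
    "id": [1, 1, 1, 1, 2, 2],
    "characteristic": ["start", "end", "start", "end", "start", "end"],
    "location": ["loc1", "loc1", "loc2", "loc2", "loc1", "loc1"],
    "value": [
        "01/01/2020 00:00:00", "01/01/2020 00:00:20",
        "01/01/2020 00:00:20", "01/01/2020 00:00:40",
        "01/01/2020 00:00:40", "01/01/2020 00:01:00",
    ],
})

wide = (
    pd.pivot_table(df, values="value", index=["id", "location"],
                   columns="characteristic", aggfunc="first")
    .reset_index()                          # turn id/location back into columns
    .rename_axis(columns=None)              # drop the leftover "characteristic" axis name
    [["id", "start", "end", "location"]]    # column order requested in the question
)
print(wide)
```

aggfunc='first' is used because the values are strings, so a numeric aggregation such as the default mean would not apply.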

We need to use cumcount to create a helper key; after that, this becomes a plain pivot problem.

df['helpkey']=df.groupby(['id','characteristic']).cumcount()


s=df.set_index(['id','location','helpkey','characteristic'])['value'].unstack(level=3).reset_index().drop(columns='helpkey')

s
characteristic  id    location        end                    start          
0                1   loc1        01/01/2020 00:00:20    01/01/2020 00:00:00 
1                1   loc2        01/01/2020 00:00:40    01/01/2020 00:00:20 
2                2   loc1        01/01/2020 00:01:00    01/01/2020 00:00:40 
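The helper key matters once the same id visits the same location more than once. The sketch below uses made-up data (not from the question) in which id 1 has two start/end pairs at loc1. The names `visits` and `collapsed` are illustrative. With the cumcount key both visits survive, whereas a pivot_table with aggfunc='first' keeps only the first pair.

```python
import pandas as pd

# Hypothetical data: id 1 is at loc1 twice.
df = pd.DataFrame({
    "id": [1, 1, 1, 1],
    "characteristic": ["start", "end", "start", "end"],
    "location": ["loc1", "loc1", "loc1", "loc1"],
    "value": ["01/01/2020 00:00:00", "01/01/2020 00:00:20",
              "01/01/2020 00:01:00", "01/01/2020 00:01:20"],
})

# 0 for the first start/end of each id, 1 for the second, and so on.
df["helpkey"] = df.groupby(["id", "characteristic"]).cumcount()

visits = (
    df.set_index(["id", "location", "helpkey", "characteristic"])["value"]
      .unstack("characteristic")
      .reset_index()
      .drop(columns="helpkey")
      .rename_axis(columns=None)
)

# For comparison: without the helper key, 'first' collapses both visits into one row.
collapsed = pd.pivot_table(df, values="value", index=["id", "location"],
                           columns="characteristic", aggfunc="first")

print(len(visits), len(collapsed))
```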

You can use groupby and unstack to solve this:

The sum call is only there because some aggregation is needed between the groupby and the unstack.

(
    df.groupby(['id', 'location', 'characteristic'])['value']
      .sum()                      # other aggregation methods such as min or max could also work here
      .unstack('characteristic')  # creates one column per characteristic value and groups the rows
      .reset_index()
)
# Output (the values appear to be placeholder integers rather than the timestamps from the question):
#   id  location    end start
#0  1   loc1    2   1
#1  1   loc2    4   3
#2  2   loc1    6   5
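A variant of the groupby/unstack idea, sketched on the question's actual data with the value column parsed as datetimes. Here first replaces sum, since summing is not a meaningful operation on timestamps. The month/day/year format string is an assumption (the sample dates do not disambiguate it), and the name `result` is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "id": [1, 1, 1, 1, 2, 2],
    "characteristic": ["start", "end", "start", "end", "start", "end"],
    "location": ["loc1", "loc1", "loc2", "loc2", "loc1", "loc1"],
    "value": [
        "01/01/2020 00:00:00", "01/01/2020 00:00:20",
        "01/01/2020 00:00:20", "01/01/2020 00:00:40",
        "01/01/2020 00:00:40", "01/01/2020 00:01:00",
    ],
})

# Assumed format: month/day/year hour:minute:second.
df["value"] = pd.to_datetime(df["value"], format="%m/%d/%Y %H:%M:%S")

result = (
    df.groupby(["id", "location", "characteristic"])["value"]
      .first()                      # each group holds exactly one value here
      .unstack("characteristic")    # one column per characteristic: end, start
      .reset_index()
      .rename_axis(columns=None)
)
print(result)
```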
