How to pivot a table in pandas on multiple columns?
+----+----------------+----------+---------------------+
| id | characteristic | location | value |
+----+----------------+----------+---------------------+
| 1 | start | loc1 | 01/01/2020 00:00:00 |
+----+----------------+----------+---------------------+
| 1 | end | loc1 | 01/01/2020 00:00:20 |
+----+----------------+----------+---------------------+
| 1 | start | loc2 | 01/01/2020 00:00:20 |
+----+----------------+----------+---------------------+
| 1 | end | loc2 | 01/01/2020 00:00:40 |
+----+----------------+----------+---------------------+
| 2 | start | loc1 | 01/01/2020 00:00:40 |
+----+----------------+----------+---------------------+
| 2 | end | loc1 | 01/01/2020 00:01:00 |
+----+----------------+----------+---------------------+
I have the table above, and I want to convert it into the form below:
+----+---------------------+---------------------+----------+
| id | start | end | location |
+----+---------------------+---------------------+----------+
| 1 | 01/01/2020 00:00:00 | 01/01/2020 00:00:20 | loc1 |
+----+---------------------+---------------------+----------+
| 1 | 01/01/2020 00:00:20 | 01/01/2020 00:00:40 | loc2 |
+----+---------------------+---------------------+----------+
| 2 | 01/01/2020 00:00:40 | 01/01/2020 00:01:00 | loc1 |
+----+---------------------+---------------------+----------+
Please advise how you would solve this. Thanks!
You can use the pivot_table function.
pd.pivot_table(df, values='value', index=['id', 'location'], columns=['characteristic'], aggfunc='first')
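For anyone who wants to try this end to end, here is a minimal sketch that rebuilds the table from the question and applies the same pivot_table call. The DataFrame construction, the reset_index / rename_axis cleanup and the final column reordering are my additions to match the requested layout, not part of the answer itself.

```python
import pandas as pd

# Sample data copied from the question; values are kept as plain strings.
df = pd.DataFrame({
    "id": [1, 1, 1, 1, 2, 2],
    "characteristic": ["start", "end", "start", "end", "start", "end"],
    "location": ["loc1", "loc1", "loc2", "loc2", "loc1", "loc1"],
    "value": [
        "01/01/2020 00:00:00", "01/01/2020 00:00:20",
        "01/01/2020 00:00:20", "01/01/2020 00:00:40",
        "01/01/2020 00:00:40", "01/01/2020 00:01:00",
    ],
})

result = (
    pd.pivot_table(df, values="value", index=["id", "location"],
                   columns=["characteristic"], aggfunc="first")
    .reset_index()              # turn id/location back into ordinary columns
    .rename_axis(columns=None)  # drop the leftover "characteristic" axis label
    [["id", "start", "end", "location"]]  # column order asked for in the question
)
print(result)
```

Note that aggfunc='first' keeps only one value per (id, location, characteristic) combination, so this assumes each combination occurs at most once.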
We need to create a helper key with cumcount; after that, this becomes a plain pivot problem.
df['helpkey'] = df.groupby(['id', 'characteristic']).cumcount()
s = df.set_index(['id', 'location', 'helpkey', 'characteristic'])['value'].unstack(level=3).reset_index().drop(columns='helpkey')
s
characteristic id location end start
0 1 loc1 01/01/2020 00:00:20 01/01/2020 00:00:00
1 1 loc2 01/01/2020 00:00:40 01/01/2020 00:00:20
2 2 loc1 01/01/2020 00:01:00 01/01/2020 00:00:40
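To illustrate why the helper key can matter, here is a small sketch with hypothetical data (not from the question) in which id 1 visits loc1 twice. Without a helper key, (id, location, characteristic) would not be unique and the rows could not be unstacked directly. The drop(columns=...) spelling and the rename_axis cleanup are my choices.

```python
import pandas as pd

# Hypothetical data: id 1 visits loc1 twice, so (id, location) is not unique.
df = pd.DataFrame({
    "id": [1, 1, 1, 1],
    "characteristic": ["start", "end", "start", "end"],
    "location": ["loc1", "loc1", "loc1", "loc1"],
    "value": [
        "01/01/2020 00:00:00", "01/01/2020 00:00:20",
        "01/01/2020 00:00:40", "01/01/2020 00:01:00",
    ],
})

# Number each start/end occurrence within its id: 0 for the first, 1 for the second, ...
df["helpkey"] = df.groupby(["id", "characteristic"]).cumcount()

wide = (
    df.set_index(["id", "location", "helpkey", "characteristic"])["value"]
    .unstack("characteristic")   # one column per characteristic value
    .reset_index()
    .drop(columns="helpkey")     # the helper key is no longer needed
    .rename_axis(columns=None)
)
print(wide)
```

This relies on the rows being ordered so that the n-th start and the n-th end of an id belong to the same visit.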
You can use groupby and unstack to solve this. The sum method is there because some aggregation has to happen between groupby and unstack:
(df.groupby(['id', 'location', 'characteristic'])['value']
   .sum()                      # other aggregation methods such as min or max could also work here
   .unstack('characteristic')  # creates one column per characteristic value and groups the rows
   .reset_index())
# id location end start
#0 1 loc1 2 1
#1 1 loc2 4 3
#2 2 loc1 6 5
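The output above uses small integers in place of the timestamps. As a rough sketch of the same groupby / unstack idea on the question's actual data, the version below first parses the value column into datetimes and uses first() rather than sum(), since adding timestamps together is not meaningful. The day/month/year format string is an assumption on my part (for this sample either reading gives the same dates).

```python
import pandas as pd

df = pd.DataFrame({
    "id": [1, 1, 1, 1, 2, 2],
    "characteristic": ["start", "end", "start", "end", "start", "end"],
    "location": ["loc1", "loc1", "loc2", "loc2", "loc1", "loc1"],
    "value": [
        "01/01/2020 00:00:00", "01/01/2020 00:00:20",
        "01/01/2020 00:00:20", "01/01/2020 00:00:40",
        "01/01/2020 00:00:40", "01/01/2020 00:01:00",
    ],
})

# Assumption: the strings are day/month/year followed by a 24-hour time.
df["value"] = pd.to_datetime(df["value"], format="%d/%m/%Y %H:%M:%S")

result = (
    df.groupby(["id", "location", "characteristic"])["value"]
    .first()                     # one value per group; min or max would also work
    .unstack("characteristic")   # one column per characteristic value
    .reset_index()
    .rename_axis(columns=None)
)
print(result[["id", "start", "end", "location"]])
```

With real datetimes in place, something like result['end'] - result['start'] would give the duration of each interval.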