How to pivot a table in pandas on multiple columns?

Question

+----+----------------+----------+---------------------+
| id | characteristic | location | value               |
+----+----------------+----------+---------------------+
| 1  | start          | loc1     | 01/01/2020 00:00:00 |
+----+----------------+----------+---------------------+
| 1  | end            | loc1     | 01/01/2020 00:00:20 |
+----+----------------+----------+---------------------+
| 1  | start          | loc2     | 01/01/2020 00:00:20 |
+----+----------------+----------+---------------------+
| 1  | end            | loc2     | 01/01/2020 00:00:40 |
+----+----------------+----------+---------------------+
| 2  | start          | loc1     | 01/01/2020 00:00:40 |
+----+----------------+----------+---------------------+
| 2  | end            | loc1     | 01/01/2020 00:01:00 |
+----+----------------+----------+---------------------+

I have the table above and want to convert it into the form below:

+----+---------------------+---------------------+----------+
| id | start               | end                 | location |
+----+---------------------+---------------------+----------+
| 1  | 01/01/2020 00:00:00 | 01/01/2020 00:00:20 | loc1     |
+----+---------------------+---------------------+----------+
| 1  | 01/01/2020 00:00:20 | 01/01/2020 00:00:40 | loc2     |
+----+---------------------+---------------------+----------+
| 2  | 01/01/2020 00:00:40 | 01/01/2020 00:01:00 | loc1     |
+----+---------------------+---------------------+----------+

Please advise how you would solve this. Thanks!

Answer 1

You can use the pivot_table function.

pd.pivot_table(df, values='value', index=['id', 'location'], columns=['characteristic'], aggfunc='first')
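For anyone who wants to run this end to end, here is a minimal sketch. The DataFrame is rebuilt by hand from the table in the question, and the `reset_index` call, the removal of the column-axis name, and the final column reordering are my additions to match the requested layout; they are not part of the one-liner above.

```python
import pandas as pd

# Sample data copied from the question; values are kept as plain strings.
df = pd.DataFrame({
    "id": [1, 1, 1, 1, 2, 2],
    "characteristic": ["start", "end", "start", "end", "start", "end"],
    "location": ["loc1", "loc1", "loc2", "loc2", "loc1", "loc1"],
    "value": [
        "01/01/2020 00:00:00", "01/01/2020 00:00:20",
        "01/01/2020 00:00:20", "01/01/2020 00:00:40",
        "01/01/2020 00:00:40", "01/01/2020 00:01:00",
    ],
})

# aggfunc='first' is needed because the default (mean) cannot handle strings.
wide = pd.pivot_table(
    df,
    values="value",
    index=["id", "location"],
    columns="characteristic",
    aggfunc="first",
).reset_index()

# Drop the leftover 'characteristic' label on the column axis
# and put the columns in the order the question asks for.
wide.columns.name = None
wide = wide[["id", "start", "end", "location"]]

print(wide)
```

Note that this only pairs rows correctly when each (id, location) combination has exactly one start and one end; with repeated visits, `first` would silently keep only the first pair.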

Answer 2

We need to create a helper key with cumcount; after that it is an ordinary pivot problem.

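df['helpkey'] = df.groupby(['id', 'characteristic']).cumcount()

# drop(columns=...) is used because the positional form drop('helpkey', 1) was removed in pandas 2.0
s = (df.set_index(['id', 'location', 'helpkey', 'characteristic'])['value']
       .unstack(level=3)
       .reset_index()
       .drop(columns='helpkey'))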

s
characteristic  id    location        end                    start          
0                1   loc1        01/01/2020 00:00:20    01/01/2020 00:00:00 
1                1   loc2        01/01/2020 00:00:40    01/01/2020 00:00:20 
2                2   loc1        01/01/2020 00:01:00    01/01/2020 00:00:40
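To show what the helper key buys you, here is a small sketch with made-up data (not from the question) in which id 1 visits loc1 twice. The shortened time strings and the extra `sort_values` step that restores visit order are my own assumptions for illustration.

```python
import pandas as pd

# Hypothetical data: id 1 visits loc1, then loc2, then loc1 again.
df = pd.DataFrame({
    "id": [1, 1, 1, 1, 1, 1],
    "characteristic": ["start", "end", "start", "end", "start", "end"],
    "location": ["loc1", "loc1", "loc2", "loc2", "loc1", "loc1"],
    "value": ["00:00:00", "00:00:20", "00:00:20", "00:00:40", "00:00:40", "00:01:00"],
})

# Without a helper key the index has duplicate entries and unstack refuses to reshape.
raised = False
try:
    df.set_index(["id", "location", "characteristic"])["value"].unstack("characteristic")
except ValueError:
    raised = True

# cumcount numbers the n-th start and the n-th end of each id with the same key,
# so every start/end pair gets a unique index entry.
df["helpkey"] = df.groupby(["id", "characteristic"]).cumcount()

wide = (
    df.set_index(["id", "location", "helpkey", "characteristic"])["value"]
    .unstack("characteristic")
    .reset_index()
    .sort_values(["id", "helpkey"])   # restore the original visit order
    .drop(columns="helpkey")
    .reset_index(drop=True)
)
wide.columns.name = None

print(wide)
```

The second loc1 visit survives as its own row here, which a plain pivot on (id, location) could not do.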

Answer 3

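You can solve this with groupby and unstack:

The sum call is there because some aggregation is needed between groupby and unstack; each (id, location, characteristic) group holds a single value, so the aggregation leaves it unchanged.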

(
    df.groupby(['id', 'location', 'characteristic'])['value']
      .sum()                      # other aggregations such as first, min or max also work here
      .unstack('characteristic')  # one column per characteristic value, one row per (id, location)
      .reset_index()
)
# Output (apparently produced with the integers 1 to 6 standing in for the timestamps):
#   id  location    end start
#0  1   loc1    2   1
#1  1   loc2    4   3
#2  2   loc1    6   5
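As a rough check of the claim that the choice of aggregation does not matter here, the sketch below runs the same groupby/unstack with first, min and max on the question's data. Parsing the strings with a day/month/year format is an assumption on my part (01/01/2020 is ambiguous), and `reshape` is just a hypothetical helper name.

```python
import pandas as pd

df = pd.DataFrame({
    "id": [1, 1, 1, 1, 2, 2],
    "characteristic": ["start", "end", "start", "end", "start", "end"],
    "location": ["loc1", "loc1", "loc2", "loc2", "loc1", "loc1"],
    "value": [
        "01/01/2020 00:00:00", "01/01/2020 00:00:20",
        "01/01/2020 00:00:20", "01/01/2020 00:00:40",
        "01/01/2020 00:00:40", "01/01/2020 00:01:00",
    ],
})

# Assumption: the dates are day/month/year. Real timestamps make min/max
# compare chronologically instead of as text.
df["value"] = pd.to_datetime(df["value"], format="%d/%m/%Y %H:%M:%S")


def reshape(agg):
    """Group, aggregate with the given method name, and spread characteristic into columns."""
    out = (
        df.groupby(["id", "location", "characteristic"])["value"]
        .agg(agg)
        .unstack("characteristic")
        .reset_index()
    )
    out.columns.name = None
    return out[["id", "start", "end", "location"]]


by_first = reshape("first")
by_min = reshape("min")
by_max = reshape("max")

# Each group contains exactly one row, so all three aggregations agree.
print(by_first.equals(by_min) and by_first.equals(by_max))
print(by_first)
```

If a group could contain more than one row, the three results would differ, and the helper-key approach from Answer 2 would be the safer choice.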

3 answers

Answer 1  3  2020-03-11 02:25:49
Answer 2  2  accepted  2020-03-11 02:15:06
Answer 3  1  2020-03-11 02:18:54
