Widening Pandas Data Frame, Similar to Pivot or Stack/Unstack

My problem is probably best explained with an example:

What I have:

ID0  ID1  Time  Data0  Data1
  1    1    10  'A'       93
  1    2    10  'A'       55
  1    1    12  'A'       88
  1    2    12  'B'       66
  2    3   102  'C'       14
  2    4   102  'A'       22
  2    4   112  'D'       15
  2    3   112  'B'       43

What I would like:

ID0  ID1  Time  Data0  Data1  Data0.2  Data1.2
  1    1    10  'A'       93  'A'           55
  1    2    10  'A'       55  'A'           93
  1    1    12  'A'       88  'B'           66
  1    2    12  'B'       66  'A'           88
  2    3   102  'C'       14  'A'           22
  2    4   102  'A'       22  'C'           14
  2    4   112  'D'       15  'B'           43
  2    3   112  'B'       43  'D'           15

Essentially, there are 2 unique ID1s associated with every ID0.

Data is sampled periodically.

I would like to make the original data frame 'wider' by adding more columns, so that each row also contains the information from the other ID1 for the same time period.

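Try:

grb = df.groupby(['ID0', 'Time'])
# .values makes the reversal positional; a reversed Series could be realigned on its index
df['Data0.2'] = grb['Data0'].transform(lambda ts: ts.values[::-1])
df['Data1.2'] = grb['Data1'].transform(lambda ts: ts.values[::-1])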

What this does is based on your statement that

there are 2 unique ID1s associated with every ID0.

It groups the data frame by ['ID0', 'Time'] and reverses the selected columns within each group. If each group holds exactly 2 rows (one per ID1), reversing swaps them, so every row picks up the values from the other ID1:

>>> df
   ID0  ID1  Time Data0  Data1 Data0.2  Data1.2
0    1    1    10   'A'     93     'A'       55
1    1    2    10   'A'     55     'A'       93
2    1    1    12   'A'     88     'B'       66
3    1    2    12   'B'     66     'A'       88
4    2    3   102   'C'     14     'A'       22
5    2    4   102   'A'     22     'C'       14
6    2    4   112   'D'     15     'B'       43
7    2    3   112   'B'     43     'D'       15

[8 rows x 7 columns]
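
For anyone who wants to reproduce this, here is a minimal self-contained sketch. The frame construction and the group-size check are my own additions, not part of the answer (the quotes around the Data0 letters are dropped), and it assumes every (ID0, Time) group has exactly two rows.

```python
import pandas as pd

# Sample data from the question (quotes around the Data0 letters dropped).
df = pd.DataFrame({
    'ID0':   [1, 1, 1, 1, 2, 2, 2, 2],
    'ID1':   [1, 2, 1, 2, 3, 4, 4, 3],
    'Time':  [10, 10, 12, 12, 102, 102, 112, 112],
    'Data0': ['A', 'A', 'A', 'B', 'C', 'A', 'D', 'B'],
    'Data1': [93, 55, 88, 66, 14, 22, 15, 43],
})

grb = df.groupby(['ID0', 'Time'])

# Reversing a group only swaps partners when the group has exactly two rows.
assert (grb.size() == 2).all(), "expected exactly two rows per (ID0, Time)"

for col in ['Data0', 'Data1']:
    # .values keeps the assignment positional, so the reversed order survives.
    df[col + '.2'] = grb[col].transform(lambda s: s.values[::-1])

print(df)
```

If the size check fails, the reversal no longer pairs each row with "the other ID1", so it is worth keeping that guard in.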

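Final edit: to handle both columns in one go, starting from the original df (without the .2 columns), you can try the following. Note that .values is necessary here, so that the reversed values are placed by position instead of being matched back to the original index:

>>> import pandas as pd
>>> grb = df.groupby(['ID0', 'Time'])
>>> # reverse each column within its group, then collect the results side by side
>>> df2 = pd.concat([grb[c].transform(lambda s: s.values[::-1]) for c in ['Data0', 'Data1']], axis=1)
>>> df.join(df2, rsuffix='.2')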
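
If relying on the row order inside each group feels fragile, a self-merge is another option. This is a different technique from the transform above and not part of the original answer; the name `wide` is made up for illustration, and the sketch still assumes two distinct ID1s per (ID0, Time).

```python
import pandas as pd

# Sample data from the question (quotes around the Data0 letters dropped).
df = pd.DataFrame({
    'ID0':   [1, 1, 1, 1, 2, 2, 2, 2],
    'ID1':   [1, 2, 1, 2, 3, 4, 4, 3],
    'Time':  [10, 10, 12, 12, 102, 102, 112, 112],
    'Data0': ['A', 'A', 'A', 'B', 'C', 'A', 'D', 'B'],
    'Data1': [93, 55, 88, 66, 14, 22, 15, 43],
})

# Pair every row with all rows that share its (ID0, Time) ...
wide = df.merge(df, on=['ID0', 'Time'], suffixes=('', '.2'))

# ... then keep only the pairs where the partner is the other ID1.
wide = wide[wide['ID1'] != wide['ID1.2']]
wide = wide.drop(columns='ID1.2').reset_index(drop=True)

print(wide)
```

The merge matches rows by ID1 rather than by position, so it does not matter how the rows are ordered within a group.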
