Shifting columns in grouped pandas dataframe
I have a dataframe which, after grouping it by country and group, looks like this:
                A   B   C   D
country group
1       a1     10  20  30  40
        a2     11  21  31  41
        a3     12  22  32  42
        a4     13  23  33  43
                A   B   C   D
country group
2       a1     50  60  70  80
        a2     51  61  71  81
        a3     52  62  72  82
        a4     53  63  73  83
My goal is to create another column E that would hold column D values shifted up by 1 row, like so:
                A   B   C   D    E
country group
1       a1     10  20  30  40   41
        a2     11  21  31  41   42
        a3     12  22  32  42   43
        a4     13  23  33  43  nan
                A   B   C   D    E
country group
2       a1     50  60  70  80   81
        a2     51  61  71  81   82
        a3     52  62  72  82   83
        a4     53  63  73  83  nan
What I've tried:
df.groupby(['country','group']).sum().apply(lambda x['E']: x['D'].shift(-1))
but I get invalid syntax.
Afterwards I delete the bottom row of each group, where nan is present, like so:
df = df[~df.isin([np.nan]).any(1)]
which works.
How can I add a column E to the df which would hold column D values shifted by -1?
Use DataFrameGroupBy.shift, grouping by the first index level (country):
df = df.groupby(['country','group']).sum()
df['E'] = df.groupby(level=0)['D'].shift(-1)
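As a minimal sketch of this step on the question's numbers: the attempt in the question fails because a lambda parameter has to be a plain name, so lambda x['E']: is a syntax error, and a plain column assignment is enough. The frame construction below is an assumption (the question only shows the already grouped result), so the MultiIndex is built directly instead of going through groupby(...).sum().

```python
import pandas as pd

# Assumed reconstruction of the grouped frame shown in the question.
index = pd.MultiIndex.from_product(
    [[1, 2], ["a1", "a2", "a3", "a4"]], names=["country", "group"]
)
df = pd.DataFrame(
    {
        "A": [10, 11, 12, 13, 50, 51, 52, 53],
        "B": [20, 21, 22, 23, 60, 61, 62, 63],
        "C": [30, 31, 32, 33, 70, 71, 72, 73],
        "D": [40, 41, 42, 43, 80, 81, 82, 83],
    },
    index=index,
)

# Shift D up by one row within each country (index level 0).
# The last row of every country has no successor, so it gets NaN,
# which also turns E into a float column.
df["E"] = df.groupby(level=0)["D"].shift(-1)
print(df)
```

Grouping by level=0 is what keeps the value 80 of country 2 from leaking into the last row of country 1; a plain df['D'].shift(-1) would not respect that boundary.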
And then remove the rows where E is missing with DataFrame.dropna:
df = df.dropna(subset=['E'])
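The filter from the question drops a row when any column holds NaN, while dropna(subset=['E']) only looks at E. The sketch below illustrates the difference on a small hypothetical frame (not data from the question). It uses isna() as a stand-in for isin([np.nan]) and passes axis=1 by keyword, since newer pandas releases may reject the positional any(1) form.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: row 1 has a NaN in D only, row 2 has a NaN in E only.
df = pd.DataFrame(
    {"D": [40.0, np.nan, 42.0], "E": [41.0, 42.0, np.nan]}
)

# Drops only the rows where E is missing.
by_subset = df.dropna(subset=["E"])

# Drops rows with a missing value in any column
# (isna() is used here in place of the question's isin([np.nan])).
by_any = df[~df.isna().any(axis=1)]

print(by_subset.index.tolist())
print(by_any.index.tolist())
```

If the other columns can legitimately contain missing values, the subset form is probably the safer choice, because it leaves those rows in place.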
Sample:
print (df)
    country group   A   B   C   D
0         1    a1  10  20  30  40
1         1    a1  11  21  31  41
2         1    a1  12  22  32  42
3         1    a2  13  23  33  43
4         1    a2  11  21  31  41
5         1    a2  12  22  32  42
6         1    a3  13  23  33  43
7         1    a3  11  21  31  41
8         1    a3  12  22  32  42
9         1    a4  13  23  33  43
10        1    a4  11  21  31  41
11        1    a5  12  22  32  42
12        1    a5  13  23  33  43
13        2    a2  50  60  70  80
14        2    a3  51  61  71  81
15        2    a4  52  62  72  82
16        2    a5  53  63  73  83
df = df.groupby(['country','group']).sum()
print (df)
                A   B   C    D
country group
1       a1     33  63  93  123
        a2     36  66  96  126
        a3     36  66  96  126
        a4     24  44  64   84
        a5     25  45  65   85
2       a2     50  60  70   80
        a3     51  61  71   81
        a4     52  62  72   82
        a5     53  63  73   83
df['E'] = df.groupby(level=0)['D'].shift(-1)
print (df)
                A   B   C    D      E
country group
1       a1     33  63  93  123  126.0
        a2     36  66  96  126  126.0
        a3     36  66  96  126   84.0
        a4     24  44  64   84   85.0
        a5     25  45  65   85    NaN
2       a2     50  60  70   80   81.0
        a3     51  61  71   81   82.0
        a4     52  62  72   82   83.0
        a5     53  63  73   83    NaN
df = df.dropna(subset=['E'])
print (df)
                A   B   C    D      E
country group
1       a1     33  63  93  123  126.0
        a2     36  66  96  126  126.0
        a3     36  66  96  126   84.0
        a4     24  44  64   84   85.0
2       a2     50  60  70   80   81.0
        a3     51  61  71   81   82.0
        a4     52  62  72   82   83.0
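The whole sample can be reproduced in one go, roughly as sketched below. The helper name shift_within_country is hypothetical, and building B, C and D as A + 10, A + 20 and A + 30 is only a shortcut that happens to match the sample rows above.

```python
import pandas as pd

# Raw sample rows from above; B, C and D equal A + 10, A + 20 and A + 30
# in this particular sample, which keeps the construction short.
raw = pd.DataFrame(
    {
        "country": [1] * 13 + [2] * 4,
        "group": ["a1"] * 3 + ["a2"] * 3 + ["a3"] * 3 + ["a4"] * 2
        + ["a5"] * 2 + ["a2", "a3", "a4", "a5"],
        "A": [10, 11, 12, 13, 11, 12, 13, 11, 12, 13, 11, 12, 13,
              50, 51, 52, 53],
    }
)
raw["B"] = raw["A"] + 10
raw["C"] = raw["A"] + 20
raw["D"] = raw["A"] + 30


def shift_within_country(frame):
    """Sum per (country, group), add E as the next group's D within the
    same country, then drop each country's last row (where E is NaN)."""
    out = frame.groupby(["country", "group"]).sum()
    out["E"] = out.groupby(level="country")["D"].shift(-1)
    return out.dropna(subset=["E"])


result = shift_within_country(raw)
print(result)
```

E stays a float column because it held NaN before the rows were dropped; if integers are wanted, result['E'].astype(int) after the dropna should be safe here, since no missing values remain.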