crosstab in Pandas DataFrame
I created a DataFrame:
A1 A2 A3 A4
0 cccc xx 6 5
1 aaaa yy 8 0
2 aaaa xx 15 0
3 bbbb xx 21 4
4 bbbb xx 26 0
5 cccc yy 33 2
6 aaaa xx 44 1
7 cccc xx 48 2
8 aaaa yy 58 0
9 cccc yy 59 5
10 bbbb yy 77 0
11 bbbb yy 99 0
Then, using crosstab() with the command below, I created a new DataFrame:
df5 = pd.crosstab(df4['A1'], df4['A2'], margins=False, values=df4['A3'],
                  dropna=False, aggfunc='mean').reset_index().fillna(0)
This works properly and gives me the following output:
A2 A1 xx yy
0 aaaa 29.5 33.0
1 bbbb 23.5 88.0
2 cccc 27.0 46.0
Now I want to store the mean values back into the DataFrame df4.
How can I do this? I want to replace the entries in df4 that contain 0 with the corresponding mean of A3 from df5, based on the crosstab() result. I want output as follows:
A1 A2 A3 A4
0 aaaa xx 15 29.5
1 aaaa xx 44 1.0
2 aaaa yy 8 33.0
3 aaaa yy 58 33.0
4 bbbb xx 21 4.0
5 bbbb xx 26 23.5
6 bbbb yy 77 88.0
7 bbbb yy 99 88.0
8 cccc xx 6 5.0
9 cccc xx 48 2.0
mask + groupby + transform
Ignoring the unnecessary reordering and removal of some rows in your desired output, you can use mask with groupby and transform:
group_mean = df4.groupby(['A1', 'A2'])['A3'].transform('mean')
df4['A4'] = df4['A4'].mask(df4['A4'] == 0, group_mean)
print(df4)
A1 A2 A3 A4
0 cccc xx 6 5.0
1 aaaa yy 8 33.0
2 aaaa xx 15 29.5
3 bbbb xx 21 4.0
4 bbbb xx 26 23.5
5 cccc yy 33 2.0
6 aaaa xx 44 1.0
7 cccc xx 48 2.0
8 aaaa yy 58 33.0
9 cccc yy 59 5.0
10 bbbb yy 77 88.0
11 bbbb yy 99 88.0
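
For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch. It rebuilds df4 from the table in the question, applies the same mask + groupby + transform step, and computes the crosstab of means as a cross-check. The names `filled` and `means` are illustrative and not from the original post, and the result is written to a new frame rather than modifying df4 in place.

```python
import pandas as pd

# Rebuild the example frame from the question.
df4 = pd.DataFrame({
    "A1": ["cccc", "aaaa", "aaaa", "bbbb", "bbbb", "cccc",
           "aaaa", "cccc", "aaaa", "cccc", "bbbb", "bbbb"],
    "A2": ["xx", "yy", "xx", "xx", "xx", "yy",
           "xx", "xx", "yy", "yy", "yy", "yy"],
    "A3": [6, 8, 15, 21, 26, 33, 44, 48, 58, 59, 77, 99],
    "A4": [5, 0, 0, 4, 0, 2, 1, 2, 0, 5, 0, 0],
})

# Mean of A3 within each (A1, A2) group, broadcast back to every row
# so that it lines up with df4's index.
group_mean = df4.groupby(["A1", "A2"])["A3"].transform("mean")

# Replace A4 only where it equals 0; all other A4 values are kept.
# `filled` is an illustrative name for the resulting copy.
filled = df4.assign(A4=df4["A4"].mask(df4["A4"] == 0, group_mean))

# Cross-check: the same group means, laid out as in the question's crosstab.
means = pd.crosstab(df4["A1"], df4["A2"], values=df4["A3"], aggfunc="mean")

print(filled)
print(means)
```

Because transform returns a Series aligned to the original index, no merge against df5 is needed. If the sorted layout of the desired output matters, something like filled.sort_values(["A1", "A2", "A3"]) should reorder the rows accordingly.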