Python pandas - apply function to grouped dataframe
I have a dataframe as follows:
A B C
0 foo 1.496337 -0.604264
1 bar -0.025106 0.257354
2 foo 0.958001 0.933328
3 foo -1.126581 0.570908
4 bar -0.428304 0.881995
5 foo -0.955252 1.408930
6 bar 0.504582 0.455287
7 bar -1.076096 0.536741
8 bar 0.351544 -1.146554
9 foo 0.430260 -0.348472
I would like to get the max of column B for each group (when grouped by A) and add it to column C. So here is what I tried:
Group by A:
df = df.groupby(by='A')
Get the maximum of column B and then try to add it to column C:
for name in ['foo','bar']:
    maxi = df.get_group(name)['B'].max()
    df.get_group(name)['C'] = df.get_group(name)['C']+maxi
At this point pandas suggests "Try using .loc[row_indexer,col_indexer] = value instead". Does this mean I have to use for loops on the rows, with an if on the column A value, and modify the C data one by one? That does not seem pandas-ish, and I feel that I am missing something. How could I better work with this grouped dataframe?
Such operations are done using transforms or aggregations. In your case you need transform:
# groupby 'A'
grouped = df.groupby('A')
# transform B so every row becomes the maximum along the group:
max_B = grouped['B'].transform('max')
# add the new column to the old df
df['D'] = df['C'] + max_B
Or in one line:
In [2]: df['D'] = df.groupby('A')['B'].transform('max') + df['C']
In [3]: df
Out[3]:
A B C D
0 foo 1.496337 -0.604264 0.892073
1 bar -0.025106 0.257354 0.761936
2 foo 0.958001 0.933328 2.429665
3 foo -1.126581 0.570908 2.067245
4 bar -0.428304 0.881995 1.386577
5 foo -0.955252 1.408930 2.905267
6 bar 0.504582 0.455287 0.959869
7 bar -1.076096 0.536741 1.041323
8 bar 0.351544 -1.146554 -0.641972
9 foo 0.430260 -0.348472 1.147865
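As a rough, self-contained sketch of the same idea, the snippet below rebuilds the frame from the question, contrasts an aggregation with a transform, and then overwrites C directly rather than adding a column D. The variable names group_max and row_max are made up for illustration, and writing back into C is an assumption about what the question was ultimately after.

```python
import pandas as pd

# Data copied from the question.
df = pd.DataFrame({
    "A": ["foo", "bar", "foo", "foo", "bar", "foo", "bar", "bar", "bar", "foo"],
    "B": [1.496337, -0.025106, 0.958001, -1.126581, -0.428304,
          -0.955252, 0.504582, -1.076096, 0.351544, 0.430260],
    "C": [-0.604264, 0.257354, 0.933328, 0.570908, 0.881995,
          1.408930, 0.455287, 0.536741, -1.146554, -0.348472],
})

# Aggregation: one value per group, indexed by the group key ('bar', 'foo').
group_max = df.groupby("A")["B"].max()

# Transform: the group maximum repeated for every row of that group,
# indexed like the original frame, so it aligns with df["C"].
row_max = df.groupby("A")["B"].transform("max")

# Overwrite C on the original frame (hypothetical variation on the answer,
# which stores the result in a new column D instead).
df["C"] = df["C"] + row_max
```

The aggregation returns two values while the transform returns ten, which is why only the transform can be added to a column of the original frame without an extra merge or map step. Assigning to a column of df itself also avoids the warning from the question, which appeared because the assignment was made on the object returned by get_group rather than on the original frame.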
For more info, see http://pandas.pydata.org/pandas-docs/stable/groupby.html