Python pandas - apply function to grouped dataframe

I have a dataframe as follows:

    A       B         C     
0  foo  1.496337 -0.604264  
1  bar -0.025106  0.257354 
2  foo  0.958001  0.933328 
3  foo -1.126581  0.570908
4  bar -0.428304  0.881995 
5  foo -0.955252  1.408930 
6  bar  0.504582  0.455287 
7  bar -1.076096  0.536741 
8  bar  0.351544 -1.146554 
9  foo  0.430260 -0.348472 

I would like to get the max of column B for each group (when grouped by A) and add it to column C. So here is what I tried:

Group by A:

df = df.groupby(by='A')

Get the maximum of column B and then try to add it to column C:

for name in ['foo','bar']:
    maxi = df.get_group(name)['B'].max()
    df.get_group(name)['C'] = df.get_group(name)['C']+maxi

At this point pandas suggests: "Try using .loc[row_indexer,col_indexer] = value instead". Does this mean I have to use for loops over the rows, with an if on the column A value, and modify the C data one by one? That does not seem very pandas-ish, and I feel that I am missing something. How could I better work with this grouped dataframe?

Such operations are done using transforms or aggregations. In your case you need transform, which returns a result aligned with the original rows rather than one value per group:

# groupby 'A'
grouped = df.groupby('A')

# transform B so every row becomes the maximum along the group:
max_B = grouped['B'].transform('max')

# add the new column to the old df
df['D'] = df['C'] + max_B

Or in one line:

In [2]: df['D'] = df.groupby('A')['B'].transform('max') + df['C']

In [3]: df
Out[3]: 
     A         B         C         D
0  foo  1.496337 -0.604264  0.892073
1  bar -0.025106  0.257354  0.761936
2  foo  0.958001  0.933328  2.429665
3  foo -1.126581  0.570908  2.067245
4  bar -0.428304  0.881995  1.386577
5  foo -0.955252  1.408930  2.905267
6  bar  0.504582  0.455287  0.959869
7  bar -1.076096  0.536741  1.041323
8  bar  0.351544 -1.146554 -0.641972
9  foo  0.430260 -0.348472  1.147865
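
To make the difference between an aggregation and a transform concrete, here is a minimal self-contained sketch. It rebuilds the dataframe from the question by hand (the literal values are copied from the table above) and, as an assumption about what the question wanted, writes the result back into column C instead of a new column D. The names group_max and row_max are only illustrative.

```python
import pandas as pd

# Rebuild the example dataframe from the question.
df = pd.DataFrame({
    "A": ["foo", "bar", "foo", "foo", "bar", "foo", "bar", "bar", "bar", "foo"],
    "B": [1.496337, -0.025106, 0.958001, -1.126581, -0.428304,
          -0.955252, 0.504582, -1.076096, 0.351544, 0.430260],
    "C": [-0.604264, 0.257354, 0.933328, 0.570908, 0.881995,
          1.408930, 0.455287, 0.536741, -1.146554, -0.348472],
})

# Aggregation: one value per group, indexed by the group key ('bar', 'foo').
group_max = df.groupby("A")["B"].max()

# Transform: the group maximum repeated on every row, sharing df's index.
row_max = df.groupby("A")["B"].transform("max")

# Because row_max is aligned with df, it can be added to C directly,
# with no loop over groups and no get_group() copies.
df["C"] = df["C"] + row_max
```

Since the transform result has the same index as the original dataframe, the assignment works on the dataframe itself rather than on a copy of a group, so it should not trigger the .loc warning from the question.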

For more info, see http://pandas.pydata.org/pandas-docs/stable/groupby.html
