Using apply on a SeriesGroupBy object with a condition

Question

I have a DataFrame df1:

 df1.head() = 

           id      ret     eff
    1469  2300 -0.010879  4480.0
    328   2300 -0.000692 -4074.0
    1376  2300 -0.009551  4350.0
    2110  2300 -0.014013  5335.0
    849   2300 -0.286490 -9460.0

I would like to create a new column that contains the normalized values of the column df1['eff'].
In other words, I would like to group df1['eff'] by df1['id'], find the max value (mx = df1['eff'].max()) and the min value (mn = df1['eff'].min()) of each group, and divide each value of df1['eff'] element-wise by mx if the value is positive or by mn if it is negative.

The code that I have written is the following:

df1['normd'] = df1.groupby('id')['eff'].apply(lambda x: x/x.max() if x > 0 else x/x.min())

However, Python throws the following error:

*** ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(),
 a.item(), a.any() or a.all().

Since df1.groupby('id')['eff'] is a SeriesGroupBy object, I decided to use map(). But again Python throws an error:

 *** AttributeError: Cannot access callable attribute 'map' of 'SeriesGroupBy' objects, try using the 'apply' method

Many thanks in advance.

Answer 1

You can use a custom function f, which makes it easy to add print calls for debugging. Inside f, x is a Series holding the 'eff' values of one group, so the condition has to be evaluated element-wise with numpy.where. The output is a NumPy array, so you need to convert it back to a Series:

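import numpy as np
import pandas as pd

def f(x):
    # x is the 'eff' Series of a single group; uncomment to inspect it
    #print (x)
    #print (x/x.max())
    #print (x/x.min())
    # positive values are divided by the group max, the others by the group min
    return pd.Series(np.where(x>0, x/x.max(), x/x.min()), index=x.index)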


df1['normd'] = df1.groupby('id')['eff'].apply(f)
print (df1)
        id       ret     eff     normd
1469  2300 -0.010879  4480.0  0.839738
328   2300 -0.000692 -4074.0  0.430655
1376  2300 -0.009551  4350.0  0.815370
2110  2300 -0.014013  5335.0  1.000000
849   2300 -0.286490 -9460.0  1.000000
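
To see why the lambda in the question raised ValueError while numpy.where works, here is a minimal sketch on a single group. The Series x below is just the 'eff' column from the question's sample, rebuilt by hand; the names mask and raised are only for illustration.

```python
import numpy as np
import pandas as pd

# One group's 'eff' values, copied from the sample in the question.
x = pd.Series([4480.0, -4074.0, 4350.0, 5335.0, -9460.0],
              index=[1469, 328, 1376, 2110, 849])

mask = x > 0  # a boolean Series: one True/False per row, not a single bool

try:
    if mask:  # `if` needs exactly one truth value, so pandas raises here
        pass
    raised = False
except ValueError:
    raised = True

# np.where evaluates the condition row by row and picks from either branch.
normd = pd.Series(np.where(mask, x / x.max(), x / x.min()), index=x.index)
print(raised)
print(normd)
```

The `x if cond else y` expression in the original lambda is an ordinary Python conditional, so it hits the same ambiguity as the `if` statement above.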

The function f can also be written inline as a lambda:

df1['normd'] = (df1.groupby('id')['eff']
                   .apply(lambda x: pd.Series(np.where(x>0,
                                                       x/x.max(),
                                                       x/x.min()),
                                              index=x.index)))
print (df1)
        id       ret     eff     normd
1469  2300 -0.010879  4480.0  0.839738
328   2300 -0.000692 -4074.0  0.430655
1376  2300 -0.009551  4350.0  0.815370
2110  2300 -0.014013  5335.0  1.000000
849   2300 -0.286490 -9460.0  1.000000
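
As a possible alternative (not part of the original answer), the same normalization can be sketched with transform, which returns a result aligned to the original index. This may be useful on newer pandas versions, where, as far as I know, apply can prepend the group keys to the result's index and make the direct column assignment fail. The DataFrame below is rebuilt by hand from the question's sample.

```python
import numpy as np
import pandas as pd

# Sample rows from the question, rebuilt by hand for a self-contained example.
df1 = pd.DataFrame(
    {
        "id": [2300, 2300, 2300, 2300, 2300],
        "ret": [-0.010879, -0.000692, -0.009551, -0.014013, -0.286490],
        "eff": [4480.0, -4074.0, 4350.0, 5335.0, -9460.0],
    },
    index=[1469, 328, 1376, 2110, 849],
)

grouped = df1.groupby("id")["eff"]
group_max = grouped.transform("max")  # per-group max, broadcast to every row
group_min = grouped.transform("min")  # per-group min, broadcast to every row

# Positive values are divided by their group's max, the others by its min.
df1["normd"] = np.where(df1["eff"] > 0,
                        df1["eff"] / group_max,
                        df1["eff"] / group_min)
print(df1)
```

Because the per-group max and min are computed once and broadcast, this version avoids calling a Python function per group, at the cost of being a little less explicit than f.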

Answer 1
3, accepted, 2016-07-01 10:30:22
