
Pandas groupby transform

I need some confirmation about the behavior of pandas GroupBy transform:

import pandas as pd

df = pd.DataFrame({'A': ['foo', 'bar', 'foo', 'bar', 'foo', 'bar'],
                   'B': ['one', 'one', 'two', 'three', 'two', 'two'],
                   'C': [1, 5, 5, 2, 5, 5],
                   'D': [2.0, 5., 8., 1., 2., 9.]})
grouped = df.groupby('A')
# Standardize each value within its group (z-score)
grouped.transform(lambda x: (x - x.mean()) / x.std())

          C         D
0 -1.154701 -0.577350
1  0.577350  0.000000
2  0.577350  1.154701
3 -1.154701 -1.000000
4  0.577350 -0.577350
5  0.577350  1.000000

The call does not specify which columns the lambda function should be applied to. How does pandas decide which columns (in this case, C and D) to apply the function to? Why did it not apply the function to column B and throw an error?

Why does the output not include columns A and B?

GroupBy.transform calls the specified function for each column in each group (so B, C, and D, but not A, because that's what you're grouping by). However, the methods your function calls (mean and std) only work with numeric values, so the function fails on non-numeric columns, and pandas drops any column it cannot transform. String columns are of dtype object, which isn't numeric, so B gets dropped, and you're left with C and D.
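To see which columns are actually handed to the function, here is a minimal sketch that uses an identity lambda (which cannot fail on strings) and then checks which of those columns are numeric. The names passed_columns and numeric_flags are only illustrative, and the exact dtype reported for B may differ between pandas versions.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

df = pd.DataFrame({'A': ['foo', 'bar', 'foo', 'bar', 'foo', 'bar'],
                   'B': ['one', 'one', 'two', 'three', 'two', 'two'],
                   'C': [1, 5, 5, 2, 5, 5],
                   'D': [2.0, 5., 8., 1., 2., 9.]})

# An identity function works for every dtype, so no column is rejected.
# The result therefore shows every column transform operates on.
passed = df.groupby('A').transform(lambda x: x)
passed_columns = list(passed.columns)  # the grouping key A is not included

# Only numeric columns can go through mean() and std().
numeric_flags = {col: is_numeric_dtype(df[col]) for col in passed_columns}

print(passed_columns)
print(numeric_flags)
```

B shows up among the transformed columns here, which suggests it is the arithmetic in the original lambda, not transform itself, that rules B out.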

You should have got a warning when you ran your code:

FutureWarning: Dropping invalid columns in DataFrameGroupBy.transform is deprecated. In a future version, a TypeError will be raised. Before calling .transform, select only columns which should be valid for the transforming function.

As it indicates, you need to select the columns you want to process before calling transform in order to avoid the warning (and the TypeError that later versions raise instead). You can do that by adding [['C', 'D']] (to select, for example, your C and D columns) before you call transform:

grouped[['C', 'D']].transform(lambda x: (x - x.mean()) / x.std())
#      ^^^^^^^^^^^^ important
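If you would rather not hard-code the column names, or you want A and B back next to the transformed values, something along these lines should work. It is only a sketch: zscore, numeric_cols, scaled and result are names made up for this example, and select_dtypes(include="number") is used on the assumption that every numeric column should be standardized.

```python
import pandas as pd

df = pd.DataFrame({'A': ['foo', 'bar', 'foo', 'bar', 'foo', 'bar'],
                   'B': ['one', 'one', 'two', 'three', 'two', 'two'],
                   'C': [1, 5, 5, 2, 5, 5],
                   'D': [2.0, 5., 8., 1., 2., 9.]})


def zscore(x):
    """Standardize values within a group (same formula as the lambda above)."""
    return (x - x.mean()) / x.std()


# Pick the numeric columns by dtype instead of listing them by hand.
numeric_cols = df.select_dtypes(include="number").columns.tolist()

# Transform only those columns; the result keeps the original row index.
scaled = df.groupby('A')[numeric_cols].transform(zscore)

# Put the untouched columns back next to the transformed ones.
result = df[['A', 'B']].join(scaled)
print(result)
```

Because transform returns a frame with the same index as df, the join lines the rows up without any extra key.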
