Add group property in pandas within a chain (analogous to dplyr group_by - mutate in R)
I would like to add some group properties as a new column to a pandas DataFrame without breaking the method chain. I know this is possible in R using dplyr, but I cannot get it to work in pandas.
The dplyr code would be (adding the max of column B for each group in column A):
df %>%
group_by(A) %>%
mutate(max = max(B)) %>%
ungroup() %>%
... more operations
The only way I can get it to work in pandas is:
df['max'] = df.groupby('A')['B'].transform('max')
but this requires a separate statement to assign the new column, while I would like to do it inside a chain. Any help would be appreciated.
df.assign(max=df.groupby('A')['B'].transform('max'))....more operations
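One caveat with the one-liner above: it groups the original df, so it only lines up if no earlier step in the chain has filtered or reshaped the frame. A minimal sketch of a variant that passes a callable to assign, so the grouping runs on whatever frame the chain has produced so far. The sample data, the query step, and the column name max_B are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data, not from the question.
df = pd.DataFrame({"A": ["x", "x", "y", "y"], "B": [1, 3, 2, 5]})

result = (
    df
    .query("B > 1")  # an earlier step that changes the frame
    # The lambda receives the intermediate frame, so the group max is
    # computed on the filtered rows rather than on the original df.
    .assign(max_B=lambda d: d.groupby("A")["B"].transform("max"))
    .sort_values(["A", "B"])
    .reset_index(drop=True)
)

print(result)
```

Because transform returns a Series aligned to the index of the frame it was called on, the new column has one value per row, which mirrors what group_by followed by mutate and ungroup does in dplyr.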
Now you can do this smoothly with datar:
from datar import f
from datar.base import max
from datar.dplyr import group_by, mutate, ungroup
df >> \
group_by(f.A) >> \
mutate(max = max(f.B)) >> \
ungroup() # >>
# ... more operations
I am the author of the package. Feel free to submit issues if you have any questions.