Pandas - How to group by a column and filter the rows of each group by the group's median of a numerical column?

Question

I have a dataset consisting of one ID, one categorical variable "A" and one numerical variable "B".
I want to group by "A" and filter the rows of each group to keep only the rows that are above or equal to the median of "B" (the median should be calculated for each group).
Example:

ID	A	B
1	Category 1	0.5
2	Category 2	0.2
3	Category 1	0.2
4	Category 1	0.6
5	Category 2	0.4

My expected result would be:

ID	A	B
1	Category 1	0.5
4	Category 1	0.6
5	Category 2	0.4

The median is 0.5 for Category 1 and 0.3 for Category 2.
Thank you!

Answer 1

# Build a boolean mask per group: True where B is at or above that group's median
out = df[df.groupby("A")["B"].transform(lambda x: x >= x.median())]
print(out)

Prints:

   ID           A    B
0   1  Category 1  0.5
3   4  Category 1  0.6
4   5  Category 2  0.4
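For anyone who wants to run this end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt by hand from the example in the question, and the variable name `group_median` is only illustrative. It uses a slightly different form from the one-liner above: it first broadcasts each group's median back onto the rows with `transform("median")`, then compares column "B" against it. This should select the same rows, assuming "B" has no missing values.

```python
import pandas as pd

# Sample data rebuilt from the table in the question
df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5],
    "A": ["Category 1", "Category 2", "Category 1", "Category 1", "Category 2"],
    "B": [0.5, 0.2, 0.2, 0.6, 0.4],
})

# Median of B within each group of A, aligned row by row with df
group_median = df.groupby("A")["B"].transform("median")

# Keep only the rows whose B is at or above their own group's median
out = df[df["B"] >= group_median]
print(out)
```

Keeping the median as its own Series can also be handy for debugging, since it can be inspected or attached as a column before filtering.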
