How do I return the rows of a DataFrame where every country in each continent has a population below 100?

Question

import pandas as pd

df = pd.DataFrame({
    "Continent": list("AAABBBCCD"),
    "Country": list("FGHIJKLMN"),
    "Population": [90, 140, 50, 80, 80, 70, 50, 125, 50]})

As stated above, I want to return all rows belonging to continents in which every country has a population below 100.

  Continent Country  Population
0         A       F          90
1         A       G         140
2         A       H          50
3         B       I          80
4         B       J          80
5         B       K          70
6         C       L          50
7         C       M         125
8         D       N          50

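Every row in continent A is dropped because country G has a population greater than 100. Every row in continent C is dropped because of country M. I would like the returned DataFrame to look like this: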

  Continent Country  Population
3         B       I          80
4         B       J          80
5         B       K          70
8         D       N          50

I tried df[df["Population"] <= 100] but could not work out how to adapt it to work per continent.
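For reference, a minimal sketch of what the row-wise attempt returns on the sample data. The variable name `rowwise` is only illustrative. The filter tests each row in isolation, so countries F, H and L survive even though their continents contain a country above 100.

```python
import pandas as pd

df = pd.DataFrame({
    "Continent": list("AAABBBCCD"),
    "Country": list("FGHIJKLMN"),
    "Population": [90, 140, 50, 80, 80, 70, 50, 125, 50]})

# Row-wise filter: only the offending rows (G and M) are removed,
# not the continents they belong to.
rowwise = df[df["Population"] <= 100]
print(rowwise)
```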

Answer 1

Here is one way to do it:

# group by continent
# transform broadcasts one True/False per group back to every row:
# True when the maximum population of the group is below 100
out = df[df.groupby(['Continent'])['Population'].transform(lambda x: x.max() < 100)]
out

  Continent Country  Population
3         B       I          80
4         B       J          80
5         B       K          70
8         D       N          50
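The same result can probably also be reached with `groupby(...).filter`, which keeps or drops whole groups based on a single True/False per group. This is a different technique from the `transform` call in the answer above, shown here only as a sketch; the name `kept` is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Continent": list("AAABBBCCD"),
    "Country": list("FGHIJKLMN"),
    "Population": [90, 140, 50, 80, 80, 70, 50, 125, 50]})

# filter() calls the function once per continent and keeps the
# group's rows only when it returns True.
kept = df.groupby("Continent").filter(
    lambda g: (g["Population"] < 100).all())
print(kept)
```

The original row labels are preserved, so the output lines up with the expected DataFrame in the question.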

Answer 2

Here is another way to achieve it:

import pandas as pd

df = pd.DataFrame({
    "Continent": list("AAABBBCCD"), 
    "Country": list("FGHIJKLMN"), 
    "Population": [90, 140, 50, 80, 80, 70, 50, 125, 50]})

df.loc[df.groupby(['Continent'])['Population'].transform('max') <= 100]

I generally avoid lambda because it is slow, but the answer above works too. This is just another option.
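One detail worth checking: this answer compares with `<= 100`, while Answer 1 uses `< 100`. On the sample data both give the same rows, but they differ when a continent's largest population is exactly 100. The sketch below uses a small hypothetical frame (`edge`, with made-up continents E and F) to show the difference.

```python
import pandas as pd

# Hypothetical data: continent E contains a country with exactly 100.
edge = pd.DataFrame({
    "Continent": list("EEF"),
    "Country": list("PQR"),
    "Population": [100, 40, 60]})

# Maximum population of each row's continent, aligned with the rows.
group_max = edge.groupby("Continent")["Population"].transform("max")

inclusive = edge.loc[group_max <= 100]  # keeps continent E
strict = edge.loc[group_max < 100]      # drops continent E

print(inclusive)
print(strict)
```

If "less than 100" in the question is meant literally, the strict comparison is the one to use.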

Answer 1
0 2022-11-21 01:02:50

Answer 2
0 2022-11-21 02:20:26
