How do I return the rows of a DataFrame where every country in each continent has a population below 100?

Question

import pandas as pd

df = pd.DataFrame({
    "Continent": list("AAABBBCCD"),
    "Country": list("FGHIJKLMN"),
    "Population": [90, 140, 50, 80, 80, 70, 50, 125, 50]})

As stated above, I want to return all rows belonging to continents in which every country has a population below 100.

  Continent Country  Population
0         A       F          90
1         A       G         140
2         A       H          50
3         B       I          80
4         B       J          80
5         B       K          70
6         C       L          50
7         C       M         125
8         D       N          50

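Every row for continent A is dropped because country G has a population above 100. Every row for continent C is dropped because of country M. I would like the returned DataFrame to look like this: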

  Continent Country  Population
3         B       I          80
4         B       J          80
5         B       K          70
8         D       N          50

I tried df[df["Population"] <= 100], but I could not work out how to adapt it so that the check applies per continent.

Answer 1

Here is one way to do it:

# Group by continent.
# transform broadcasts the lambda's result to every row of the group,
# so each row is True/False depending on whether its group's max is below 100.
out = df[df.groupby(['Continent'])['Population'].transform(lambda x: x.max() < 100)]
out

  Continent Country  Population
3         B       I          80
4         B       J          80
5         B       K          70
8         D       N          50
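
For comparison, the same "keep or drop a whole group" logic can also be written with `groupby(...).filter`, which takes a function that returns a single True/False per group. This is only a sketch of an alternative and not part of the answer above; it reuses the question's `df` and the same `< 100` threshold.

```python
import pandas as pd

df = pd.DataFrame({
    "Continent": list("AAABBBCCD"),
    "Country": list("FGHIJKLMN"),
    "Population": [90, 140, 50, 80, 80, 70, 50, 125, 50]})

# filter passes each continent's sub-DataFrame to the function and keeps
# all rows of the group only when the function returns True.
out = df.groupby("Continent").filter(lambda g: g["Population"].max() < 100)
print(out)
```

The rows that remain keep their original index labels, so the result lines up with the output shown above.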

Answer 2

Here is another way to achieve it:

import pandas as pd

df = pd.DataFrame({
    "Continent": list("AAABBBCCD"), 
    "Country": list("FGHIJKLMN"), 
    "Population": [90, 140, 50, 80, 80, 70, 50, 125, 50]})

df.loc[df.groupby(['Continent'])['Population'].transform('max') <= 100]

I generally avoid lambda because it is slow, but the answer above works too. This is just another option.
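
To make the one-liner easier to follow, it can be split into the per-row group maximum and the comparison. The sketch below reuses the question's `df`; the names `group_max`, `strict`, and `inclusive` are introduced here purely for illustration. Note that this answer uses `<= 100` while the first answer uses `< 100`; on this data both select the same rows, and they would differ only if some continent's largest population were exactly 100.

```python
import pandas as pd

df = pd.DataFrame({
    "Continent": list("AAABBBCCD"),
    "Country": list("FGHIJKLMN"),
    "Population": [90, 140, 50, 80, 80, 70, 50, 125, 50]})

# transform("max") returns one value per original row:
# the maximum population within that row's continent.
group_max = df.groupby("Continent")["Population"].transform("max")
print(group_max.tolist())  # [140, 140, 140, 80, 80, 80, 125, 125, 50]

# Two possible thresholds, depending on how "less than 100" is read.
strict = df.loc[group_max < 100]
inclusive = df.loc[group_max <= 100]
print(strict.equals(inclusive))  # True for this data
```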

Answer 1
0 2022-11-21 01:02:50

Answer 2
0 2022-11-21 02:20:26
