
Select rows in pandas where value in one column is a substring of value in another column

I have the data frame below:

import pandas as pd

df = pd.DataFrame({'A': ['apple', 'orange', 'grape', 'pear', 'banana'],
                   'B': ['She likes apples', 'I hate oranges', 'This is a random sentence',
                         'This one too', 'Bananas are yellow']})

print(df)

    A       B
0   apple   She likes apples
1   orange  I hate oranges
2   grape   This is a random sentence
3   pear    This one too
4   banana  Bananas are yellow

I am trying to get all rows where the text in column B contains the value in column A of the same row.

Expected result:

    A       B
0   apple   She likes apples
1   orange  I hate oranges
4   banana  Bananas are yellow

So far I have only managed to match one value at a time, using:

df[df['B'].str.contains(df.iloc[0,0])]

    A       B
0   apple   She likes apples

How can I get all of these rows?

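Use DataFrame.apply with axis=1 to test each row with the in operator, converting both values to lowercase so the match is case-insensitive, and filter the result with boolean indexing:

df = df[df.apply(lambda x: x.A.lower() in x.B.lower(), axis=1)]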

Or a list comprehension solution, again lowercasing both values:

df = df[[a.lower() in b.lower() for a, b in zip(df.A, df.B)]]

print(df)
        A                   B
0   apple    She likes apples
1  orange      I hate oranges
4  banana  Bananas are yellow
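
Both snippets above assume every cell in A and B is a string; a missing value (NaN) would raise an error on .lower(). As a minimal sketch of one way to guard against that, the list comprehension can call a small helper. The name a_in_b is hypothetical and not part of the original answer, and treating non-string values as "no match" is an assumption about the desired behavior:

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['apple', 'orange', 'grape', 'pear', 'banana'],
    'B': ['She likes apples', 'I hate oranges', 'This is a random sentence',
          'This one too', 'Bananas are yellow'],
})


def a_in_b(a, b):
    # Hypothetical helper: case-insensitive substring test that treats
    # missing or non-string values as "no match" instead of raising.
    if not isinstance(a, str) or not isinstance(b, str):
        return False
    return a.lower() in b.lower()


# Build one boolean per row, then filter with boolean indexing.
mask = [a_in_b(a, b) for a, b in zip(df['A'], df['B'])]
result = df[mask]
print(result)
```

With the sample data this should keep the same rows as the expected result (index 0, 1 and 4). If a non-matching default is not what you want for missing values, dropping them first with df.dropna(subset=['A', 'B']) is another option.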

