How to compare each element of a delimited string in a pandas DataFrame column with the elements of a Python list

Question

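I have a DataFrame with a column of delimited strings that must be compared against a list. A row should be kept if any element of its delimited string also appears in the list, that is, if the two intersect.

For example: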

test_lst = [20, 45, 35]
data = pd.DataFrame({'colA': [1, 2, 3],
          'colB': ['20,45,50,60', '22,70,35', '10,90,100']})

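The output should be as shown below, because the elements 20 and 45 are common to both the list variable and the delimited text in the first row of the DataFrame.

Similarly, row 2 intersects with the list on 35.

colA	colB
1	20,45,50,60
2	22,70,35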

What I tried:

test_lst = [20, 45, 35]
data["colC"]= data['colB'].str.split(',')
data

# data["colC"].apply(lambda x: set(x).intersection(test_lst))
print(data[data['colC'].apply(lambda x: set(x).intersection(test_lst)).astype(bool)])
data

This does not give the desired result.

Any help is appreciated.
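A likely cause of the empty result, assuming the data is exactly as shown above: `str.split` produces lists of strings (`'20'`, `'45'`, ...), while `test_lst` holds integers, so the set intersection is always empty. The sketch below reproduces the problem and then compares like with like; the names `parts`, `empty_mask`, `test_set`, and `result` are illustrative and not part of the original post.

```python
import pandas as pd

test_lst = [20, 45, 35]
data = pd.DataFrame({'colA': [1, 2, 3],
                     'colB': ['20,45,50,60', '22,70,35', '10,90,100']})

# str.split yields lists of strings such as ['20', '45', '50', '60'],
# so intersecting them with a list of ints never finds a match.
parts = data['colB'].str.split(',')
empty_mask = parts.apply(lambda x: bool(set(x) & set(test_lst)))

# Compare strings with strings: convert the list once, up front.
test_set = {str(n) for n in test_lst}
mask = parts.apply(lambda x: not test_set.isdisjoint(x))

result = data[mask]
print(result)
```

Note that comparing as strings only works when the delimited values contain no stray whitespace or leading zeros; converting the split elements to `int` instead avoids that caveat.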

Answer 1

This may not be the best approach, but it works.

import pandas as pd

df = pd.DataFrame({'colA': [1, 2, 3],
          'colB': ['20,45,50,60', '22,70,35', '10,90,100']}) 

test_lst = [20, 45, 35]

def match_element(row):
    # Convert the delimited string to integers so they compare equal to test_lst.
    row_elements = [int(n) for n in row.split(',')]
    # True if at least one element of the row also appears in test_lst.
    return any(value in test_lst for value in row_elements)

mask = df['colB'].apply(match_element)
df = df[mask]

Output:

	colA	colB
0	1	20,45,50,60
1	2	22,70,35
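As an alternative to the row-wise `apply` above, the same filter can be sketched with pandas' own vectorized methods. This is only one possible variant, assuming every element in `colB` is a valid integer; the names `elements` and `filtered` are illustrative.

```python
import pandas as pd

test_lst = [20, 45, 35]
df = pd.DataFrame({'colA': [1, 2, 3],
                   'colB': ['20,45,50,60', '22,70,35', '10,90,100']})

# One row per element; explode keeps the original index on each piece,
# so the pieces can be regrouped by row afterwards.
elements = df['colB'].str.split(',').explode().astype(int)

# Flag elements found in test_lst, then collapse to one flag per original row.
mask = elements.isin(test_lst).groupby(level=0).any()

filtered = df[mask]
print(filtered)
```

Here `groupby(level=0).any()` marks a row as a match when at least one of its elements is in `test_lst`, which is the same condition the `match_element` function checks.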

Solution 1
0 2022-02-18 06:55:16
