How to compare each element of a delimited string in a pandas DataFrame column with the elements of a Python list

Question

I have a DataFrame with a column of comma-delimited strings that has to be compared with a list. If any element of the delimited string also appears in the list, that row should be kept.

For example

import pandas as pd

test_lst = [20, 45, 35]
data = pd.DataFrame({'colA': [1, 2, 3],
                     'colB': ['20,45,50,60', '22,70,35', '10,90,100']})

This should produce the output below, because the elements 20 and 45 appear in both the list and the delimited string in the first row.

Likewise, 35 appears in both for row 2:

colA	colB
1	20,45,50,60
2	22,70,35

What I have tried is

test_lst = [20, 45, 35]
data["colC"]= data['colB'].str.split(',')
data

# data["colC"].apply(lambda x: set(x).intersection(test_lst))
print(data[data['colC'].apply(lambda x: set(x).intersection(test_lst)).astype(bool)])
data

This does not give the required result.

Any help is appreciated.

Answer 1

This might not be the best approach, but it works.

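import pandas as pd

df = pd.DataFrame({'colA': [1, 2, 3],
                   'colB': ['20,45,50,60', '22,70,35', '10,90,100']})

def match_element(row):
    # Convert the comma-separated string to integers so the values
    # compare equal to the integers in test_lst.
    row_elements = [int(n) for n in row.split(',')]
    test_lst = [20, 45, 35]

    # True if at least one element of the row is also in test_lst.
    return any(value in test_lst for value in row_elements)

# Keep only the rows whose colB shares an element with test_lst.
mask = df['colB'].apply(match_element)
df = df[mask]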

output:

	colA	colB
0	1	20,45,50,60
1	2	22,70,35
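
As a side note on why the attempt in the question returns nothing: str.split produces strings such as '20', and those never compare equal to the integers in test_lst, so every intersection is empty. Below is a minimal sketch of the set-based idea from the question with that mismatch removed. The names wanted and result are only illustrative, and it assumes the values in colB are separated by plain commas with no surrounding spaces.

```python
import pandas as pd

test_lst = [20, 45, 35]
data = pd.DataFrame({'colA': [1, 2, 3],
                     'colB': ['20,45,50,60', '22,70,35', '10,90,100']})

# Convert the list to strings once, so it can be compared with the
# string pieces that str.split returns.
wanted = {str(n) for n in test_lst}

# isdisjoint is False when a row shares at least one element with `wanted`,
# so negate it to get the rows to keep.
mask = data['colB'].str.split(',').apply(lambda parts: not wanted.isdisjoint(parts))
result = data[mask]
print(result)
```

Converting the three list items to strings rather than every row element to int avoids a ValueError if a row ever contains a non-numeric piece, but it also means '020' would not match 20. Which direction to convert in depends on how clean the column is.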
