
Compare two columns containing lists of strings in pandas

I have a pandas DataFrame with two columns in which each row holds a comma-separated list of words. How can I check, row by row, whether the two columns have at least one word in common? (The flag column is the desired output.)

A                B            flag

hello,hi,bye     bye, also       1
but, as well     see, pandas     0 

I have tried

df['A'].str.contains(df['B'])

but I got this error

TypeError: 'Series' objects are mutable, thus they cannot be hashed

You can split each value into separate words with split, convert the results to sets, and check their intersection with &. Then convert the result to a boolean (an empty set becomes False) and finally to an int, so False becomes 0 and True becomes 1.

zipped = zip(df['A'], df['B'])
df['flag'] = [int(bool(set(a.split(',')) & set(b.split(',')))) for a, b in zipped]
print (df)
              A            B  flag
0  hello,hi,bye    bye,also     1
1   but,as well  see,pandas     0
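
To make the one-liner easier to follow, here is a minimal sketch that walks a single pair of values through the same steps. The helper name has_common_word is made up for illustration and is not part of the original answer; the input strings are taken from the example above.

```python
def has_common_word(a, b):
    """Return 1 if the comma-separated strings a and b share a word, else 0."""
    words_a = set(a.split(','))   # e.g. {'hello', 'hi', 'bye'}
    words_b = set(b.split(','))   # e.g. {'bye', 'also'}
    common = words_a & words_b    # set intersection
    return int(bool(common))      # empty set -> False -> 0

print(has_common_word('hello,hi,bye', 'bye,also'))    # 1
print(has_common_word('but,as well', 'see,pandas'))   # 0
```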

Similar solution:

import numpy as np

zipped = zip(df['A'], df['B'])  # zip is single-use, so it must be rebuilt here
df['flag'] = np.array([set(a.split(',')) & set(b.split(',')) for a, b in zipped]).astype(bool).astype(int)
print (df)
              A            B  flag
0  hello,hi,bye    bye, also     1
1   but,as well  see, pandas     0
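
One detail worth keeping in mind when running both variants back to back: in Python 3, zip returns an iterator that can be consumed only once, so each comprehension needs its own zip call. A small sketch with a made-up pair of lists:

```python
pairs = zip(['hello,hi,bye'], ['bye,also'])

first_pass = [(a, b) for a, b in pairs]
second_pass = [(a, b) for a, b in pairs]   # the iterator is already exhausted

print(first_pass)    # [('hello,hi,bye', 'bye,also')]
print(second_pass)   # []
```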

EDIT: The values may contain whitespace around the commas, or empty items between them, so strip each word with map and str.strip and remove empty strings with filter:

df = pd.DataFrame({'A': ['hello,hi,bye', 'but,,,as well'], 
                   'B': ['bye ,,, also', 'see,,,pandas']})
print (df)

               A             B
0   hello,hi,bye  bye ,,, also
1  but,,,as well  see,,,pandas

zipped = zip(df['A'], df['B'])

def setify(x):
    # strip each token first, then drop the empty strings that remain
    return set(filter(None, map(str.strip, x.split(','))))

df['flag'] = [int(bool(setify(a) & setify(b))) for a, b in zipped]
print (df)
               A             B  flag
0   hello,hi,bye  bye ,,, also     1
1  but,,,as well  see,,,pandas     0
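
Putting the pieces together, the following is a self-contained sketch that can be pasted into a fresh session. The helper name tokenize and the third row of data are illustrative additions, not part of the original answer. The helper strips each token before dropping empties, so a whitespace-only token such as the one in the third row is not treated as a word.

```python
import pandas as pd

def tokenize(text):
    """Split on commas, strip whitespace, and drop empty tokens."""
    return {word.strip() for word in text.split(',') if word.strip()}

# The third row is a made-up edge case with a whitespace-only token.
df = pd.DataFrame({'A': ['hello,hi,bye', 'but,,,as well', 'one, ,two'],
                   'B': ['bye ,,, also', 'see,,,pandas', 'three, ,four']})

# A fresh zip is created inline, so nothing is reused between runs.
df['flag'] = [int(bool(tokenize(a) & tokenize(b)))
              for a, b in zip(df['A'], df['B'])]

print(df['flag'].tolist())   # [1, 0, 0]
```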
