
How to find if any word in a column matches any word from another column, row by row, in Python

I am trying to see if any word from ColA is contained in ColB, row by row, in a pandas DataFrame.

Example data:

ColA                    ColB            Match
this is some text       some text       TRUE
some more text          more            TRUE
another line text       nothing to see  FALSE
my final line           dog cats goats  FALSE

In pseudocode: split desc into words and split emp into words; if any word in emp equals any word in desc, the result is true, otherwise false.

Something like:

df['Match'] = df['colA'].str.split().apply(lambda x: 'true' if any x in df['ColB'].str.split() else 'false')

thx

You can use apply across each row, as follows:

import numpy as np
df.apply(lambda x: np.any([word in x.ColB.split(' ') for word in x.ColA.split(' ')]), axis=1)
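
A self-contained sketch of this row-wise approach, assuming the sample data from the question is loaded into a DataFrame named df (the construction below is an illustration, not part of the original answer). It uses the built-in any rather than np.any, which should give the same result here:

```python
import pandas as pd

# Sample data from the question (assumed column names: ColA, ColB).
df = pd.DataFrame({
    "ColA": ["this is some text", "some more text", "another line text", "my final line"],
    "ColB": ["some text", "more", "nothing to see", "dog cats goats"],
})

# For each row, check whether any word of ColA appears among the words of ColB.
df["Match"] = df.apply(
    lambda row: any(word in row["ColB"].split() for word in row["ColA"].split()),
    axis=1,
)

print(df["Match"].tolist())  # [True, True, False, False]
```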

Maybe use issubset, which requires every word in ColB to appear in ColA:

[set(y).issubset(set(x)) for x, y in zip(df.ColA.str.split(), df.ColB.str.split())]
Out[57]: [True, True, False, False]

If we only need one match:

[len(set(x) & set(y)) > 0 for x, y in zip(df.ColA.str.split(), df.ColB.str.split())]
Out[61]: [True, True, False, False]
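
The two list comprehensions above agree on the sample data, but they test different things. A small sketch with made-up strings (hypothetical, not from the question's data) shows a case where they diverge:

```python
# Hypothetical strings chosen to show the difference; not from the question's data.
col_a = "some more text"
col_b = "more words"

words_a = set(col_a.split())
words_b = set(col_b.split())

# True only if every word of col_b occurs in col_a.
all_words_match = words_b.issubset(words_a)

# True if at least one word is shared between the two strings.
any_word_matches = len(words_a & words_b) > 0

print(all_words_match, any_word_matches)  # False True
```

For the question as asked ("any word"), the intersection test is the one that matches the requirement.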

You can use a list comprehension with zip and a custom function:

def find_words(words, val):
    val_split = val.split()
    return any(x in val_split for x in words.split())

df['Match'] = [find_words(a, b) for a, b in zip(df['ColA'], df['ColB'])]
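
If the columns may contain missing values or mixed case, a slightly more defensive variant could look like the sketch below. The name find_words_safe and the choices to ignore case and to treat non-string values as "no match" are assumptions for illustration, not part of the original answer:

```python
def find_words_safe(words, val):
    """Return True if any word of `words` occurs in `val`, ignoring case.

    Hypothetical variant of find_words: non-string inputs (e.g. NaN or None)
    are treated as having no words, so they never match.
    """
    if not isinstance(words, str) or not isinstance(val, str):
        return False
    # A set gives constant-time membership checks for each candidate word.
    val_words = set(val.lower().split())
    return any(word in val_words for word in words.lower().split())


print(find_words_safe("This is Some text", "some TEXT"))  # True
print(find_words_safe("my final line", None))             # False
print(find_words_safe(float("nan"), "dog cats goats"))    # False
```

It can be used in the same list comprehension as above: df['Match'] = [find_words_safe(a, b) for a, b in zip(df['ColA'], df['ColB'])]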
