
How do I get strings that match substrings in a pandas df column in another column?

I have a list of strings, Skills, and a pandas DataFrame with a description in each row under a column labeled "Job Summary". I want to check whether any of the strings in Skills occurs as a substring in the "Job Summary" column. If there are matches, the matching string should appear in a column labeled Matches; if there is more than one match, they should appear as a list of strings. Right now I only get True or False, but I want the matching words themselves.

Here is what I currently have:

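     # Sample list (the real list is much longer)
     Skills = ['Science', 'Management', 'Equipment', 'Analysis']
     # Lowercase every skill (note: the list is named Skills, not skills)
     skills = list(map(str.lower, Skills))

     # Join the skills into a single regex alternation: 'science|management|...'
     joined = '|'.join(skills)

     df['Matches'] = df['Job Summary'].str.contains(joined)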

This fills df['Matches'] with True or False. I want the words that match instead.

Use str.findall, which returns a list of every match in each row:

import pandas as pd
df = pd.DataFrame({'Job Summary': ['Science Equipment', 'Analysis is Management']})
df['Job Summary'].str.findall('|'.join(Skills))
Out[95]: 
0      [Science, Equipment]
1    [Analysis, Management]
Name: Job Summary, dtype: object
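
If the casing in "Job Summary" may differ from the casing in Skills, or a skill may contain regex metacharacters, a variation along these lines might help. This is only a sketch: the sample rows are made up for illustration, and the use of re.escape and re.IGNORECASE is my assumption about what the asker needs, not part of the original answer.

```python
import re
import pandas as pd

Skills = ['Science', 'Management', 'Equipment', 'Analysis']

# Hypothetical sample data, including a row with no match
df = pd.DataFrame({'Job Summary': ['science equipment',
                                   'Analysis is Management',
                                   'Nothing relevant here']})

# Escape each skill so characters such as "+" or "." are matched literally
pattern = '|'.join(re.escape(s) for s in Skills)

# findall returns a list of all matches per row; IGNORECASE ignores casing
df['Matches'] = df['Job Summary'].str.findall(pattern, flags=re.IGNORECASE)
```

Rows without a match get an empty list, and each match keeps the casing it has in the text rather than the casing in Skills.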
