How do I get strings that match substrings in a pandas df column in another column?
I have a list of strings, Skills, and a pandas DataFrame with a description in each row under a column labeled "Job Summary". I want to check whether any of the strings in Skills occurs as a substring in the "Job Summary" column. If there are matches, the matching string should appear in a column labeled Matches. If there is more than one match, they should appear as a list of strings. Right now my code tells me True or False, but I want the matching words themselves.
See what I currently have below:
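# Sample list (real list is much longer)
Skills = ['Science', 'Management', 'Equipment', 'Analysis']
# Lowercase every skill (the original referenced an undefined name `skills` here)
skills = list(map(str.lower, Skills))
# Build a regex alternation: 'science|management|equipment|analysis'
joined = '|'.join(skills)
df['Matches'] = df['Job Summary'].str.contains(joined)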
The results in df['Matches'] tell me True or False. I want the word that matches.
Using str.findall:
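import pandas as pd
df = pd.DataFrame({'Job Summary': ['Science Equipment', 'Analysis is Management']})
# Skills is the list from the question; findall returns every match per row as a list
df['Job Summary'].str.findall('|'.join(Skills))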
Out[95]:
0      [Science, Equipment]
1    [Analysis, Management]
Name: Job Summary, dtype: object
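Note that the pattern above is case-sensitive and treats each skill as a regular expression. Here is a minimal sketch of a variant that matches regardless of case and escapes the skills first, assuming that is what the lowercasing in the question was aiming for. The sample rows are made up for illustration, and storing the result in a Matches column follows the question's wording rather than the answer above.

```python
import re
import pandas as pd

Skills = ['Science', 'Management', 'Equipment', 'Analysis']

# Made-up sample rows, including one with no match and one in lowercase
df = pd.DataFrame({'Job Summary': ['science equipment',
                                   'Analysis is Management',
                                   'No match here']})

# Escape each skill so any regex metacharacters in it are matched literally
pattern = '|'.join(map(re.escape, Skills))

# findall returns a list per row; rows without a match get an empty list
df['Matches'] = df['Job Summary'].str.findall(pattern, flags=re.IGNORECASE)
print(df)
```

The matches come back in the case used in the text, not the case used in Skills. Rows with a single match still hold a one-element list, so if a bare string is wanted in that case it would need an extra step.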