How to check which words from a list are contained in a string?
I want to collect every word from a list that is contained in a string in Python. I found some solutions, but so far I get:
data = "Today I gave my dog some carrots to eat in the car"
tweet = data.lower() #convert to lower case
split = tweet.split()
matchers = ['dog','car','sushi']
matching = [s for s in split if any(xs in s for xs in matchers)]
print(matching)
The result is:
['dog', 'carrots', 'car']
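How do I fix this so that the result is only dog and car, without adding spaces to my matchers?
Also, how would I remove any $ signs (as an example) from the data string, but no other special characters such as @?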
How do I fix that the result is only dog and car without adding spaces to my matchers?
To do this with your current code, replace this line:
matching = [s for s in split if any(xs in s for xs in matchers)]
with this:
matching = []
# iterate over all matcher words
for word in matchers:
    if word in split:  # check if word is in the split up words
        matching.append(word)  # add word to list
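The loop above works because `word in split` compares against whole words, whereas the original `xs in s` was a substring test (which is why 'carrots' matched 'car'). As a minimal sketch of the same idea in one line, assuming you are fine with turning the split words into a set (the name `words` is just illustrative):

```python
data = "Today I gave my dog some carrots to eat in the car"
words = set(data.lower().split())  # whole words only, lower-cased
matchers = ['dog', 'car', 'sushi']

# keep the matchers that appear as a complete word, in matcher order
matching = [word for word in matchers if word in words]
print(matching)  # ['dog', 'car']
```

The set only makes the membership test faster for long texts; the result is the same as with the explicit loop.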
You also mentioned this:
Also how would I remove any $ signs (as example) from the data string but no other special characters like @?
For that, I would create a list containing the characters to remove, like this:
things_to_remove = ['$', '*', '#'] # this can be anything you want to take out
Then simply remove each of those characters from the tweet string before splitting it.
for remove_me in things_to_remove:
    tweet = tweet.replace(remove_me, "")
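If the list of characters grows, a possible alternative to calling `replace` in a loop is `str.translate` with a deletion table built by `str.maketrans`. This is only a sketch of that variant; the names `table` and `cleaned` are made up for the example:

```python
tweet = "today i@@ gave my dog## some carrots to eat in the$ car"
things_to_remove = ['$', '*', '#']

# the third argument of str.maketrans lists the characters to delete
table = str.maketrans('', '', ''.join(things_to_remove))
cleaned = tweet.translate(table)
print(cleaned)  # today i@@ gave my dog some carrots to eat in the car
```

Characters that are not in the list, such as @, are left untouched. Note that this only works for single characters, while the `replace` loop also handles longer strings.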
So here is a final code block that demonstrates all of this:
data = "Today I@@ gave my dog## some carrots to eat in the$ car"
tweet = data.lower() #convert to lower case
things_to_remove = ['$', '*', '#']
for remove_me in things_to_remove:
    tweet = tweet.replace(remove_me, "")
print("After removing characters I don't want:")
print(tweet)
split = tweet.split()
matchers = ['dog','car','sushi']
matching = []
# iterate over all matcher words
for word in matchers:
    if word in split:  # check if word is in the split up words
        matching.append(word)  # add word to list
print(matching)
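One caveat with the split-based check: a word followed directly by punctuation (for example "car." at the end of a sentence) stays glued to that punctuation after `split()` and will not match. A different technique that may help in that case is a regular expression with word boundaries. The sketch below is an assumption about what you might want, not part of the code above, and `find_whole_words` is a hypothetical helper name:

```python
import re

def find_whole_words(text, matchers):
    """Return the matchers that occur in text as whole words (case-insensitive)."""
    lowered = text.lower()
    found = []
    for word in matchers:
        # \b marks a word boundary, so 'car' does not match inside 'carrots'
        pattern = r'\b' + re.escape(word.lower()) + r'\b'
        if re.search(pattern, lowered):
            found.append(word)
    return found

result = find_whole_words("My dog ate carrots in the car.", ['dog', 'car', 'sushi'])
print(result)  # ['dog', 'car']
```

`re.escape` is there so that matchers containing special regex characters are treated literally.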