
How to check which words from a list are contained in a string?

I want to collect every word from a list that is contained in a string in Python. I found some solutions, but this is what I have so far:

data = "Today I gave my dog some carrots to eat in the car"
tweet = data.lower()                             #convert to lower case
split = tweet.split()

matchers = ['dog','car','sushi']
matching = [s for s in split if any(xs in s for xs in matchers)]
print(matching)

The result is:

['dog', 'carrots', 'car']

How do I fix this so that the result is only dog and car, without adding spaces to my matchers?

Also, how would I remove any $ signs (as an example) from the data string, but no other special characters like @?

How do I fix that the result is only dog and car without adding spaces to my matchers?

To do this with your current code, replace this line:

matching = [s for s in split if any(xs in s for xs in matchers)]

With this:

matching = []
# iterate over all matcher words
for word in matchers:
    if word in split:  # check if word is in the split up words
        matching.append(word)  # add word to list
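
The same whole-word check can also be written as a list comprehension. This is only a minimal sketch, reusing the `split` and `matchers` names from the question; `words` is a hypothetical helper name introduced here, and the matchers are assumed to be lower case already:

```python
data = "Today I gave my dog some carrots to eat in the car"
split = data.lower().split()
matchers = ['dog', 'car', 'sushi']

# Assumption: `words` is a helper name that is not in the original code.
# A set makes each membership test cheap when the text is long.
words = set(split)

# Compare whole tokens: 'car' equals the token 'car' but not 'carrots'.
matching = [word for word in matchers if word in words]
print(matching)  # ['dog', 'car']
```

The difference from the original line is that `word in words` tests for an equal token, while `xs in s` tested whether `xs` is a substring of the token `s`, which is why 'carrots' was picked up.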

You also mentioned this:

Also how would I remove any $ signs (as example) from the data string but no other special characters like @?

To do that, I would create a list of the characters to remove, like this:

things_to_remove = ['$', '*', '#']  # this can be anything you want to take out

Then simply remove each of those characters from the tweet string before splitting it.

for remove_me in things_to_remove:
    tweet = tweet.replace(remove_me, "")
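
As an alternative to the loop, the removal can be done in one pass with `str.translate`. This is a sketch under the assumption that every entry in `things_to_remove` is a single character; `removal_table` and `cleaned` are names made up for this example:

```python
tweet = "today i@@ gave my dog## some carrots to eat in the$ car"
things_to_remove = ['$', '*', '#']

# The third argument of str.maketrans lists the characters to delete.
# Assumption: each entry is exactly one character long.
removal_table = str.maketrans('', '', ''.join(things_to_remove))

cleaned = tweet.translate(removal_table)
print(cleaned)  # today i@@ gave my dog some carrots to eat in the car
```

The `@` characters are left alone because they are not in the list. If an entry were longer than one character, the `replace` loop would be the safer choice.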

So here is a final code block that demonstrates all of these points:

data = "Today I@@ gave my dog## some carrots to eat in the$ car"
tweet = data.lower()                             #convert to lower case

things_to_remove = ['$', '*', '#']

for remove_me in things_to_remove:
    tweet = tweet.replace(remove_me, "")
print("After removing characters I don't want:")
print(tweet)

split = tweet.split()

matchers = ['dog','car','sushi']

matching = []
# iterate over all matcher words
for word in matchers:
    if word in split:  # check if word is in the split up words
        matching.append(word)  # add word to list
print(matching)
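
One case the code above does not cover is punctuation attached to a word, such as "car." at the end of a sentence, because `split()` only breaks on whitespace. A possible variation, offered only as a sketch, extracts the words with a regular expression instead. The function name `find_matching_words` and its parameters are hypothetical, and the matchers are assumed to be lower case:

```python
import re


def find_matching_words(text, matchers, chars_to_remove=''):
    """Return the matchers that occur as whole words in text.

    Hypothetical helper for illustration; not part of the original answer.
    """
    # Delete the unwanted single characters first, as in the answer.
    table = str.maketrans('', '', chars_to_remove)
    cleaned = text.lower().translate(table)

    # \w+ matches runs of letters, digits and underscores,
    # so "car." yields the token "car".
    tokens = set(re.findall(r"\w+", cleaned))

    # Keep the order of the matchers list.
    return [word for word in matchers if word in tokens]


result = find_matching_words(
    "My dog ate the carrots in the car.",
    ['dog', 'car', 'sushi'],
    chars_to_remove='$*#',
)
print(result)  # ['dog', 'car']
```

Note that `\w+` also drops characters such as `@`, so this variation only suits cases where those characters do not need to stay part of a token.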
