In Python, how do I remove items from a list based on a list of strings?

Question

I have a list of strings that I want to remove items from, and a list of keywords that I am searching for in these items. I cannot seem to get the output I am looking for, and I am not sure whether regular expressions are the right way to handle this.
I want the output to be ['/item/page/cat-dog', '/item/page/animal-planet']

valid = ['/item/page/cat-dog', '/item/page/animal-planet', '/item/page/variable']
keywords = ['cat','planet']


for item in valid:
    #a = re.findall()
    pass  # not sure what goes here

Answer 1

Python comes with the handy operators in and not in, which test whether an object is or is not in a list.

For your problem, you can simply do:

import os

new_list = []
for item in valid:
    # keep the item unless its final path component is exactly a keyword
    if os.path.basename(item) not in keywords:
        new_list.append(item)

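os.path.basename returns the final component of a path, without the leading directories. new_list will then contain all the elements of valid whose basename is not in keywords. Note that this is an exact comparison against the whole basename: none of the sample basenames ('cat-dog', 'animal-planet', 'variable') equals a keyword, so for the sample data every item is kept.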
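To make the difference between an exact membership test and a substring test concrete, here is a minimal sketch using the sample data from the question. The variable names basenames, exact and partial are illustrative only and do not come from the answer.

```python
import os

valid = ['/item/page/cat-dog', '/item/page/animal-planet', '/item/page/variable']
keywords = ['cat', 'planet']

# The final path component of each item.
basenames = [os.path.basename(item) for item in valid]
print(basenames)

# Exact membership: the whole basename must equal one of the keywords.
exact = [item for item in valid if os.path.basename(item) in keywords]
print(exact)

# Substring test: a keyword only has to occur somewhere in the basename.
partial = [item for item in valid
           if any(keyword in os.path.basename(item) for keyword in keywords)]
print(partial)
```

With this data the exact test selects nothing, while the substring test selects the two items listed as the desired output in the question.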

Answer 2

As far as I can tell, and following @dan-d's comment, what you need is:

[s for s in valid if not any(q in s for q in keywords)]
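The comprehension above drops every item that contains a keyword, which for the sample data leaves only '/item/page/variable'. If the goal is instead the output listed in the question, the condition is simply inverted. The sketch below shows both directions; the helper name contains_keyword is hypothetical and used only for illustration.

```python
valid = ['/item/page/cat-dog', '/item/page/animal-planet', '/item/page/variable']
keywords = ['cat', 'planet']


def contains_keyword(path, keywords):
    """Return True if any keyword occurs as a substring of path."""
    return any(keyword in path for keyword in keywords)


# Items that contain at least one keyword.
kept = [s for s in valid if contains_keyword(s, keywords)]

# Items that contain none of the keywords (same as the comprehension above).
dropped = [s for s in valid if not contains_keyword(s, keywords)]

print(kept)
print(dropped)
```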

Answer 3

As suggested in the comments and other answers, the in operator may be used to check whether a string is a substring of another string. For the example data in the question, using in is the simplest and fastest way to get the desired result.

If the requirement is to match '/item/page/cat-dog' but not '/item/page/catapult', that is, to match only the word 'cat' and not just the character sequence cat, then a regular expression may be used to do the matching.

The pattern to match a single word is '\\bfoo\\b' (equivalently, the raw string r'\bfoo\b'), where '\\b' marks a word boundary.
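A short sketch of how the word boundary behaves on paths like those in the question. The path '/item/page/cat_dog' is a made-up example added here to show one edge case.

```python
import re

word = re.compile(r'\bcat\b')

# '/' and '-' are not word characters, so 'cat' is bounded on both sides.
hyphen_hit = bool(word.search('/item/page/cat-dog'))

# In 'catapult' the 't' is followed by 'a', so there is no boundary after 'cat'.
catapult_hit = bool(word.search('/item/page/catapult'))

# A plain substring test cannot tell the two apart.
substring_hit = 'cat' in '/item/page/catapult'

# The underscore counts as a word character, so this does not match either.
underscore_hit = bool(word.search('/item/page/cat_dog'))

print(hyphen_hit, catapult_hit, substring_hit, underscore_hit)
```

The underscore case is worth keeping in mind if the paths being filtered use '_' as a separator rather than '-'.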

The alternation operator '|' is used to match one pattern or another; for example, 'foo|bar' matches 'foo' or 'bar'.

Construct a pattern that matches the words in keywords; call re.escape on each keyword in case it contains characters that the regex engine might interpret as metacharacters.

>>> import re
>>> pattern = r'|'.join(r'\b{}\b'.format(re.escape(keyword)) for keyword in keywords)
>>> pattern
'\\bcat\\b|\\bplanet\\b'
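The sample keywords contain no metacharacters, so re.escape leaves them unchanged. The following sketch uses a hypothetical keyword, 'v1.0', to show what escaping protects against; neither the keyword nor the test paths appear in the question.

```python
import re

keyword = 'v1.0'  # hypothetical keyword containing the metacharacter '.'

unescaped = re.compile(keyword)            # '.' matches any character
escaped = re.compile(re.escape(keyword))   # '.' matches only a literal dot

false_positive = bool(unescaped.search('/item/page/v1x0'))
rejected = bool(escaped.search('/item/page/v1x0'))
literal_hit = bool(escaped.search('/item/page/v1.0'))

print(re.escape(keyword))
print(false_positive, rejected, literal_hit)
```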

Compile the pattern into a regular expression object.

>>> rx = re.compile(pattern)

Find the matches. Using filter is elegant:

>>> matches = list(filter(rx.search, valid))
>>> matches
['/item/page/cat-dog', '/item/page/animal-planet']

But it's more common to use a list comprehension:

>>> matches = [word for word in valid if rx.search(word)]
>>> matches
['/item/page/cat-dog', '/item/page/animal-planet']
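The steps above can be collected into one function. This is only a sketch: the name filter_by_keywords and the decision to return an empty list when no keywords are given are assumptions made here, not part of the answer. The guard exists because joining zero keywords produces an empty pattern, which would match every string.

```python
import re


def filter_by_keywords(items, keywords):
    """Return the items that contain any of the keywords as a whole word."""
    if not keywords:
        # An empty pattern matches everything, so handle this case explicitly.
        return []
    pattern = '|'.join(r'\b{}\b'.format(re.escape(k)) for k in keywords)
    rx = re.compile(pattern)
    return [item for item in items if rx.search(item)]


valid = ['/item/page/cat-dog', '/item/page/animal-planet', '/item/page/variable']
keywords = ['cat', 'planet']

# '/item/page/catapult' is added to show that partial words are not matched.
matches = filter_by_keywords(valid + ['/item/page/catapult'], keywords)
print(matches)
```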
