Filter a list of strings so that none contain any string from another list as a substring

Question

I have the following code to select the values that are not contained in another list.

import re
isbn  = ["1111","2222","3333","4444","5555"]
sku = ["k1 1111", "k2 2222", "k3 3333", "k4 4444", "k5 5555", "k6 6666", "k7 7777", "k8 8888" ,"k9 1111"]

for x in isbn:
    for i in sku:
        if x not in i:
            print (i)

The expected outcome should look like this:

k6 6666
k7 7777
k8 8888

But I get all the unmatched values. How can I get the expected outcome shown above?

Answer 1

You should be using any() within your loop. In fact, you can achieve this with a list comprehension:

>>> list_1  = ["1111","2222","3333","4444","5555"]
>>> list_2 = ["k1 1111", "k2 2222", "k3 3333", "k4 4444", "k5 5555", "k6 6666", "k7 7777", "k8 8888" ,"k9 1111"]

>>> [x for x in list_2 if not any( y in x for y in list_1)]
['k6 6666', 'k7 7777', 'k8 8888']

Here, any() returns True if any string in list_1 is present as a substring of the current element x of list_2. As soon as it finds a match, it short-circuits the iteration (without checking for other matches) and returns True.
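The short-circuiting can be observed directly. The sketch below is only an illustration: contains() is a hypothetical helper (not part of the answer) that records each string it is asked to test, so the recorded list shows how far any() got before stopping.

```python
def contains(needle, haystack, checked):
    """Substring test that also records which needle was tried (hypothetical helper)."""
    checked.append(needle)
    return needle in haystack

list_1 = ["1111", "2222", "3333", "4444", "5555"]

# A value that matches the second entry: any() stops after two checks.
checked_hit = []
hit = any(contains(y, "k2 2222", checked_hit) for y in list_1)

# A value that matches nothing: any() has to try every entry.
checked_miss = []
miss = any(contains(y, "k6 6666", checked_miss) for y in list_1)

print(hit, checked_hit)         # True ['1111', '2222']
print(miss, len(checked_miss))  # False 5
```

Note that the generator expression is what makes this lazy; passing a list comprehension to any() would evaluate every test before any() sees the first result.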

If you would rather not use any(), you can get the same result with the following for loop:

for x in list_2:
    for y in list_1:
        if y in x:
            break
    else:
        print(x)

which will print your desired output:

k6 6666
k7 7777
k8 8888
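The loop above relies on the else clause of a for loop, which runs only when the loop finishes without hitting break. As a minimal sketch, the same logic can be wrapped in a function that returns a list instead of printing; the name without_matches is made up for this example.

```python
def without_matches(candidates, needles):
    """Return the candidates that contain none of the needles as a substring."""
    kept = []
    for candidate in candidates:
        for needle in needles:
            if needle in candidate:
                break  # a match was found; skip the else clause
        else:
            # Runs only if the inner loop ended without break,
            # i.e. no needle occurred in this candidate.
            kept.append(candidate)
    return kept

list_1 = ["1111", "2222", "3333", "4444", "5555"]
list_2 = ["k1 1111", "k2 2222", "k3 3333", "k4 4444", "k5 5555",
          "k6 6666", "k7 7777", "k8 8888", "k9 1111"]

result = without_matches(list_2, list_1)
print(result)  # ['k6 6666', 'k7 7777', 'k8 8888']
```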

Answer 2

You need to test all the values in isbn before you can conclude that none of them match.

Rather than looping over isbn first, loop over sku and test each value against all of the isbn values; the any() function makes that easier and more efficient:

for value in sku:
    if not any(i in value for i in isbn):
        print(value)

More efficient still would be to split out the ISBN portion and test it against a set:

isbn_set = set(isbn)
for value in sku:
    isbn_part = value.partition(' ')[-1]  # everything after the first space
    if isbn_part not in isbn_set:
        print(value)

This avoids looping over isbn altogether; set membership testing takes O(1) constant time on average; for N SKUs and M ISBN values, this makes an O(N) loop (vs. an O(NM) loop with any()).

Either version can be converted to a list comprehension to produce a list of the unmatched values; the preferred set version then becomes:

isbn_set = set(isbn)
not_matched = [value for value in sku if value.partition(' ')[-1] not in isbn_set]

Demo of the latter:

>>> isbn  = ["1111","2222","3333","4444","5555"]
>>> sku = ["k1 1111", "k2 2222", "k3 3333", "k4 4444", "k5 5555", "k6 6666", "k7 7777", "k8 8888" ,"k9 1111"]
>>> isbn_set = set(isbn)
>>> [value for value in sku if value.partition(' ')[-1] not in isbn_set]
['k6 6666', 'k7 7777', 'k8 8888']

Answer 3

If you remove the matches from a set, then the leftover set is what you are after:

Code:

skus = set(sku)
for x in isbn:
    skus -= {i for i in skus if x in i}

Test Code:

isbn = ["1111", "2222", "3333", "4444", "5555"]
sku = ["k1 1111", "k2 2222", "k3 3333", "k4 4444", "k5 5555", "k6 6666",
       "k7 7777", "k8 8888", "k9 1111"]

skus = set(sku)
for x in isbn:
    skus -= {i for i in skus if x in i}
print(skus)

Results (a set is unordered, so the elements may print in a different order):

{'k6 6666', 'k7 7777', 'k8 8888'}
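Because a set does not preserve order, the result may not come out in the order of the original list. If order matters, one possible follow-up step (an addition, not part of the answer) is to filter the original list against the leftover set; the names remaining and ordered are chosen for this sketch.

```python
isbn = ["1111", "2222", "3333", "4444", "5555"]
sku = ["k1 1111", "k2 2222", "k3 3333", "k4 4444", "k5 5555", "k6 6666",
       "k7 7777", "k8 8888", "k9 1111"]

# Same set-subtraction idea as above.
remaining = set(sku)
for x in isbn:
    remaining -= {i for i in remaining if x in i}

# Walk the original list so the output keeps its order.
ordered = [i for i in sku if i in remaining]
print(ordered)  # ['k6 6666', 'k7 7777', 'k8 8888']
```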
