Remove strings from a list of single-item string lists if they are not in another list

Question

I have two lists of strings as follows:

good_tags = ['c#', '.net', 'java']

all_tags = [['c# .net datetime'],
            ['c# datetime time datediff relative-time-span'], 
            ['html browser timezone user-agent timezone-offset']]

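My goal is to keep only the tags from 'good_tags' within each list of strings in 'all_tags'. For example,

the first row of 'all_tags': [c# .net datetime]
should become (based on the strings I want to keep, listed in 'good_tags'): [c# .net]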

I tried using "in" instead of "not in", based on the question "Remove all the elements that occur in one list from another":

y3 = [x for x in all_tags if x in good_tags]
print ('y3: ', y3) 
y4 = [x for x in good_tags if x in all_tags]
print ('y4: ', y4)

Output:

y3:  []
y4:  []

Answer 1

First of all, you don't have two lists of strings. You have a list of strings and a list of lists of strings.

good_tags = ['c#', '.net', 'java']

all_tags = [['c# .net datetime'],['c# datetime time datediff relative-time-span'], ['html browser timezone user-agent timezone-offset']]

all_tags_with_good_tags = []

for tags in all_tags:
    new_good_tags = set()
    for tag in tags[0].split():  # here you have list, so you need to select 0 element 
                                 #  of it as there's only 1 list element in your example 
                                 #  and then split it on the whitespace to be a list of tags
        if tag in good_tags:
            new_good_tags.add(tag)
    if new_good_tags:
        all_tags_with_good_tags.append(' '.join(new_good_tags))

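This gives you the following (the order of tags within each string may vary, because sets are unordered):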

['.net c#', 'c#']
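
Because new_good_tags is a set, the tag order inside each joined string is not guaranteed. The following is a minimal sketch of an order-preserving variant, assuming you want the tags to stay in the order they appear in the original string. The helper name keep_good_tags is hypothetical and not part of the answer above.

```python
def keep_good_tags(all_tags, good_tags):
    """Return one space-joined string of good tags per row, skipping rows with none."""
    good = set(good_tags)  # a set gives fast membership tests
    result = []
    for row in all_tags:
        # Each row is a one-element list, so split its only string into tags.
        # dict.fromkeys drops duplicates while keeping first-seen order.
        kept = list(dict.fromkeys(t for t in row[0].split() if t in good))
        if kept:
            result.append(' '.join(kept))
    return result


good_tags = ['c#', '.net', 'java']
all_tags = [['c# .net datetime'],
            ['c# datetime time datediff relative-time-span'],
            ['html browser timezone user-agent timezone-offset']]

print(keep_good_tags(all_tags, good_tags))  # ['c# .net', 'c#']
```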

Answer 2

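Your all_tags is a list containing three lists, each of which contains a single string. So the first thing you need to do is convert each sublist into a list of separate tag strings, rather than one single string.

Since the tags are separated only by whitespace, with no commas, you have to turn the list ['c# .net datetime'] into ['c#', '.net', 'datetime']: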

[x for segments in all_tags[0] for x in segments.split()]

Then you can do this for the whole list by iterating over its length:

[[x for segments in all_tags[entry] for x in segments.split()] for entry in range(len(all_tags))]

This returns:

[['c#', '.net', 'datetime'],
 ['c#', 'datetime', 'time', 'datediff', 'relative-time-span'],
 ['html', 'browser', 'timezone', 'user-agent', 'timezone-offset']]

Now you can filter this list against your good tags:

y3 = [[x for x in [words for segments in all_tags[entry] for words in segments.split()] if x in good_tags] for entry in range(len(all_tags))]

Output:

[['c#', '.net'], ['c#'], []]
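
The same two steps can also be written by iterating over the sublists directly instead of over range(len(all_tags)). This is only a sketch of an equivalent formulation; the intermediate name split_tags is an assumption introduced for readability.

```python
good_tags = ['c#', '.net', 'java']
all_tags = [['c# .net datetime'],
            ['c# datetime time datediff relative-time-span'],
            ['html browser timezone user-agent timezone-offset']]

# Step 1: split every string in every sublist into individual tags.
split_tags = [[word for segment in row for word in segment.split()]
              for row in all_tags]

# Step 2: keep only the tags that appear in good_tags.
y3 = [[word for word in row if word in good_tags] for row in split_tags]

print(y3)  # [['c#', '.net'], ['c#'], []]
```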

Answer 3

good_tags = ['c#', '.net', 'java']

all_tags = [
    ['c# .net datetime'],
    ['c# datetime time datediff relative-time-span'],
    ['html browser timezone user-agent timezone-offset']
]

filtered_tags = [[" ".join(filter(lambda tag: tag in good_tags, row[0].split()))] for row in all_tags]
print(filtered_tags)

Output:

[['c# .net'], ['c#'], ['']]

Answer 4

A short solution using a set instead of a list:

good_tags = {'c#', '.net', 'java'}  # this is a set
all_tags = [['c# .net datetime'],
            ['c# datetime time datediff relative-time-span'],
            ['html browser timezone user-agent timezone-offset']]


result = [set(lst[0].split()) & good_tags for lst in all_tags]

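& computes the intersection of the two sets.

But the real question is: why does all_tags contain lists with only one element each? There is probably a better way to build this list in the first place.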
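
As a rough illustration of both points, the sketch below first flattens all_tags into a plain list of strings and then applies the same intersection. The names flat_tags and ordered are assumptions, and sorting is only one possible way to get a stable order out of the unordered sets.

```python
good_tags = {'c#', '.net', 'java'}  # this is a set
all_tags = [['c# .net datetime'],
            ['c# datetime time datediff relative-time-span'],
            ['html browser timezone user-agent timezone-offset']]

# Drop the one-element wrappers so each row is just a string.
flat_tags = [lst[0] for lst in all_tags]

# Intersect the tags of each row with the good tags.
result = [set(row.split()) & good_tags for row in flat_tags]

# Sets have no defined order, so sort each one for a reproducible result.
ordered = [sorted(tags) for tags in result]
print(ordered)  # [['.net', 'c#'], ['c#'], []]
```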

Answer 5

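First statement: when "for x in all_tags" runs, each x is a list such as ['c# .net datetime'], and the 'c# .net datetime' inside it is a single string that is never split into individual tags.

Second statement: with x = ['c# .net datetime'] from the first statement, that whole list is looked up in good_tags, which does not contain it, so nothing is returned.

Condition 1: if good_tags were ['c#', '.net', 'java', ['c# .net datetime']], the comprehension would keep ['c# .net datetime'].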
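
The explanation above can be checked with a few membership tests. This is a small sketch; the name good_tags_with_list is an assumption used only to reproduce Condition 1.

```python
good_tags = ['c#', '.net', 'java']
all_tags = [['c# .net datetime'],
            ['c# datetime time datediff relative-time-span'],
            ['html browser timezone user-agent timezone-offset']]

first = all_tags[0]
print(type(first))         # <class 'list'>
print(first in good_tags)  # False: good_tags holds strings, not lists
print('c#' in all_tags)    # False: all_tags holds lists, not strings

# Condition 1: membership succeeds only if the whole list is an element.
good_tags_with_list = good_tags + [['c# .net datetime']]
y3 = [x for x in all_tags if x in good_tags_with_list]
print(y3)  # [['c# .net datetime']]
```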

Here is the solution to your problem:

good_tags = ['c#', '.net', 'java']

all_tags = [['c# .net datetime'], ['c# datetime time datediff relative-time-span'],
            ['html browser timezone user-agent timezone-offset']]


#y3 = [x for x in all_tags if x in good_tags]
all_tags_refine = []
for x in all_tags:
    y = x[0].split()

    z = [k for k in y if k in good_tags]
    all_tags_refine.append(z)

print(all_tags_refine)

Answer 6

There may be a better way to do this, but here it is:

good_tags = ['c#', '.net', 'java']

all_tags = [['c# .net datetime'],['c# datetime time datediff relative-time-span'], ['html browser timezone user-agent timezone-offset']]

for tags in all_tags:
    empty = []
    for tag in tags[0].split(" "):
        if tag in good_tags:
            empty.append(tag)
    print(" ".join(empty))

Answer 7

good_tags = ['c#', '.net', 'java']

all_tags = [['c# .net datetime'],['c# datetime time datediff relative-time-span'], ['html browser timezone user-agent timezone-offset']]

new_tags = []

for _ in all_tags:
    tags = _[0].split()
    newtag = ''
    for tag in tags:
        if tag in good_tags:
            if newtag == '':
                newtag = tag
            else:
                newtag = newtag + ' ' + tag
                
    if newtag != '':
        l = []
        l.append(newtag)
        new_tags.append(l)
        
print(new_tags)

Answer 8

good_set = set(good_tags)
kept_tags = [[t for t in tags[0].split() if t in good_set] 
    for tags in all_tags]
print(kept_tags)
# [['c#', '.net'], ['c#'], []]
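
The comprehension above assumes every sublist holds exactly one string. If that is not guaranteed, a slightly more general sketch could loop over every string in each sublist. The function name filter_tags and the sample rows are hypothetical and only serve as an illustration.

```python
def filter_tags(rows, allowed):
    """Keep only allowed tags, handling sublists with zero, one, or many strings."""
    allowed_set = set(allowed)  # convert once for fast lookups
    return [[tag for text in row for tag in text.split() if tag in allowed_set]
            for row in rows]


rows = [['c# .net datetime'], ['java html', 'c# css'], []]
print(filter_tags(rows, ['c#', '.net', 'java']))
# [['c#', '.net'], ['java', 'c#'], []]
```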


8 answers

Answer 1
0 2020-09-24 09:30:56

Answer 2
0 2020-09-24 09:44:37

Answer 3
0 2020-09-24 09:45:33

Answer 4
0 2020-09-24 09:55:08

Answer 5
0 2020-09-24 09:58:21

Answer 6
-1 2020-09-24 09:40:44

Answer 7
-1 2020-09-24 09:50:42

Answer 8
-1 2020-09-24 09:56:27
