Remove a sublist from a list

Question

I want to do the following in Python:

A = [1, 2, 3, 4, 5, 6, 7, 7, 7]
C = A - [3, 4]  # Should be [1, 2, 5, 6, 7, 7, 7]
C = A - [4, 3]  # Should not be removing anything, because sequence 4, 3 is not found

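So I simply want to remove the first occurrence of a sublist (as a sequence) from another list. How can I do that?

Edit: I am talking about lists, not sets. This means that the order (sequence) of the items matters, both in A and in B, and so do duplicates.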

Answer 1

Use sets:

C = list(set(A) - set(B))

If you want to keep duplicates and/or order:

filter_set = set(B)
C = [x for x in A if x not in filter_set]
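
As a quick check against the lists in the question, the sketch below runs both snippets with B = [4, 3]. The function names `set_difference` and `filter_members` are made up for illustration, and `sorted` is swapped in for `list` only to get a stable print order. Both versions treat B as a collection of values rather than a contiguous sequence, so they drop 3 and 4 even though the sequence 4, 3 never occurs in A.

```python
def set_difference(a, b):
    # Drops duplicates; sorted() is used because set iteration order is not guaranteed.
    return sorted(set(a) - set(b))

def filter_members(a, b):
    # Keeps order and duplicates, but removes every element whose value is in b.
    filter_set = set(b)
    return [x for x in a if x not in filter_set]

A = [1, 2, 3, 4, 5, 6, 7, 7, 7]
B = [4, 3]
print(set_difference(A, B))   # [1, 2, 5, 6, 7]
print(filter_members(A, B))   # [1, 2, 5, 6, 7, 7, 7]
```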

Answer 2

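If you want to remove the exact sequence, here is one way:

Find the bad indices by checking whether each slice of A matches the desired sequence: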

A = [1, 2, 3, 4, 5, 6, 7, 7, 7]
B = [3, 4]
bad_ind = [list(range(i, i + len(B))) for i in range(len(A)) if A[i:i+len(B)] == B]
print(bad_ind)
# [[2, 3]]

Since this returns a list of lists, flatten it and turn it into a set:

bad_ind_set = {item for sublist in bad_ind for item in sublist}
print(bad_ind_set)
# {2, 3}

Now use this set to filter your original list by index:

C = [x for i,x in enumerate(A) if i not in bad_ind_set]
print(C)
#[1, 2, 5, 6, 7, 7, 7]
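
The three steps above can be folded into a single helper. This is only a sketch: the name `remove_all_occurrences` and its parameters are made up here, and the empty-pattern guard is an added assumption. Note that overlapping matches are all marked, so removing [1, 1] from [1, 1, 1] leaves an empty list.

```python
def remove_all_occurrences(seq, pattern):
    """Return a copy of seq without any element that belongs to a match of pattern."""
    if not pattern:
        return list(seq)  # assumption: an empty pattern removes nothing
    n = len(pattern)
    # Collect every index covered by some match (overlapping matches included).
    bad = {j
           for i in range(len(seq) - n + 1)
           if seq[i:i + n] == pattern
           for j in range(i, i + n)}
    return [x for i, x in enumerate(seq) if i not in bad]

print(remove_all_occurrences([1, 2, 3, 4, 5, 3, 4], [3, 4]))  # [1, 2, 5]
print(remove_all_occurrences([1, 1, 1], [1, 1]))              # []
```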

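The bad_ind_set above removes every match of the sequence. If you only want to remove the first match, it is even simpler: you only need the first element of bad_ind (no need to flatten the list):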

bad_ind_set = set(bad_ind[0]) if bad_ind else set()  # empty set when there is no match

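Update: Here is a way to find and remove the first matching subsequence with a short-circuiting for loop. It will be faster because it breaks out of the loop as soon as the first match is found.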

start_ind = None
for i in range(len(A)):
    if A[i:i+len(B)] == B:
        start_ind = i
        break

C = [x for i, x in enumerate(A) 
     if start_ind is None or not(start_ind <= i < (start_ind + len(B)))]
print(C)
#[1, 2, 5, 6, 7, 7, 7]
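
The short-circuit loop can also be wrapped in a small function that slices around the match instead of filtering by index. This is a sketch under the same assumptions as above; the name `remove_first_sublist` is hypothetical.

```python
def remove_first_sublist(seq, pattern):
    """Return a new list with the first contiguous occurrence of pattern removed."""
    n = len(pattern)
    for i in range(len(seq) - n + 1):
        if seq[i:i + n] == pattern:
            # Keep everything before and after the match.
            return seq[:i] + seq[i + n:]
    return list(seq)  # no match: return an unchanged copy

A = [1, 2, 3, 4, 5, 6, 7, 7, 7]
print(remove_first_sublist(A, [3, 4]))  # [1, 2, 5, 6, 7, 7, 7]
print(remove_first_sublist(A, [4, 3]))  # [1, 2, 3, 4, 5, 6, 7, 7, 7]
```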

Answer 3

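I see this problem as similar to substring search, so substring search algorithms such as KMP (Knuth-Morris-Pratt) and BM (Boyer-Moore) can be applied here. If you need to support multiple patterns, there are multi-pattern algorithms as well, such as Aho-Corasick and Wu-Manber.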

Below is a Python implementation of the KMP algorithm, taken from a GitHub Gist. PS: I am not the author; I just want to share the idea.

class KMP:
    def partial(self, pattern):
        """ Calculate partial match table: String -> [Int]"""
        ret = [0]

        for i in range(1, len(pattern)):
            j = ret[i - 1]
            while j > 0 and pattern[j] != pattern[i]:
                j = ret[j - 1]
            ret.append(j + 1 if pattern[j] == pattern[i] else j)
        return ret

    def search(self, T, P):
        """
        KMP search main algorithm: String -> String -> [Int]
        Return all matching positions of pattern P in text T
        """
        partial, ret, j = self.partial(P), [], 0

        for i in range(len(T)):
            while j > 0 and T[i] != P[j]:
                j = partial[j - 1]
            if T[i] == P[j]: j += 1
            if j == len(P):
                ret.append(i - (j - 1))
                j = 0

        return ret
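
To make the partial match table concrete, the standalone function below repeats the logic of `partial` (the name `prefix_table` is made up for this sketch) and prints the table for a small list pattern. Each entry is the length of the longest proper prefix of the pattern that is also a suffix of the pattern up to that position.

```python
def prefix_table(pattern):
    """Same computation as KMP.partial, written as a plain function."""
    table = [0]
    for i in range(1, len(pattern)):
        j = table[i - 1]
        # Fall back to shorter borders until the next element can extend one.
        while j > 0 and pattern[j] != pattern[i]:
            j = table[j - 1]
        table.append(j + 1 if pattern[j] == pattern[i] else j)
    return table

print(prefix_table([1, 2, 1, 2, 3]))  # [0, 0, 1, 2, 0]
```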

Then use it to find the match positions, and finally remove the match:

A = [1, 2, 3, 4, 5, 6, 7, 7, 7, 3, 4]
B = [3, 4]
result = KMP().search(A, B)
print(result)
# assumes at least one match was found; result[0] raises IndexError otherwise
print(A[:result[0]] + A[result[0] + len(B):])

Output:

[2, 9]
[1, 2, 5, 6, 7, 7, 7, 3, 4]

PS: You can try other algorithms as well. Unless you care a great deal about performance, @Pault's answer is already good enough.
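
Since the question only asks for the first occurrence, a KMP variant can stop at the first match and return the list unchanged when nothing is found. The following is a self-contained sketch, not the Gist author's code; `kmp_find_first` and `remove_first_kmp` are hypothetical names, and treating an empty pattern as matching at index 0 is an assumption.

```python
def kmp_find_first(text, pattern):
    """Return the start index of the first occurrence of pattern in text, or -1."""
    if not pattern:
        return 0  # assumption: an empty pattern matches at the start
    # Build the failure (partial match) table for the pattern.
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[k] != pattern[i]:
            k = fail[k - 1]
        if pattern[k] == pattern[i]:
            k += 1
        fail[i] = k
    # Scan the text once, reusing the table after a mismatch.
    j = 0
    for i, item in enumerate(text):
        while j > 0 and item != pattern[j]:
            j = fail[j - 1]
        if item == pattern[j]:
            j += 1
        if j == len(pattern):
            return i - j + 1
    return -1

def remove_first_kmp(text, pattern):
    """Return a new list with the first occurrence of pattern removed, if any."""
    start = kmp_find_first(text, pattern)
    if start < 0:
        return list(text)
    return text[:start] + text[start + len(pattern):]

A = [1, 2, 3, 4, 5, 6, 7, 7, 7, 3, 4]
print(remove_first_kmp(A, [3, 4]))  # [1, 2, 5, 6, 7, 7, 7, 3, 4]
print(remove_first_kmp(A, [4, 3]))  # [1, 2, 3, 4, 5, 6, 7, 7, 7, 3, 4]
```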

Answer 4

Here is another approach:

# Returns the start and end index (end exclusive) of the first occurrence of the sublist, or None if it is not found.

def findSublist(subList, inList):
    subListLength = len(subList)
    # +1 so that a match ending at the last element is also checked
    for i in range(len(inList) - subListLength + 1):
        if subList == inList[i:i+subListLength]:
            return (i, i+subListLength)
    return None


# Removes the sublist if it exists and returns a new list; otherwise returns the original list.

def removeSublistFromList(subList, inList):
    indices = findSublist(subList, inList)
    if indices is not None:
        return inList[0:indices[0]] + inList[indices[1]:]
    else:
        return inList


A = [1, 2, 3, 4, 5, 6, 7, 7, 7]

s1 = [3,4]
B = removeSublistFromList(s1, A)
print(B)

s2 = [4,3]
C = removeSublistFromList(s2, A)
print(C)
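
If mutating the original list is acceptable, a variant can delete the matching slice in place with `del` instead of building a new list. This is a sketch; the name `remove_sublist_in_place` and the boolean return value are assumptions made for illustration. The loop bound includes the final start position, so a match at the very end of the list is found.

```python
def remove_sublist_in_place(sub_list, in_list):
    """Delete the first occurrence of sub_list from in_list; return True if removed."""
    n = len(sub_list)
    for i in range(len(in_list) - n + 1):
        if in_list[i:i + n] == sub_list:
            del in_list[i:i + n]  # mutates in_list
            return True
    return False

A = [1, 2, 3, 4, 5, 6, 7, 7, 7]
print(remove_sublist_in_place([7, 7], A), A)  # True [1, 2, 3, 4, 5, 6, 7]
print(remove_sublist_in_place([4, 3], A), A)  # False [1, 2, 3, 4, 5, 6, 7]
```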

Answer 5

A = [1, 2, 3, 4, 5, 6, 7, 7, 7]
C = [3, 4]
A.remove(C[0])  # removes the first 3
A.remove(C[1])  # removes the first 4
print(A)  # [1, 2, 5, 6, 7, 7, 7]
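
Note that `list.remove` deletes the first occurrence of a single value, so calling it once per element is not the same as removing a contiguous run. A small illustration, with a hypothetical helper name `remove_each_value` and made-up data for the second case:

```python
def remove_each_value(values, in_list):
    """Remove the first occurrence of each value independently (order-insensitive)."""
    out = list(in_list)
    for v in values:
        out.remove(v)  # raises ValueError if v is not present
    return out

A = [1, 2, 3, 4, 5, 6, 7, 7, 7]
# The sequence 4, 3 never occurs in A, yet both values are removed.
print(remove_each_value([4, 3], A))                # [1, 2, 5, 6, 7, 7, 7]
# The elements removed need not be adjacent in the list.
print(remove_each_value([3, 4], [3, 1, 4, 3, 4]))  # [1, 3, 4]
```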

Answer 1: 8, 2018-04-09 16:39:18
Answer 2: 3, 2018-04-09 16:51:49
Answer 3: 3, 2018-04-09 17:07:15
Answer 4: 1, 2018-04-09 18:48:17
Answer 5: 0, 2021-12-29 10:35:21
