Checking if list is a sublist

I need to check if list1 is a sublist of list2 (True if every integer in list2 that is common with list1 is in the same order of indexes as in list1).

def sublist(lst1,lst2):
    for i in range(len(lst1)):
        if lst1[i] not in lst2:
            return False
        for j in range(len(lst2)):
            if (lst1[j] in lst2) and (lst2.index(lst1[i+1]) > lst2.index(lst1[i])):
                return True

Can anybody help me... why isn't this working?

Your code isn't working because as soon as a list element in lst1 doesn't occur in lst2 it returns False immediately.

This creates two lists that contain only the common elements (in their original order) and then returns True when they are the same:

def sublist(lst1, lst2):
   ls1 = [element for element in lst1 if element in lst2]
   ls2 = [element for element in lst2 if element in lst1]
   return ls1 == ls2

Edit: A memory-efficient variant:

def sublist(ls1, ls2):
    '''
    >>> sublist([], [1,2,3])
    True
    >>> sublist([1,2,3,4], [2,5,3])
    True
    >>> sublist([1,2,3,4], [0,3,2])
    False
    >>> sublist([1,2,3,4], [1,2,5,6,7,8,5,76,4,3])
    False
    '''
    def get_all_in(one, another):
        for element in one:
            if element in another:
                yield element

    for x1, x2 in zip(get_all_in(ls1, ls2), get_all_in(ls2, ls1)):
        if x1 != x2:
            return False

    return True
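
One difference between the two versions above is worth noting: zip stops at the shorter stream, so the generator version can accept inputs that the list-comparison version rejects when duplicates make the filtered sequences differ in length (for example [1, 1] against [1]). A minimal sketch, assuming you want the lazy version to agree with the list version, pads the shorter stream with a sentinel via itertools.zip_longest. The name sublist_lazy and the _MISSING sentinel are illustrative and not part of the original answer.

```python
from itertools import zip_longest

_MISSING = object()  # sentinel that does not equal any ordinary list element


def sublist_lazy(ls1, ls2):
    """Lazily compare the common elements of ls1 and ls2, in order."""
    def get_all_in(one, another):
        for element in one:
            if element in another:
                yield element

    pairs = zip_longest(get_all_in(ls1, ls2), get_all_in(ls2, ls1),
                        fillvalue=_MISSING)
    # A padded pair holds _MISSING on one side, so it never compares equal
    return all(x1 == x2 for x1, x2 in pairs)


print(sublist_lazy([], [1, 2, 3]))            # True
print(sublist_lazy([1, 2, 3, 4], [2, 5, 3]))  # True
print(sublist_lazy([1, 2, 3, 4], [0, 3, 2]))  # False
print(sublist_lazy([1, 1], [1]))              # False, same as the list version
```

It still stops at the first mismatch, so the memory advantage of the generator version is kept.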

An easy way to check whether all elements of one list are in another is to convert both to sets:

def sublist(lst1, lst2):
    return set(lst1) <= set(lst2)
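
Note that a set comparison answers a different question from the one asked: it ignores both order and duplicates, and it needs hashable elements. A small illustration (the helper name is_subset is only for this example):

```python
def is_subset(lst1, lst2):
    """True if every distinct element of lst1 occurs somewhere in lst2."""
    return set(lst1) <= set(lst2)


# Order is ignored: the elements of [3, 1] appear in [1, 2, 3] in reverse
print(is_subset([3, 1], [1, 2, 3]))     # True
# Duplicates are ignored: a single 2 in lst2 is enough for three in lst1
print(is_subset([2, 2, 2], [1, 2, 3]))  # True
# A genuinely missing element is still detected
print(is_subset([1, 4], [1, 2, 3]))     # False
```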

Another way to do this is with collections.Counter, although @L3viathan's second answer is the fastest in the timings below.

def sublist1(lst1, lst2):
    ls1 = [element for element in lst1 if element in lst2]
    ls2 = [element for element in lst2 if element in lst1]
    return ls1 == ls2


def sublist2(lst1, lst2):
    def get_all_in(one, another):
        for element in one:
            if element in another:
                yield element
    for x1, x2 in zip(get_all_in(lst1, lst2), get_all_in(lst2, lst1)):
        if x1 != x2:
            return False
    return True


def sublist3(lst1, lst2):
    from collections import Counter
    c1 = Counter(lst1)
    c2 = Counter(lst2)
    for item, count in c1.items():
        if count > c2[item]:
            return False
    return True


l1 = ["a", "b", "c", "c", "c", "d", "e"]
l2 = ["c", "a", "c", "b", "c", "c", "d", "d", "f", "e"]

s1 = lambda: sublist1(l1, l2)
s2 = lambda: sublist2(l1, l2)
s3 = lambda: sublist3(l1, l2)

from timeit import Timer
t1, t2, t3 = Timer(s1), Timer(s2), Timer(s3)
print(t1.timeit(number=10000))  # => 0.034193423241588035
print(t2.timeit(number=10000))  # => 0.012621842119714115
print(t3.timeit(number=10000))  # => 0.12714286673722477

His second way is about ten times faster than the Counter variant here, but I wanted to mention Counter because it is widely used and useful outside of this scenario.
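
For reference, the multiset check in sublist3 can also be written with Counter subtraction, which keeps only positive counts, so an empty result means nothing in the first list is left uncovered. This is a sketch of an equivalent formulation (the name is_multisubset is made up here), and like sublist3 it ignores order entirely:

```python
from collections import Counter


def is_multisubset(lst1, lst2):
    """True if lst2 contains every element of lst1 at least as many times."""
    # Counter subtraction drops zero and negative counts, so an empty
    # result means lst2 covers lst1 with multiplicity.
    return not (Counter(lst1) - Counter(lst2))


l1 = ["a", "b", "c", "c", "c", "d", "e"]
l2 = ["c", "a", "c", "b", "c", "c", "d", "d", "f", "e"]
print(is_multisubset(l1, l2))         # True
print(is_multisubset(["c"] * 5, l2))  # False: l2 has only four "c"
```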

Another easy way is to use a generator expression with the built-in function all to verify that all items in list1 are contained in list2.

Example:

list1 = ['1','2']
list2 = ['1','2',3]

all(i in list2 for i in list1)

Memory-efficient solution based on M. Morgan's answer. It takes into consideration that, in order to be a sublist, the sublist must be found in the same order in the super list.

Variable k keeps track of the number of matched elements. When this equals the length of our sublist we can return True.

Variable s keeps track of the starting value. I keep track of this so that, in a test case like sublist(["1", "1", "2"],["0", "1", "1", "1", "2", "1", "2"]), extraneous repeats of the first entry don't reset the current index when unmatched. Once the starting value changes, s becomes irrelevant, so this case does not fire in the middle of a pattern.

def sublist(sublist, lst):
    if not isinstance(sublist, list):
        raise ValueError("sublist must be a list")
    if not isinstance(lst, list):
        raise ValueError("lst must be a list")

    sublist_len = len(sublist)
    k=0
    s=None

    if (sublist_len > len(lst)):
        return False
    elif (sublist_len == 0):
        return True

    for x in lst:
        if x == sublist[k]:
            if (k == 0): s = x
            elif (x != s): s = None
            k += 1
            if k == sublist_len:
                return True
        elif k > 0 and sublist[k-1] != s:
            k = 0

    return False

With b as the sublist and a as the list, search for b by splitting a into slices of the length of b.

e.g.

>>> a, b = [2,4,3,5,7], [4,3]
>>> b in [a[i:len(b)+i] for i in range(len(a))]
True

>>> a, b = [2,4,3,5,7], [4,10]
>>> b in [a[i:len(b)+i] for i in range(len(a))]
False

def sublist(l1,l2):
    # compare the space-joined string forms of both lists
    s1=" ".join(str(i) for i in l1)
    s2=" ".join(str(i) for i in l2)
    return s1 in s2
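
String joining is sensitive to element boundaries: with a bare separator, 1 can match inside 11. A hedged sketch of the difference, where both function names are invented for this illustration and elements are assumed not to contain the separator in their string form:

```python
def naive_join_sublist(l1, l2):
    """Join with spaces only; text can match across element boundaries."""
    s1 = " ".join(str(i) for i in l1)
    s2 = " ".join(str(i) for i in l2)
    return s1 in s2


def delimited_join_sublist(l1, l2, sep=","):
    """Wrap every element in separators so only whole elements match."""
    if not l1:
        return True  # treat the empty list as a sublist of anything
    s1 = sep + sep.join(str(i) for i in l1) + sep
    s2 = sep + sep.join(str(i) for i in l2) + sep
    return s1 in s2


print(naive_join_sublist([1], [11, 2]))                 # True, a false positive
print(delimited_join_sublist([1], [11, 2]))             # False
print(delimited_join_sublist([4, 3], [2, 4, 3, 5, 7]))  # True
```

Both versions compare string forms, so the integer 1 and the string "1" are still indistinguishable.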

I found that all of the above consider ['a','b','d'] to be a sublist of ['a','b','c','e','d'], which may not be what you want even though all of the elements of the sublist are present in the list. So, to maintain the order, I came up with:

def sublist4(sublist,lst):
    #Define a temp array to hold the current run of matches
    sub_list=[]
    #Define two counters to iterate in the while loop
    i=0  #position in lst
    k=0  #position in sublist
    #Loop over lst until the whole sublist has been matched
    while i < len(lst) and k < len(sublist):
        #If the element continues the match, append it to the temp array
        if lst[i] == sublist[k]:
            sub_list.append(lst[i])
            k += 1
        #Otherwise rewind to just after the start of this run and reset
        else:
            i -= k
            k = 0
            sub_list = []

        i += 1

    return sub_list == sublist

Whilst this isn't very memory efficient, I find it works quite well with small lists.

def has_ordered_intersection(xs, ys):
    common = {*xs} & {*ys}
    return all(x == y for x, y in zip((x for x in xs if x in common),
                                      (y for y in ys if y in common)))

This passes @L3viathan's doctest with fewer lines of code, using a similar strategy to the "memory-efficient variant", and with arguably greater overall efficiency.

>>> has_ordered_intersection([], [1,2,3])
True
>>> has_ordered_intersection([1,2,3,4], [2,5,3])
True
>>> has_ordered_intersection([1,2,3,4], [0,3,2])
False
>>> has_ordered_intersection([1,2,3,4], [1,2,5,6,7,8,5,76,4,3])
False

I used the intersection set instead of a generator because I think the extra memory is a good tradeoff compared to the time cost of short-circuit scanning the entire list per element (which is what in does to a list), especially if the lists are long.
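
The cost difference mentioned here comes from in scanning a list element by element, while a set lookup is a hash probe. A rough way to see it on your own machine; the list sizes and repeat count below are arbitrary choices for this sketch, and the timings will vary:

```python
from timeit import timeit

ys = list(range(1_000))
xs = list(range(500, 1_500))
common = set(xs) & set(ys)


def filter_with_list(xs, ys):
    # `x in ys` scans the list, so this is O(len(xs) * len(ys)) overall
    return [x for x in xs if x in ys]


def filter_with_set(xs, common):
    # `x in common` is a hash lookup, O(1) on average
    return [x for x in xs if x in common]


# Both filters select the same elements in the same order
assert filter_with_list(xs, ys) == filter_with_set(xs, common)

print("list membership:", timeit(lambda: filter_with_list(xs, ys), number=5))
print("set membership: ", timeit(lambda: filter_with_set(xs, common), number=5))
```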

I also don't think this should be called a "sublist", since xs is allowed to have elements that ys does not. The above relation is symmetric: swapping the arguments doesn't change the answer. A real ordered "sublist" would not be symmetric and would look more like this:

def is_ordered_sublist(xs, ys):
    xset = {*xs}
    return all(x == y for x, y in zip(xs, (y for y in ys if y in xset)))

It's easy with iterators.

>>> a = [0,1,2]
>>> b = [item for item in range(10)]
>>> b
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> a
[0, 1, 2]
>>> [False, True][set([item in b for item in a]) == set([True])]
True
>>> a = [11, 12, 13]
>>> [False, True][set([item in b for item in a]) == set([True])]
False

Try this one! Sublist y is not missing the sequence of list x.

x = list

y = sublist

if ([i for i,j in enumerate(y) for k,l in enumerate(x) if i == k and j!=l]):
    print("True")
else:
    print("False")

I have come up with a short way to check for a sublist:

lst1=[1,2,5,6,8,3,2,34,3,4]
lst2=[1,2,3,4]


def sublist(lst1,lst2):
    for item in lst2:
        try:
            lst1.index(item)
        except ValueError:
            return False
    return True


print(sublist(lst1,lst2))

What I have done is basically take two lists: lst1 is the larger list and lst2 is the sublist that we are checking for. Then I take each element of lst2 and check whether it is in lst1 by looking for its index.

If it can't find even a single item, it returns False.

If all the items are covered, it returns True.

Another way is to move through all possible sublists and return once a match is found:

def is_sublist(ys, xs):
    for i in range(len(xs) - len(ys) + 1):
        if xs[i:i + len(ys)] == ys:
            return True
    return False
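
The boundary in window scans like this is easy to get wrong: the last valid start index is len(xs) - len(ys), so the range needs the + 1. A compact sketch of the same idea using any (the name contains_window is hypothetical):

```python
def contains_window(ys, xs):
    """True if ys occurs in xs as one contiguous run."""
    n = len(ys)
    # Start indices run from 0 through len(xs) - n inclusive
    return any(xs[i:i + n] == ys for i in range(len(xs) - n + 1))


print(contains_window([4, 3], [2, 4, 3, 5, 7]))  # True
print(contains_window([5, 7], [2, 4, 3, 5, 7]))  # True: match at the very end
print(contains_window([4, 5], [2, 4, 3, 5, 7]))  # False: not adjacent
print(contains_window([], [2, 4]))               # True: empty list matches
print(contains_window([1, 2, 3], [1, 2]))        # False: longer than xs
```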

Find all indexes in l1 where the element matches the first element of l2, then loop over this list of indexes and, for each one, take the slice of l1 with the same length as l2. If the l1 slice is equal to l2, then l2 is a sublist of l1.

Ex:

l1 = [1,2,3,2,1,1,3,3,4,5]

l2 = [2,1,1,3,3]

True

l1 = [1,2,3,2,1,3,3,4,5]

l2 = [2,1,1,3,3]

False

def is_sublist(l1, l2):
    index_list = [i for i, v in enumerate(l1) if v==l2[0]]
    for ii in index_list:
        l1_slice = l1[ii:ii+len(l2)]
        if l1_slice == l2:
            return True
    else:
        return False

What's wrong with the following?

def sublist(lst1, lst2):
    return all([(x in lst2) for x in lst1])

It will return True if every item in lst1 also exists in lst2.

def lis1(item,item1):
    # item is the full list, item1 the candidate sublist
    for x in range(len(item)):
        if item[x] == item1[0]:
            n = 1
            # extend the match while staying inside both lists
            while n < len(item1) and x + n < len(item) and item[x + n] == item1[n]:
                n += 1
            if n == len(item1):
                return True
    return False
a = [2,3,4,5,6]
b = [5,6]
c = [2,7,6]
print(lis1(a,b))
print(lis1(a,c))

#list1 = ['1','2',"4"]  # works
#list2 = ['1','2',3]

lst2 = [4,8,9,33,44,67,123]
lst1 = [8,33,7] # works!


def sublist(lst1, lst2):
    'checks whether list lst1 is a sublist of list lst2'
    index1 = 0  # lst1 index
    index2 = 0  # lst2 index

    # go through indexes of lst1
    while index1 < len(lst1):

        # search for item in lst2 matching item in lst1 at index index1
        while index2 < len(lst2) and lst1[index1] != lst2[index2]:
            index2 += 1

        # if we run out of items in lst2, lst1 is not a sublist of lst2
        if index2 == len(lst2):
            return False
        # advance both indexes so a matched item in lst2 is not reused
        index1 += 1
        index2 += 1

    # every item in lst1 has been matched to an item in lst2, from left to right
    return True

print( sublist(lst1, lst2))

I needed to know if the first list is a sub-list of the second one. The order was important to me. I've tried some of the solutions, but they are too 'generic' for my needs. I also wanted to make sure that the two lists are not equal. Here's the solution. Note that it only accepts lst1 when it matches the beginning of lst2.

def sublist(lst1, lst2):
    len1 = len(lst1)
    len2 = len(lst2)

    if len1 >= len2:
        return False

    for i in range(0, len1):
        if lst1[i] != lst2[i]:
            return False

    return True

How about just using an index that runs along list2 as we do the comparison?

def is_ordered_sublist(lst1: list, lst2: list) -> bool:
    """ Checks if lst1 is an ordered sublist of lst2 """

    try:
        index = 0
        for item in lst1:
            # search only the part of lst2 after the previous match
            index = lst2.index(item, index) + 1
        return True
    except ValueError:
        return False

Basically, for each item in list1 it simply finds the first index at which it appears in the second list. Thereafter it only needs to consider the remaining part of list2. So the worst-case complexity is simply O(len(list2)).

This code attempts to find list1 in list2 by scanning list2. It searches list2 for the first item of list1 and then checks whether successive items in list1 also match at the location in list2 where the first item was found. If the first 2 of 4 items in list1 match at a location in list2 but the 3rd does not, it will not spend time comparing the 4th.

def ordered_sublist(l1, l2):
    length = len(l1)
    for i in range(len(l2) - length + 1):
        if all(l1[j] == l2[j + i] for j in range(length)):
            return True
    return False

I think this is the best way to solve this problem. This will check if list1 is a sublist of list2. We will assume that all elements are unique. If we have duplicate elements, the following code will only ensure that each element of list1 is contained in list2. Hence, we do not take multiplicity into account.

list1 = [2, 3, 3, 4, 5, 9]
list2 = [1, 2, 3, 4, 5, 6, 7, 8, 9]

set(list1).issubset(set(list2))

Lists are data structures where the order of the elements matters.

I understand that this question explicitly specifies "same order of indexes", but in general, when you say "sublist", this is not necessarily the only restriction that applies. The relative position between each element may also be a restriction.

In my particular case, list1=[1,2,3,4] and list2=[1,2,4]: list2 is not a sublist of list1, but list3=[2,3,4] is a sublist of list1.

Just for the sake of completeness, I am posting my code to find sublists where the relative index of each element should also be preserved.

def is_sublist(list1, list2):
    # try every position in list1 where list2 could start
    for first_index in range(len(list1) - len(list2) + 1):
        for j in range(len(list2)):
            # compare at the same offset relative to first_index
            if list1[first_index + j] != list2[j]:
                break
        else:
            # no mismatch: list2 occurs in list1 starting at first_index
            return True
    return False
print(is_sublist(['r1','r2','r3','r4','r6'],['r1','r2','r3']))
#>> True
print(is_sublist(['r1','r2','r3','r4','r6'],['r2','r3','r4']))
#>> True
print(is_sublist(['r1','r2','r3','r4','r6'],['r1','r2','r4']))
#>> False

Definition:

  • List A is a sublist of list B if the exact sequence of elements of A exists in B.
  • An empty list is a sublist of any list.

The following function returns the index of the first occurrence of list_a in list_b, otherwise -1 is returned. For empty list_a, 0 is returned.

def sublist(list_a, list_b):
    if 0 == len(list_a):
        return 0

    if len(list_b) < len(list_a):
        return -1

    idx = -1
    while list_a[0] in list_b[idx+1:]:
        idx = list_b.index(list_a[0], idx + 1)
        if list_a == list_b[idx:idx+len(list_a)]:
            return idx

    return -1

Some tests:

>>> sublist([], [])
0
>>> sublist([], [1, 2, 3])
0
>>> sublist([3, 6], [1, 2, 3, 6, 3, 7, 9, 8, 0, 3, 6])
2
>>> sublist([3, 7, 9, 8], [1, 2, 3, 6, 3, 7, 9, 8, 0, 3, 6])
4
>>> sublist([3, 6, 3, 7, 9, 8, 0, 3, 6], [1, 2, 3, 6, 3, 7, 9, 8, 0, 3, 6])
2
>>> sublist([1, 2, 3, 6, 3, 7, 9, 8, 0, 3, 6, 4], [1, 2, 3, 6, 3, 7, 9, 8, 0, 3, 6])
-1
>>> sublist([3, 7, 4], [1, 2, 3, 6, 3, 7, 9, 8, 0, 3, 6])
-1

Here's a lazy-iteration, generic version for checking whether an iterable is a subsequence of another iterable:

from typing import Iterable

def is_subsequence(a: Iterable, b: Iterable) -> bool:
    b_iterator = iter(b)
    for x in a:
        for y in b_iterator:
            if y == x:
                break
        else:
            return False
    else:
        return True
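
A common shorthand for the same idea relies on the fact that x in iterator consumes the iterator up to and including the first match, so each search resumes where the previous one stopped. A sketch (the function name is chosen here for illustration):

```python
from typing import Iterable


def is_subsequence_short(a: Iterable, b: Iterable) -> bool:
    """True if the items of a appear in b in the same relative order."""
    b_iterator = iter(b)
    # Each membership test advances b_iterator past the matched item
    return all(x in b_iterator for x in a)


print(is_subsequence_short([1, 2, 4], [1, 2, 3, 4]))  # True
print(is_subsequence_short([2, 1], [1, 2, 3, 4]))     # False
print(is_subsequence_short([], [1, 2]))               # True
print(is_subsequence_short("ace", "abcde"))           # True
```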
