简体   繁体   中英

How to check if a list (string) contains another list (string) considering order

I have two list (or string): one is big, and the other one is small. I want to check whether the bigger one (A) contains the small one (B) or not.

My expectation is as follows:

Case 1. B is a subset of A

A = [1,2,3]
B = [1,2] 
contains(A, B) = True

Case 2. B is not a subset of A, but order [1,2] is maintained in A

A = [1,3,2]
B = [1,2]
contains(A, B) = True

Case 3. False because 4 in not A

A = [1,3,2]
B = [1,4]
contains(A, B) = False

Case 4. False because the order [2,1] not maintained in A, even though A contains 1 and 2.

A = [1,3,2]
B = [2,1]
contains(A, B) = False

A and B could be string.

Direct Imperative Approach

I'm pretty sure checking whether one list is a sublist of another is a classic greedy algorithm. We can scan the larger list, trying to find each item in the smaller list in order. We never need to backtrack, because the first occurrence of each element is fine.

def contains(larger, smaller):
  # Take an iterator so that we always pick up where we left off.
  larger_iter = iter(larger)
  for s in smaller:
    for l in larger_iter:
      if s == l:
        break
    else:
      # We'll enter the else block if we *didn't* break in the loop,
      # in which case we never found a match for s.
      return False
  return True

This will run linear in the size of the larger list, because we iterate it at most once.

Functional Approach

Edit. I wondered last night if there were a smaller (line-wise) solution that was still linear, and I have one now that I like.

def contains(larger, smaller):
  larger_iter = iter(larger)
  return all(s in larger_iter for s in smaller)

This is following the exact same algorithm as above, just using a higher-level function to handle some of the bookkeeping. s in larger_iter corresponds to the inner for-loop with else block, and all with generator corresponds to the outer for-loop.

I believe that this answer should work if you just don't remove things from the sublist that aren't in the test list. So for the case of the first method provided there

def contains(testList, subList):
   shared = [x for x in testList if x in subList]
   return shared == subList

You could also convert the sublist to work with non-list inputs.

def contains(testList, subList):
   shared = [x for x in testList if x in subList]
   return shared == list(subList)

You can convert your lists into set() groups. Example:

A = set(A)
B = set(B)
print(A <= B)

You can subset a <= b method. Good workings

You can use collections.deque for an O(n) solution:

from collections import deque
def contains(a, b):
  b = deque(b)
  for i in a:
     if b and i == b[0]:
        _ = b.popleft()
  return not bool(b)

data = [([1, 2, 3], [1, 2]), ([1, 3, 2], [1, 2]), ([1, 3, 2], [1, 4]), ([1, 3, 2], [2, 1])]
print([contains(*i) for i in data])

Output

[True, True, False, False]

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM