Check if a string contains the list elements

How to check if a string contains the elements in a list?

str1 = "45892190"
lis = [89,90]

str1 = "45892190"
lis = [89,90]

for i in lis:
    if str(i) in str1:
        print("The value " + str(i) + " is in the string")

OUTPUT:

The value 89 is in the string
The value 90 is in the string

If you want to check whether all the values in lis are in str1, cricket_007's code is what you are looking for:

all(str(l) in str1 for l in lis)
out: True
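To make the limits of the plain membership test concrete, here is a minimal sketch. The helper name contains_all is made up for illustration; it only wraps the all() expression above. Note that it counts matches that share characters, which matters if overlap is not allowed.

```python
def contains_all(string, numbers):
    # Hypothetical wrapper around the all() one-liner: True when every
    # number, written as a string, occurs somewhere in `string`.
    return all(str(number) in string for number in numbers)


print(contains_all("45892190", [89, 90]))  # True
print(contains_all("890", [89, 90]))       # True, although "89" and "90" share the "9"
print(contains_all("45892190", [89, 91]))  # False
```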

If no overlap is allowed, this problem becomes much harder than it looks at first. As far as I can tell, no other answer is correct (see test cases at the end).

Recursion is needed because, if a substring appears more than once, using one occurrence instead of another could prevent the other substrings from being found.
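As an illustration of that point, here is a rough sketch of a greedy, non-recursive check that always consumes the first occurrence it finds. The function name greedy_contains_all is hypothetical and is not one of the two functions of this answer; it only shows how committing to one occurrence can hide a valid solution.

```python
def greedy_contains_all(string, numbers, separator="x"):
    # Greedy strategy: always consume the FIRST occurrence of each number.
    for number in numbers:
        substring = str(number)
        index = string.find(substring)
        if index == -1:
            return False
        # Replace the consumed characters so they cannot be matched again.
        string = string[:index] + separator + string[index + len(substring):]
    return True


# Taking the first "1234" destroys the only "2345", so the greedy check fails,
# even though the second "1234" would have left "2345" intact.
print(greedy_contains_all("123451234", [1234, 2345]))  # False
# With the numbers in the other order, the same input succeeds.
print(greedy_contains_all("123451234", [2345, 1234]))  # True
```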

This answer uses two functions. The first one finds every occurrence of a substring in a string and returns an iterator of strings in which that occurrence has been replaced by a character that shouldn't appear in any substring.

The second function recursively checks if there's any way to find all the numbers in the string:

def find_each_and_replace_by(string, substring, separator='x'):
    """
    list(find_each_and_replace_by('8989', '89', 'x'))
    # ['x89', '89x']
    list(find_each_and_replace_by('9999', '99', 'x'))
    # ['x99', '9x9', '99x']
    list(find_each_and_replace_by('9999', '89', 'x'))
    # []
    """
    index = 0
    while True:
        index = string.find(substring, index)
        if index == -1:
            return
        yield string[:index] + separator + string[index + len(substring):]
        index += 1


def contains_all_without_overlap(string, numbers):
    """
    contains_all_without_overlap("45892190", [89, 90])
    # True
    contains_all_without_overlap("45892190", [89, 90, 4521])
    # False
    """
    if len(numbers) == 0:
        return True
    substrings = [str(number) for number in numbers]
    substring = substrings.pop()
    return any(contains_all_without_overlap(shorter_string, substrings)
               for shorter_string in find_each_and_replace_by(string, substring, 'x'))

Here are the test cases:

tests = [
    ("45892190", [89, 90], True),
    ("8990189290", [89, 90, 8990], True),
    ("123451234", [1234, 2345], True),
    ("123451234", [2345, 1234], True),
    ("123451234", [1234, 2346], False),
    ("123451234", [2346, 1234], False),
    ("45892190", [89, 90, 4521], False),
    ("890", [89, 90], False),
    ("8989", [89, 90], False),
    ("8989", [12, 34], False)
]

for string, numbers, should in tests:
    result = contains_all_without_overlap(string, numbers)
    if result == should:
        print("Correct answer for %-12r and %-14r (%s)" % (string, numbers, result))
    else:
        print("ERROR : %r and %r should return %r, not %r" %
              (string, numbers, should, result))

And the corresponding output:

Correct answer for '45892190'   and [89, 90]       (True)
Correct answer for '8990189290' and [89, 90, 8990] (True)
Correct answer for '123451234'  and [1234, 2345]   (True)
Correct answer for '123451234'  and [2345, 1234]   (True)
Correct answer for '123451234'  and [1234, 2346]   (False)
Correct answer for '123451234'  and [2346, 1234]   (False)
Correct answer for '45892190'   and [89, 90, 4521] (False)
Correct answer for '890'        and [89, 90]       (False)
Correct answer for '8989'       and [89, 90]       (False)
Correct answer for '8989'       and [12, 34]       (False)

If you want non-overlapping matches I'd do it like this:

  • create a copy of the initial string (as we'll modify it)
  • go through each element of the list and if we find the element in our string, we replace it with x
  • at the same time, if we find the number in our string, we increment a counter
  • at the end, if the counter equals the length of the list, it means that all of its elements are there

str1 = "45890190"
lis1 = [89, 90]

copy, i = str1, 0
for el in lis1:
    if str(el) in copy:
        copy = copy.replace(str(el), 'x')
        i = i + 1

if i == len(lis1):
    print(True)

Moreover, we don't really need a counter if we add an extra condition that returns False as soon as an element isn't found in the string. That gives the following, final solution:

def all_matches(_list, _string):
    str_copy = _string
    for el in _list:
        if str(el) not in str_copy:
            return False
        str_copy = str_copy.replace(str(el), 'x')
    return True

Which you can test by writing:

str1 = "4589190"
lis1 = [89, 90]
print(all_matches(lis1, str1))
# True

This might not be the best solution for what you're looking for, but I guess it serves the purpose.
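As a quick sanity check (a sketch, not part of the original answer), all_matches can be run against one of the test cases listed with the recursive answer above. Because str.replace substitutes every occurrence at once, the result appears to depend on the order of the list:

```python
def all_matches(_list, _string):
    # Same logic as the function above, repeated so the snippet runs on its own.
    str_copy = _string
    for el in _list:
        if str(el) not in str_copy:
            return False
        str_copy = str_copy.replace(str(el), 'x')
    return True


# The test suite expects True for this input, whatever the order of the list.
print(all_matches([89, 90, 8990], "8990189290"))  # False: both "89"s are consumed first
print(all_matches([8990, 89, 90], "8990189290"))  # True
```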

You can use the all() function:

In [1]: str1 = "45892190"
   ...: lis = [89,90]
   ...: all(str(l) in str1 for l in lis)
   ...:
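Out[1]: True

def contains(s, elems):
    for elem in elems:
        index = s.find(elem)
        if index == -1:
            return False
        # Replace the match with a placeholder so its characters cannot be
        # reused and the text on either side cannot join into a new match.
        s = s[:index] + 'x' + s[index + len(elem):]
    return True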

Usage:

>>> str1 = "45892190"
>>> lis = [89,90]
>>> contains(str1, (str(x) for x in lis))
True
>>> contains("890", (str(x) for x in lis))
False

You can use a regular expression to search:

import re
str1 = "45892190"
lis = [89,90]
for i in lis:
  x = re.search(str(i), str1)
  print(x)
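The loop above prints a match object (or None) for each number. If a single True/False answer is wanted, the searches can be combined with all(), roughly as in the sketch below. The name all_found is made up for illustration, and re.escape is only a precaution, since plain digits contain no special regex characters. Like the all() answer, this does not rule out overlapping matches.

```python
import re


def all_found(string, numbers):
    # True when re.search finds every number (as text) somewhere in `string`.
    return all(re.search(re.escape(str(number)), string) is not None
               for number in numbers)


print(all_found("45892190", [89, 90]))  # True
print(all_found("45892190", [89, 77]))  # False
```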

It is possible to implement this correctly using regular expressions. Generate all unique permutations of the input, connect the terms of each permutation with ".*", then connect all of the permutations with "|". For example, [89, 90, 8990] gets turned into 89.*8990.*90| 89.*90.*8990| 8990.*89.*90| 8990.*90.*89| 90.*89.*8990| 90.*8990.*89 , where I added a space after each "|" for clarity.

The following passes Eric Duminil's test suite.

import itertools
import re

def create_numbers_regex(numbers):
    # Convert each into a string, and double-check that it's an integer
    numbers = ["%d" % number for number in numbers]

    # Convert to unique regular expression terms
    regex_terms = set(".*".join(permutation)
                            for permutation in itertools.permutations(numbers))
    # Create the regular expression. (Sorted so the order is invariant.)
    regex = "|".join(sorted(regex_terms))
    return regex

def contains_all_without_overlap(string, numbers):
    regex = create_numbers_regex(numbers)
    pat = re.compile(regex)
    m = pat.search(string)
    if m is None:
        return False
    return True

However, and this is a big however, the regular expression size, in the worst case, grows as the factorial of the number of numbers. Even with only 8 unique numbers, that's 40320 regex terms. It takes Python several seconds just to compile that regex.
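To get a feel for that growth without compiling anything, the number of terms can simply be counted. This is a small sketch; count_regex_terms is a hypothetical helper that mirrors how the terms are built above.

```python
import itertools


def count_regex_terms(numbers):
    # Build the same ".*"-joined terms as the regex and count the unique ones.
    strings = [str(number) for number in numbers]
    return len({".*".join(p) for p in itertools.permutations(strings)})


print(count_regex_terms([89, 90, 8990]))   # 6
print(count_regex_terms([89, 89, 90]))     # 3, duplicate numbers collapse
print(count_regex_terms(range(10, 15)))    # 120, i.e. 5!
```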

The only case where this solution might be useful is when you have a handful of numbers and want to search a lot of strings. In that case, you might also look into re2, which I believe could handle that regex without backtracking.
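For that "few numbers, many strings" case, the pattern can be compiled once and reused. The sketch below assumes the standard re module rather than re2, and compile_numbers_pattern is a made-up name that builds the pattern the same way as the code above.

```python
import itertools
import re


def compile_numbers_pattern(numbers):
    # Same construction as above: one ".*"-joined term per unique permutation.
    strings = ["%d" % number for number in numbers]
    terms = {".*".join(p) for p in itertools.permutations(strings)}
    return re.compile("|".join(sorted(terms)))


# Compile once, then reuse the compiled pattern for every string.
pattern = compile_numbers_pattern([89, 90])
for candidate in ["45892190", "890", "8989"]:
    print(candidate, pattern.search(candidate) is not None)
# 45892190 True
# 890 False
# 8989 False
```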
