
Check if a subset of characters in a list of strings is contained in another list of strings

So I have two lists of strings. Each string is a sorted combination of one or more distinct characters. The characters are not the whole alphabet; they come from a given set.

Let's say, all the possible characters are [A, B, C, D, E], then the two lists have a combination of those elements (from 1 up to 5 in this case).

Example:

list1 = [AB, AB, C]
list2 = [ABC, CD, ABCDE, E]

The number of elements in each list is not fixed, but it can range from 1 to 30, with the typical case being around 10.

Now, what I want is to tell whether there is at least one way to pick one character from each string in list1 such that the picked characters also appear in some pick of one character per string from list2, regardless of order. In the example, [A, A, C] is contained in list2 with [A, C, A, E].

The naive way I found to do this is to generate all possible one-character-per-string combinations for each list and check whether there is at least one case where a combination from list1 is contained in one from list2. But this grows exponentially: a 10-element list of 5-character strings already has 5^10 combinations, almost ten million (and that's only the typical case).
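For reference, the naive approach described above could be sketched roughly as follows. The function name `contained_brute_force` is made up for illustration, and the sketch assumes every string is non-empty. It is only practical for small inputs.

```python
from collections import Counter
from itertools import product


def contained_brute_force(list1, list2):
    """Try every one-character-per-string pick from both lists (hypothetical helper)."""
    # Distinct multisets of characters obtainable from list2, one char per string.
    picks2 = {tuple(sorted(p)) for p in product(*list2)}
    for p1 in {tuple(sorted(p)) for p in product(*list1)}:
        need = Counter(p1)
        for p2 in picks2:
            have = Counter(p2)
            # Multiset containment: each needed character is available often enough.
            if all(have[ch] >= n for ch, n in need.items()):
                return True
    return False


print(contained_brute_force(["AB", "AB", "C"], ["ABC", "CD", "ABCDE", "E"]))  # True
```

Sorting each pick before storing it removes orderings that differ only by position, but the number of picks still grows exponentially with the list length.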

I have thought of using regular expressions or something like that, but I am really not picturing a more efficient solution.

I am using Python for this, in case that is relevant because of an existing solution or library.

Thank you for your help!

This may be a prime candidate for set operations. Let's take your example (note that we need to add quotes to make the elements strings).

list1 = ["AB", "AB", "C"]
list2 = ["ABC", "CD", "ABCDE", "E"]

If we want a set with unique elements from both list1 and list2

print(set(list1) | set(list2))
#OUTPUT: {'C', 'AB', 'ABCDE', 'CD', 'ABC', 'E'}

If we want to check which elements are common to both list1 and list2 (if we were to add "C" to list2, the output would be {'C'}; otherwise there are no common elements, which results in an empty set() )

print(set(list1) & set(list2)) 
#OUTPUT: set()

If we want the elements that are in list1 but not in list2

print(set(list1) - set(list2))
#OUTPUT: {'C', 'AB'}

If we want a set with the elements that are in either list1 or list2, but not in both

print(set(list1) ^ set(list2)) 
#OUTPUT: {'E', 'CD', 'AB', 'ABC', 'C', 'ABCDE'}
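The operators above compare whole strings. To apply them to the characters inside each string instead, here is one possible sketch. It assumes the question amounts to pairing every string in list1 with a distinct string in list2 that shares at least one character with it (checked with `&` on character sets), and that all strings are non-empty. The names `can_match` and `try_assign` are made up for this example, and the pairing step is a standard augmenting-path bipartite matching, not something taken from a library.

```python
def can_match(list1, list2):
    """Return True if every string in list1 can be paired with a distinct
    string in list2 that shares at least one character with it."""
    sets2 = [set(s) for s in list2]
    # candidates[i]: indices of list2 strings sharing a character with list1[i]
    candidates = [
        [j for j, chars in enumerate(sets2) if set(s) & chars]
        for s in list1
    ]
    owner = {}  # list2 index -> list1 index currently paired with it

    def try_assign(i, seen):
        for j in candidates[i]:
            if j in seen:
                continue
            seen.add(j)
            # Take j if it is free, or if its current owner can move elsewhere.
            if j not in owner or try_assign(owner[j], seen):
                owner[j] = i
                return True
        return False

    return all(try_assign(i, set()) for i in range(len(list1)))


list1 = ["AB", "AB", "C"]
list2 = ["ABC", "CD", "ABCDE", "E"]
print(can_match(list1, list2))  # True
```

With lists of at most 30 short strings, this avoids enumerating every combination: the work is polynomial in the list lengths rather than exponential.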

For more information you can check out https://docs.python.org/2/library/sets.html

I hope this helped!


 