Find the occurrence of: any one of the substrings (whichever first) stored in a list; in a bigger string in Python

Question

I'm new to Python. I've gone through other answers, and I'm fairly sure this is not a duplicate.

Basically, say I want to find the occurrence of one of the substrings (stored in a list), and once one is found, I want it to stop searching for the other substrings in the list.

To illustrate more clearly:

a = ['This', 'containing', 'many']
string1 = "This is a string containing many words"

Ask yourself: what is the first word in the bigger string string1 that matches one of the words in the list a? The answer is This, because it is the first word in string1 that has a match in the list of substrings a.

a = ['This', 'containing', 'many']
string1 = "kappa pride pogchamp containing this string this many words"

Now I've changed string1 a bit. What is the first word in string1 that matches one of the words in the list a? The answer is containing, because containing is the first word in string1 that also has a match in the list of substrings a.

And once such a match is found, I want it to stop searching for any more matches.

I tried this:

string1 = "This is a string containing many words"

a = ['This', 'containing', 'many']

if any(x in string1 for x in a):
    print(a)
else:
    print("Nothing found")

The above code prints the entire list of substrings. In other words, it only checks whether ANY of the substrings in the list a occurs in string1, and if one does, it prints the entire list rather than the substring that matched.
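To make that behaviour concrete, here is a minimal sketch with the same data. The variable name found is only illustrative. any() reduces the generator to a single boolean, so no information about which substring matched is kept:

```python
a = ['This', 'containing', 'many']
string1 = "This is a string containing many words"

# any() stops at the first successful test, but it returns only True or False
found = any(x in string1 for x in a)
print(found)  # True

# So printing a prints the whole list, not the item that matched
print(a)
```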

I've also tried looking up the string find() method, but I can't work out exactly how to use it in my case.

To word it EXACTLY, I'm looking for the first WORD in the bigger string that matches any of the words in the list, and I want to print that word.

or

to find WHICHEVER SUBSTRING (stored in a list of SUBSTRINGS) appears first in a BIGGER STRING and PRINT that particular SUBSTRING.

Answer 1

You could use a set membership check + next here.

>>> a = {'This', 'containing', 'many'}
>>> string1 = "This is a string containing many words"
>>> next((v for v in string1.split() if v in a), 'Nothing Found!')
'This'

This runs in O(N) in the length of string1: split() has to walk the whole string once, but next stops at the first matching word, and set membership tests take constant time on average.
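As a rough sketch, the same idea can be wrapped in a function and run against both strings from the question. The function name first_matching_word and its default argument are made up for this illustration:

```python
def first_matching_word(text, words, default='Nothing Found!'):
    """Return the first whitespace-separated word of text that is in words."""
    wanted = set(words)  # average constant-time membership tests
    # next() takes a single item from the generator, so the scan over
    # the split words stops at the first hit
    return next((w for w in text.split() if w in wanted), default)


a = ['This', 'containing', 'many']
print(first_matching_word("This is a string containing many words", a))
print(first_matching_word("kappa pride pogchamp containing this string this many words", a))
print(first_matching_word("no match here", a))
```

Note that this compares whole whitespace-separated tokens and is case-sensitive, so a word followed directly by punctuation (for example "many,") would not match 'many'.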

Answer 2

I think this can be done without splitting string1, by testing each element of the list against it instead. Use break to stop at the first match.

string1 = "This is a string containing many words"
a = ['This', 'containing', 'many']

for x in a:
    if x in string1:
        print(x)
        break
else:
    print("Nothing found")

With a list comprehension (this tests every element of a before taking the first match):

l = [x for x in a if x in string1]
if l:
    print(l[0])
else:
    print("Nothing found")
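One thing to watch with both variants: they walk the list a, so they report the first entry of a that occurs anywhere in string1, which is not necessarily the one that appears earliest in string1. A minimal sketch with a reordered list shows the difference. The helper name first_in_list_order and the reordered lists are made up for this illustration:

```python
def first_in_list_order(text, substrings):
    """Return the first entry of substrings found anywhere in text, or None."""
    for s in substrings:
        if s in text:
            return s
    return None


string1 = "kappa pride pogchamp containing this string this many words"

# 'many' comes first in the list, so it wins even though
# 'containing' appears earlier in string1
print(first_in_list_order(string1, ['many', 'containing']))  # many
print(first_in_list_order(string1, ['containing', 'many']))  # containing
```

If the order of appearance in string1 is what matters, the list has to be ordered accordingly, or the positions in string1 have to be compared.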

Answer 3

You can use re here.

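import re
a = ['This', 'containing', 'many']
string1 = "kappa pride pogchamp containing this string this many words"
# re.escape guards against regex metacharacters in the search words
pattern = r"\b(?:" + "|".join(map(re.escape, a)) + r")\b"
match = re.search(pattern, string1)  # None when nothing matches
print(match.group() if match else "Nothing found")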

Output:

containing


A timing comparison with the set + next approach:

import timeit

s = """
a = ['This', 'containing', 'many']
a=set(a)
string1 = 'is a string containing many words This '
c=next((v for v in string1.split() if v in a), 'Nothing Found!')
"""
s1 = """
a = ['This', 'containing', 'many']
string1 = "is a string containing many words This "
re.search(r"\b(?:"+"|".join(a)+r")\b", string1)
"""
print(timeit.timeit(stmt=s, number=1000000))
print(timeit.timeit(stmt=s1, number=1000000, setup="import re"))

Answer 4

There are two ways you could approach this. One is using the

string1.find('substring')

method, which returns the index of the first occurrence of 'substring' in string1, or -1 if 'substring' does not occur in string1. By iterating over the list of search terms a, you would get a collection of indices, one for each word in your list. The smallest value that is not -1 is the index of your first word. This is a bit more involved, but you would not have to write a loop over the words of the string yourself.
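A minimal sketch of this find()-based approach, assuming a helper named earliest_substring (a made-up name) that keeps track of the smallest index seen so far:

```python
def earliest_substring(text, substrings):
    """Return the substring whose first occurrence starts earliest in text, or None."""
    best, best_index = None, -1
    for s in substrings:
        i = text.find(s)  # -1 when s does not occur in text
        if i != -1 and (best is None or i < best_index):
            best, best_index = s, i
    return best


a = ['This', 'containing', 'many']
print(earliest_substring("This is a string containing many words", a))
print(earliest_substring("kappa pride pogchamp containing this string this many words", a))
print(earliest_substring("nothing relevant", a))
```

Keep in mind that find() matches substrings rather than whole words, so 'This' would also be found inside a longer word such as 'Thistle'.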

Another alternative would be to use

string1.split(' ')

to create a list of all of the words in the string. Then you could go through this list with a for loop and check whether each word is in your list a, stopping at the first one that is. This would be a great learning opportunity to try on your own, but let me know if I was too vague or if code would be more helpful.

Hope this helps!

Answer 5

a = ['This', 'containing', 'many']
string1 = "kappa pride pogchamp containing this string this many words"

break is the better option, but that solution has already been posted, so I wanted to show that you can do it with a slice too:

print("".join([item for item in string1.split() if item in a][:1]))

The list comprehension above is the same as:

new=[]
for item in string1.split():
    if item in a:
        new.append(item)

print("".join(new[:1]))

Answer 6

a = ['This', 'containing', 'many']
string1 = "kappa pride pogchamp containing this string this many words"

newList = string1.split(" ")
for i in newList:
    if i in a:
        print(i)
        break

This will do.

For more, read the string documentation: https://docs.python.org/2/library/string.html

6 answers

solution1
2 ACCEPTED 2017-11-10 06:46:47

solution2
1 2017-11-10 06:41:01

solution3
1 2017-11-10 06:42:52

solution4
0 2017-11-10 06:34:36

solution5
0 2017-11-10 08:12:36

solution6
-1 2017-11-10 06:34:43
