Find single or multiple words in string

I'm trying to write a script that finds a single word, or a phrase made up of several words, in a given string. I found this answer, which looks very much like what I need, but I can't really understand how it works.

Using the code provided in the answer mentioned above, I have this:

import re

def findWholeWord(w):
    return re.compile(r'\b({0})\b'.format(w), flags=re.IGNORECASE).search

st1 = 'those who seek shall find'
st2 = 'swordsmith'

print findWholeWord('seek')(st1)          # -> <match object>
print findWholeWord('these')(st1)         # -> None
print findWholeWord('THOSE')(st1)         # -> <match object>
print findWholeWord('seek shall')(st1)    # -> <match object>
print findWholeWord('word')(st2)          # -> None

This function returns either something like <_sre.SRE_Match object at 0x94393e0> (when the word(s) were found) or None (when they weren't). I'd like it to return True or False instead, depending on whether the word(s) were found. Since I'm not clear on how the function works, I'm not sure how to do that.

I've never seen a function called with two sets of arguments (?), i.e. findWholeWord(word)(string). What is this doing?

re is the regular expression module. findWholeWord compiles a regular expression object that matches the word (pattern) you pass it, then returns a function: the search method of that regular expression object. Notice the absence of '()' after search in the return statement.

import re
def findWholeWord(w):
    return re.compile(r'\b({0})\b'.format(w), flags=re.IGNORECASE).search

>>> search = findWholeWord('seek')
>>> print search
<built-in method search of _sre.SRE_Pattern object at 0x032F70A8>
>>>
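As for what the pattern itself does: \b matches at a word boundary, that is, the position between a word character (letter, digit or underscore) and a non-word character or the edge of the string. The short sketch below (written for Python 3, with sample strings chosen only for illustration) shows why 'word' is not found inside 'swordsmith' but is found when it stands alone.

```python
import re

# Same pattern shape as findWholeWord('word') builds
pattern = re.compile(r'\b(word)\b', flags=re.IGNORECASE)

# "word" sits between "s" and "s" here, so there is no boundary on either side
inside_longer_word = pattern.search('swordsmith')

# Here "WORD" has a space before it and a comma after it: a boundary on both sides
standalone = pattern.search('a WORD, by itself')

print(inside_longer_word is None)   # True
print(standalone.group(1))          # WORD
```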

The search method returns a match object if the pattern is found, or None if it is not. Match objects evaluate to True in a boolean context, and None evaluates to False.

>>> search = findWholeWord('seek')
>>> print search
<built-in method search of _sre.SRE_Pattern object at 0x032F70A8>
>>> 
>>> match = search('this string contains seek')
>>> print match, bool(match)
<_sre.SRE_Match object at 0x032FDC20> True
>>> match = search('this string does not contain the word you are looking for')
>>> print match, bool(match)
None False
>>>

In your example, findWholeWord('seek')(st1) calls the search method of a regular expression that matches 'seek' and passes it the string st1.

>>> st1 = 'those who seek shall find'
>>> match = search(st1)
>>> print match, bool(match)
<_sre.SRE_Match object at 0x032FDC20> True
>>> match = findWholeWord('seek')(st1)
>>> print match, bool(match)
<_sre.SRE_Match object at 0x032FDC60> True
>>> 
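The f(a)(b) shape is not specific to re: any function that returns a callable can be called this way. A minimal sketch, using a hypothetical helper make_prefix_checker (not part of the original code) that returns a string's bound startswith method the same way findWholeWord returns a pattern's bound search method:

```python
def make_prefix_checker(text):
    # Hypothetical helper for illustration: return the bound method itself.
    # No parentheses after startswith, so it is not called here.
    return text.startswith

# Two steps: first get the function, then call it
checker = make_prefix_checker('those who seek shall find')
print(checker('those'))                            # True

# One expression: the second pair of parentheses calls whatever was returned
print(make_prefix_checker('swordsmith')('word'))   # False
```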
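def containsWholeWord(w, string):
    # return is only valid inside a function, so the check is wrapped in one
    # (the wrapper name is illustrative)
    if findWholeWord(w)(string) is None:
        return False
    else:
        return True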

Or:

if findWholeWord('seek')(st1):  # a match object evaluates to True
    pass  # do something
else:
    pass  # there is no search match, do something else

Or:

import re

def findWholeWord(w, string):
    pattern = re.compile(r'\b({0})\b'.format(w), flags=re.IGNORECASE)
    # search returns a match object or None; bool() maps these to True/False
    return bool(pattern.search(string))
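One caveat with all the versions above: the word is inserted into the pattern unescaped, so a search term such as 'a.b' is treated as a regular expression and would also match 'axb'. If the input should be taken literally, a possible variation is to pass it through re.escape first. This is a sketch under that assumption; the name contains_whole_word is made up here and is not from the original answer.

```python
import re

def contains_whole_word(word, text):
    """Return True if `word` occurs in `text` as a whole word, ignoring case."""
    # Assumption: `word` is literal text, so regex metacharacters are escaped.
    pattern = re.compile(r'\b{0}\b'.format(re.escape(word)), flags=re.IGNORECASE)
    return pattern.search(text) is not None

st1 = 'those who seek shall find'
print(contains_whole_word('seek shall', st1))    # True
print(contains_whole_word('word', 'swordsmith')) # False
print(contains_whole_word('a.b', 'axb here'))    # False
```

Note that \b still requires a word character at each end of the search term, so a term that starts or ends with punctuation (for example 'c++') may not match the way you expect.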
