Matching strings with re.match doesn't work

From this link I used the following code:

import re

my_other_string = 'the_boat_has_sunk'
my_list = ['car', 'boat', 'truck']
my_list = re.compile(r'\b(?:%s)\b' % '|'.join(my_list))
if re.match(my_list, my_other_string):
    print('yay')

However, it doesn't work. I tried printing my_list after re.compile, and it prints this:

re.compile('\\b(?:car|boat|truck)\\b')

What am I doing wrong?

re.match only tries to match the regular expression at the beginning of the input string, so this would only work for strings that begin with one of the words from my_list.

re.search, on the other hand, scans the entire string for a match to the regular expression.

import re

my_list = ['car', 'boat', 'truck']
my_other_string = 'I am on a boat'

my_list = re.compile(r'\b(?:%s)\b' % '|'.join(my_list))
if re.search(my_list, my_other_string):  # changed function call here
    print('yay')

For the string "I am on a boat", re.match will fail because the string begins with "I", which doesn't match the regular expression. re.search will also fail to match at the first character, but it will keep going through the string until it gets to "boat", at which point it will have found a match.

If we instead use the string "boat is what I am on", re.match and re.search will both match, because the string now starts with a match. Note that the pattern is case-sensitive: a capitalized "Boat" would not match unless the pattern is compiled with re.IGNORECASE.
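To make the difference concrete, here is a small sketch that runs both functions against the same compiled pattern. The helper name match_and_search and the variable names are made up for illustration and are not part of the original answer.

```python
import re

words = ['car', 'boat', 'truck']
pattern = re.compile(r'\b(?:%s)\b' % '|'.join(words))

def match_and_search(text):
    """Return (matched_at_start, found_anywhere) for the given text."""
    return (pattern.match(text) is not None,
            pattern.search(text) is not None)

middle = match_and_search('I am on a boat')         # word is not at the start
start = match_and_search('boat is what I am on')    # word is at the start
capital = match_and_search('Boat is what I am on')  # case differs from the list

# Compiling the same pattern with re.IGNORECASE lets the capitalized form match.
pattern_ci = re.compile(pattern.pattern, re.IGNORECASE)
capital_ci = pattern_ci.match('Boat is what I am on') is not None

print(middle)      # (False, True)
print(start)       # (True, True)
print(capital)     # (False, False)
print(capital_ci)  # True
```

Calling pattern.match and pattern.search on the compiled object is equivalent to passing the compiled pattern to re.match and re.search as the answer does.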

This is not a regular sentence: the words are joined with underscores. Since you are just checking whether a word is present, you may either remove \b (it only matches at a word boundary, and _ is a word character, so there is no boundary between _ and boat) or add alternatives:

import re
my_other_string = 'the_boat_has_sunk'
my_list = ['car', 'boat', 'truck']
my_list = re.compile(r'(?:\b|_)(?:%s)(?=\b|_)' % '|'.join(my_list))
if re.search(my_list, my_other_string):
    print('yay')

See IDEONE demo
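As a rough illustration of why \b fails here, the sketch below searches for a single word with and without the underscore alternatives. The single-word patterns and the variable names are simplifications chosen for this example, not taken from the demo.

```python
import re

text = 'the_boat_has_sunk'

# \b needs a word character on one side and a non-word character (or the
# edge of the string) on the other. '_' counts as a word character, so
# there is no boundary between '_' and 'b'.
strict = re.search(r'\bboat\b', text)

# Accepting '_' as an alternative delimiter on both sides finds the word.
relaxed = re.search(r'(?:\b|_)boat(?=\b|_)', text)

print(strict)           # None
# The leading '_' is consumed by the match; the trailing one is not,
# because it is only checked inside a lookahead.
print(relaxed.group())  # _boat
```

If only a yes/no answer is needed, the extra underscore in the matched text does not matter.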

EDIT :

Since you say it has to be true if one of the words in the list is in the string, not only as a separate word, but it mustn't match if, for example, boathouse is in the string, I suggest first replacing non-word characters and _ with spaces, and then using the regex you had with \b:

import re
my_other_string = 'the_boathouse_has_sunk'
my_list = ['car', 'boat', 'truck']
my_other_string = re.sub(r'[\W_]', ' ', my_other_string)
my_list = re.compile(r'\b(?:%s)\b' % '|'.join(my_list))
if re.search(my_list, my_other_string):
    print('yay')

This will not print yay, but if you remove house, it will.

See IDEONE Demo 2
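The same idea could be wrapped in a small function so both cases can be checked side by side. This is only a sketch: the name contains_any_word is hypothetical, and the re.escape call is an addition that assumes the list may contain regex metacharacters.

```python
import re

def contains_any_word(text, words):
    """Return True if any of `words` occurs in `text` as a whole word,
    treating '_' and other non-word characters as separators."""
    # Turn every non-word character and '_' into a space, as described above.
    normalized = re.sub(r'[\W_]', ' ', text)
    # re.escape is an extra precaution for entries such as 'c++'.
    alternatives = '|'.join(re.escape(w) for w in words)
    return re.search(r'\b(?:%s)\b' % alternatives, normalized) is not None

words = ['car', 'boat', 'truck']
print(contains_any_word('the_boat_has_sunk', words))       # True
print(contains_any_word('the_boathouse_has_sunk', words))  # False
```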


 