
Match a pipe character in the middle of a string with a Python regex

I am trying to match a pipe character in a string using a Python regex and I can't seem to get it to match. I've boiled it down to a simplified version.

Let's say I am looking for the sequence z|a in a string. Here are some possible regexes and the results:

>>> import re
>>> re.match(r'|', 'xyz|abc')
<_sre.SRE_Match object at 0x2d9a850>
>>> re.match(r'z|', 'xyz|abc')
<_sre.SRE_Match object at 0x2d9a780>
>>> re.match(r'|a', 'xyz|abc')
<_sre.SRE_Match object at 0x2d9a850>
>>> re.match(r'z|a', 'xyz|abc')
>>> re.match(r'z\|a', 'xyz|abc')
>>> re.match(r'z\\|a', 'xyz|abc')
>>> re.match(r'z\\\|a', 'xyz|abc')
>>> re.match(r'z[|]a', 'xyz|abc')
>>> 

So I can match with |, |a and z|, but I can't find a way to match z|a. Any ideas?

re.match() is looking for a match at the start of the string. Use re.search() instead.

The patterns you have that match are all matching the empty string. An unescaped | means alternation, so r'|' is "empty string or empty string", r'z|' is "z or empty string", and r'|a' is "empty string or a". Each of those has an empty alternative, so each will match at the start of any string.
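To make that visible, here is a small sketch that prints what each of those patterns actually matched. The variable names are only for illustration; the string and patterns are the ones from the question.

```python
import re

text = 'xyz|abc'

# An unescaped | is alternation, so each of these patterns has an empty
# alternative that succeeds at position 0 without consuming anything.
for pattern in (r'|', r'z|', r'|a'):
    m = re.match(pattern, text)
    print(repr(pattern), repr(m.group()), m.span())

# With the pipe escaped it is a literal character, and search() scans the
# whole string instead of only trying position 0.
m = re.search(r'z\|a', text)
print(repr(m.group()), m.span())
```

The first three lines of output all show an empty match with span (0, 0), which is why they looked like successes in the question.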

>>> re.match('z\\|a', 'xyz|abc')
>>> re.search('z\\|a', 'xyz|abc')
<_sre.SRE_Match object at 0x02BF2BB8>
>>> re.search(r'z\|a', 'xyz|abc')
<_sre.SRE_Match object at 0x02BF2BF0>

More generally, you can call re.escape() on a literal string that you need to embed in a more complex regular expression, so you don't have to work out how many backslashes are needed to escape things.
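As a rough illustration of re.escape(), the sketch below embeds the literal z|a in a larger pattern. The surrounding [a-z]+ parts are a made-up example of a "more complex" expression, not something from the question.

```python
import re

literal = 'z|a'                  # text that should be matched verbatim
escaped = re.escape(literal)     # the pipe comes back backslash-escaped

# Hypothetical larger pattern: lowercase letters around the literal.
pattern = r'[a-z]+' + escaped + r'[a-z]+'

m = re.search(pattern, 'xyz|abc')
print(escaped)
print(m.group())
```

Because the escaping is done by the library, the same approach works for literals containing other metacharacters such as . or *.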

You can also get re.match() to match in the middle of the string by letting the pattern itself consume the leading characters:

import re

myPattern = "how"

re.match('.*(%s)' % re.escape(myPattern), 'Hello, how are you ?')

By default, . matches any character except a newline, so the leading .* tells match() to skip as many characters as needed before your pattern. The text you were looking for is then available as group 1.
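For comparison, this sketch runs the prefix approach next to re.search() on the same input. It assumes a lazy .*? prefix so that the first occurrence is found; the names my_pattern, m1 and m2 are illustrative only.

```python
import re

my_pattern = 'how'
text = 'Hello, how are you ?'

# match() is anchored at the start, so a lazy prefix skips the leading text.
m1 = re.match(r'.*?(%s)' % re.escape(my_pattern), text)

# search() finds the same text without needing the prefix or a group.
m2 = re.search(re.escape(my_pattern), text)

print(m1.group(1), m1.start(1))
print(m2.group(), m2.start())
```

Both report the same substring and start offset here. One difference to keep in mind: since . does not match a newline by default, the prefix form will not skip past line breaks unless re.DOTALL is passed, whereas search() has no such limitation.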
