
Python regex: split by repeated punctuation marks

How????are!!!you

I'd like to split the string into ['How', 'are', 'you'].

I've tried the following regex:

\?*|\!*

which does not work. However, the following regex works:

\?+|\!+

Can anyone explain this to me?

A character class matches both marks, and + keeps the pattern from matching the empty string:

>>> re.split(r'[?!]+', 'How????are!!!you')
['How', 'are', 'you']
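
If the input can also start or end with punctuation, splitting leaves empty strings at the edges. A minimal sketch of one way around that, assuming you only want the words: match the runs of non-punctuation instead of splitting on the separators. The sample string here is made up for illustration.

```python
import re

# Hypothetical input with punctuation at both ends.
s = '??How????are!!!you!'

# Splitting on separators leaves empty strings at the edges.
parts = re.split(r'[?!]+', s)
print(parts)  # ['', 'How', 'are', 'you', '']

# Matching runs of anything that is not ? or ! returns only the words.
words = re.findall(r'[^?!]+', s)
print(words)  # ['How', 'are', 'you']
```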

As for why \?*|\!* doesn't work, just look at what re.findall finds:

>>> re.findall(r'\?*|\!*', 'How????are!!!you')
['', '', '', '????', '', '', '', '', '', '', '', '', '', '']

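The alternation always takes the first branch if possible. Since \?* can match the empty string at any position, it succeeds with an empty match where the run of ! begins, and in the output above \!* never gets to match. The re.split of that Python version splits only on nonempty matches, so you end up splitting by ? but not by !. From Python 3.7 on, re.split also splits on empty matches, so a pattern that can match the empty string gives different results there. With \?+|\!+ neither branch can match the empty string, so both runs are split on.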
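
The branch order can be checked directly by anchoring a match at the first '!'. This is only a sketch: the variable names are illustrative, and it relies on the pos argument of a compiled pattern's match method.

```python
import re

s = 'How????are!!!you'
bang = s.index('!')  # position of the first '!'

star = re.compile(r'\?*|!*')
plus = re.compile(r'\?+|!+')

# At the first '!', the first branch \?* succeeds with an empty match,
# so the second branch !* is not used for this match.
star_at_bang = star.match(s, bang).group()

# With +, \?+ fails at '!', so the regex falls through to !+.
plus_at_bang = plus.match(s, bang).group()

print(repr(star_at_bang))  # ''
print(repr(plus_at_bang))  # '!!!'
print(plus.split(s))       # ['How', 'are', 'you']
```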

