Regular expression splitting by a specific pattern

I have a string str='\n1. AA \n2. BB\n3.\n4. CC'. I want to split it on the following pattern: a newline character, followed by a digit and a period, followed by one or more spaces.

I am hoping to get the answer ['', 'AA ', 'BB\n3.', 'CC'].

If I use re.split('\n[0-9]\.\s+', str), I get the result:

['', 'AA ', 'BB', '4. CC']

What am I doing wrong?

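The \s+ at the end matches any whitespace, including newline characters, so after "\n3." it consumes the newline that follows and the split happens there. If you don't want newlines to count as the trailing whitespace, change it to [^\S\n]+ (any whitespace character except a newline):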

>>> import re
>>> s = '\n1. AA \n2. BB\n3.\n4. CC'
>>> re.split(r'\n[0-9]\.[^\S\n]+', s)
['', 'AA ', 'BB\n3.', 'CC']
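
To see the difference side by side, here is a minimal sketch that runs both patterns on the string from the question. The variable names (text, with_s, without_newline, space_or_tab) are made up for illustration, and the [ \t]+ variant is not part of the answer above; it is an extra option that assumes spaces and tabs are the only separators you care about.

```python
import re

text = '\n1. AA \n2. BB\n3.\n4. CC'

# \s+ also matches '\n', so "\n3.\n" counts as a separator.
with_s = re.split(r'\n[0-9]\.\s+', text)

# [^\S\n]+ matches whitespace except newline, so "\n3.\n" is left alone.
without_newline = re.split(r'\n[0-9]\.[^\S\n]+', text)

# Assumed alternative: list the allowed separators explicitly.
space_or_tab = re.split(r'\n[0-9]\.[ \t]+', text)

print(with_s)           # ['', 'AA ', 'BB', '4. CC']
print(without_newline)  # ['', 'AA ', 'BB\n3.', 'CC']
print(space_or_tab)     # ['', 'AA ', 'BB\n3.', 'CC']
```

Writing the patterns as raw strings (r'...') keeps Python from interpreting the backslashes itself, so sequences such as \. and \S reach the re module unchanged.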
