I'm trying to figure out how to split a string, keeping the delimiters, except when the delimiter is followed by a space. I seem to be most of the way there, except that the character immediately following the delimiter is retained with the delimiter.
What I have so far is the following:
>>> s='\nm222 some stuff \n more stuff'
>>> re.split('(\n[^ ])',s)
['', '\nm', '222 some stuff \n more stuff']
The result I need is
['', '\n', 'm222 some stuff \n more stuff']
What am I missing here? Thanks for the help.
Use a negative lookahead:
>>> s='\nm222 some stuff \n more stuff'
>>> re.split(r'(\n(?! ))', s)
['', '\n', 'm222 some stuff \n more stuff']
Your code, re.split('(\n[^ ])', s), doesn't work because (\n[^ ]) puts the "not a space" character in the same capturing group as \n, giving you \nm. (\n(?! )) avoids consuming the "not a space" character: the lookahead only checks it, so it stays at the start of the next piece of the result instead of ending up in the captured delimiter.
You can read more about lookaheads on the Python regex documentation page.
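To make the "consuming" versus "only checking" difference visible, here is a small sketch using the string from the question. It compares the span each pattern actually matches; the variable names are just for illustration.

```python
import re

s = '\nm222 some stuff \n more stuff'

# Character class: [^ ] must match a real character, so the match
# covers the newline AND the character after it.
consuming = re.search(r'\n[^ ]', s)

# Negative lookahead: (?! ) only inspects the next character,
# so the match covers the newline alone.
lookahead = re.search(r'\n(?! )', s)

print(consuming.span())  # (0, 2)
print(lookahead.span())  # (0, 1)

# The same difference shows up in what re.split captures.
print(re.split(r'(\n[^ ])', s))   # ['', '\nm', '222 some stuff \n more stuff']
print(re.split(r'(\n(?! ))', s))  # ['', '\n', 'm222 some stuff \n more stuff']
```

Because the lookahead is zero-width, the "m" is never part of the delimiter and stays with the text that follows.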
Use \n(?! ). This is a negative lookahead; it ensures the \n is not followed by a space.
If you wanted, you could even use \n(?!\s). \s matches a variety of whitespace characters, including:
' ' (a single space)
\t (tab)
\n (newline)
\r (carriage return)
You need a lookahead assertion:
re.split('(\n(?=[^ ]))', s)
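The three patterns suggested in the answers give the same result on the string from the question, but they are not interchangeable in general. The sketch below runs them on a made-up test string (t is an assumption for illustration, not from the question) that has a newline followed by a tab and a newline at the very end.

```python
import re

# Hypothetical test string: newline + letter, newline + space,
# newline + tab, and a trailing newline at the end.
t = 'a\nb\n c\n\td\n'

# Negative lookahead for a space: also matches at end of string,
# because "no space follows" is true when nothing follows.
neg_space = re.split(r'(\n(?! ))', t)

# Positive lookahead for a non-space: needs an actual character,
# so the trailing newline is NOT treated as a delimiter.
pos_nonspace = re.split(r'(\n(?=[^ ]))', t)

# Negative lookahead for any whitespace: a newline followed by
# a tab (or another newline) is not split on either.
neg_ws = re.split(r'(\n(?!\s))', t)

print(neg_space)     # ['a', '\n', 'b\n c', '\n', '\td', '\n', '']
print(pos_nonspace)  # ['a', '\n', 'b\n c', '\n', '\td\n']
print(neg_ws)        # ['a', '\n', 'b\n c\n\td', '\n', '']
```

Which one is right depends on whether a trailing newline and other kinds of leading whitespace should count as continuation lines in your data.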