
Split on delimiter except when followed by space

I'm trying to figure out how to split a string, keeping the delimiters, except when the delimiter is followed by a space. I seem to be most of the way there, except that the character immediately following the delimiter is retained with the delimiter.

What I have so far is the following:

>>> import re
>>> s='\nm222 some stuff \n more stuff'
>>> re.split('(\n[^ ])',s)
['', '\nm', '222 some stuff \n more stuff']

The result I need is:

['', '\n', 'm222 some stuff \n more stuff']

What am I missing here? Thanks for the help.

Use a negative lookahead:

>>> import re
>>> s='\nm222 some stuff \n more stuff'
>>> re.split(r'(\n(?! ))', s)
['', '\n', 'm222 some stuff \n more stuff']

Your code,

re.split('(\n[^ ])',s)

doesn't work because (\n[^ ]) puts the "not a space" character in the same capturing group as \n, giving you \nm. (\n(?! )) checks that the next character is not a space without consuming it, so that character stays at the start of the next list item instead of becoming part of the delimiter, while still controlling where the split happens.
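To make the difference concrete, here is a small side-by-side sketch using the string from the question. The variable names consuming and lookahead are only illustrative.

```python
import re

s = '\nm222 some stuff \n more stuff'

# Consuming version: [^ ] matches the 'm', so it becomes part of the
# captured delimiter and is removed from the following item.
consuming = re.split(r'(\n[^ ])', s)

# Lookahead version: (?! ) only inspects the next character; nothing
# beyond the newline itself is consumed.
lookahead = re.split(r'(\n(?! ))', s)

print(consuming)  # ['', '\nm', '222 some stuff \n more stuff']
print(lookahead)  # ['', '\n', 'm222 some stuff \n more stuff']
```

In both cases the second \n is left alone because it is followed by a space, and joining the resulting pieces gives back the original string.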

You can read more about lookaheads on the Python regex documentation page.

Use \n(?! ). This is a negative lookahead, which ensures that the \n is not followed by a space.


If you wanted, you could even use \n(?!\s). \s matches a variety of whitespace characters, such as:

  • ' ' (a single space)
  • \t (tab)
  • \n (newline)
  • \r (carriage return)
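As a rough illustration of how the two lookaheads differ, the sketch below runs both on a made-up sample string (text is an assumption, not from the question) in which one newline is followed by a tab and another by a space.

```python
import re

# Hypothetical sample: newline + 'b', newline + tab, newline + space.
text = 'a\nb\n\tc\n d'

# (?! ) rejects only a literal space, so the newline before the tab
# still counts as a delimiter.
space_only = re.split(r'(\n(?! ))', text)

# (?!\s) rejects any whitespace, so the newline before the tab is
# left in place as well.
any_ws = re.split(r'(\n(?!\s))', text)

print(space_only)  # ['a', '\n', 'b', '\n', '\tc\n d']
print(any_ws)      # ['a', '\n', 'b\n\tc\n d']
```

Note that with (?!\s) a newline directly followed by another newline is also not treated as a delimiter, which may or may not be what you want.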

You need a lookahead assertion.

re.split('(\n(?=[^ ]))', s)
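This positive lookahead gives the same result as a negative lookahead on the string from the question, but the two are not quite equivalent. The sketch below uses a made-up string (text is an assumption) that ends in a newline to show where they diverge.

```python
import re

# Hypothetical input with a trailing newline.
text = 'one\ntwo\n'

# Negative lookahead: "not followed by a space" is also true at the
# end of the string, so the final newline is a delimiter.
negative = re.split(r'(\n(?! ))', text)

# Positive lookahead: requires an actual non-space character after the
# newline, so the final newline is not a delimiter.
positive = re.split(r'(\n(?=[^ ]))', text)

print(negative)  # ['one', '\n', 'two', '\n', '']
print(positive)  # ['one', '\n', 'two\n']
```

Which behaviour is preferable depends on whether a trailing newline should count as a delimiter in your data.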

