
How to split string preserving spaces and any number of \n characters

I want to split a string into words and collect them, following these rules:

1) If the string contains '\n', it should be treated as a separate '\n' word.
2) If the string contains more than one '\n', each one should be treated as its own '\n' word.
3) No spaces should be removed from the string. The only exception is a space between two \n characters, which can be ignored.

PS: I have tried a lot with string split. I first split on the \n characters and built a collection, but the downside is that when two \n characters appear consecutively, I am unable to add two dummy words to the collection. Any help would be greatly appreciated.


Is there any way to do this using regex?

Looks like homework. As such, read up on \b.

That should set you in the right direction.

Read up on zero-width assertions. With them you can define a split position between, e.g., \s and \S without actually matching either adjacent character.

edit: Here's another question where the OP asked about those constructs.

Split with a regex like this:

(?<=[\S\n])(?=\s)

Something like:

var substrings = Regex.Split(input, @"(?<=[\S\n])(?=\s)");

This will not remove any spaces at all, but removing them was not required, so that should be fine.
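To see what that split actually produces, here is a minimal sketch of a complete program. The class name, the helper name SplitKeepingWhitespace, and the sample input are made up for illustration; only the pattern and the Regex.Split call come from the answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SplitDemo
{
    // Hypothetical helper name; it only wraps the Regex.Split call shown above.
    public static string[] SplitKeepingWhitespace(string input)
    {
        // Split at every position that follows a non-whitespace character or a \n
        // and precedes a whitespace character. The match is zero-width, so no
        // characters are consumed.
        return Regex.Split(input, @"(?<=[\S\n])(?=\s)");
    }

    public static void Main()
    {
        string[] parts = SplitKeepingWhitespace("one two\n\nthree");
        foreach (string part in parts)
        {
            // Show newlines as a visible \n so each piece prints on one line.
            Console.WriteLine("[" + part.Replace("\n", "\\n") + "]");
        }
        // Prints:
        // [one]
        // [ two]
        // [\n]
        // [\nthree]
    }
}
```

Note that whitespace stays attached to the front of the word that follows it, so the last \n of a run ends up glued to the next word ("\nthree") rather than standing alone. If every \n has to be its own element, the pieces would need a further pass.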

If you really want the spaces between \n characters to be removed, you could split with something like:

(?<=[\S\n])(?=\s)(?:[ \t]+(?=\n))?
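As a rough sketch of how this second pattern behaves, the program below runs it on an input with a space between two \n characters. The class name, the helper name SplitDroppingSpacesBeforeNewline, and the sample input are assumptions for illustration only.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SplitTrimDemo
{
    // Hypothetical helper name; uses the second pattern from the answer.
    public static string[] SplitDroppingSpacesBeforeNewline(string input)
    {
        // The optional group consumes spaces/tabs that directly precede a \n,
        // so they become part of the separator and are discarded.
        return Regex.Split(input, @"(?<=[\S\n])(?=\s)(?:[ \t]+(?=\n))?");
    }

    public static void Main()
    {
        foreach (string part in SplitDroppingSpacesBeforeNewline("one\n \ntwo"))
        {
            // Show newlines as a visible \n so each piece prints on one line.
            Console.WriteLine("[" + part.Replace("\n", "\\n") + "]");
        }
        // Prints:
        // [one]
        // [\n]
        // [\ntwo]
    }
}
```

One thing to be aware of: the optional group drops spaces and tabs in front of any \n, not only those sitting between two \n characters. For example, "one \ntwo" comes back as "one" and "\ntwo", with the trailing space after "one" removed.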
