
Get substring from string in C# using Regular Expression

I have a string like:

Brief Exercise 1-1 Types of Businesses Brief Exercise 1-2 Forms of Organization Brief Exercise 1-3 Business Activities.

I want to split the string above using a regular expression so that it becomes:

Types of Businesses
Forms of Organization
Business Activities.

Please don't suggest splitting on 1-1, 1-2 and 1-3, because that leaves the words "Brief Exercise" between the sentences. Later on I may also have Exercise 1-1 or Problem 1-1, so I want a general regular expression.

Is there an efficient regular expression for this scenario?

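// Requires: using System.Linq; using System.Text.RegularExpressions;
// x is the input string.
var regex = new Regex(@"Brief (?:Exercise|Problem) \d+-\d+\s");
var result = string.Join("\n", regex.Split(x).Where(a => !string.IsNullOrEmpty(a)));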

The regex matches "Brief " followed by either "Exercise" or "Problem" (the ?: makes the group non-capturing), followed by a space, then one or more digits, then a "-", then one or more digits, then a whitespace character.

The second statement uses Split to break the string into an array, then Where to skip the empty entries (otherwise the result would include the empty string at the beginning; you could use Skip(1) instead of Where(a=>!string.IsNullOrEmpty(a))), and finally uses string.Join to combine the array back into a single string with \n as the separator.

You could use regex.Replace to convert directly to \n, but you would end up with a \n at the beginning that you would have to strip.
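As a rough sketch of that Replace alternative, the snippet below runs it on the sample string from the question. It is written with C# top-level statements, which is an assumption about the compiler version; the variable names replaced and result are only illustrative.

```csharp
using System;
using System.Text.RegularExpressions;

// Sample input taken from the question.
string x = "Brief Exercise 1-1 Types of Businesses Brief Exercise 1-2 Forms of Organization Brief Exercise 1-3 Business Activities.";

var regex = new Regex(@"Brief (?:Exercise|Problem) \d+-\d+\s");

// Replace every heading with a newline; the first heading leaves a leading "\n".
string replaced = regex.Replace(x, "\n");

// Strip the leading newline and remove the space left in front of each newline.
string result = replaced.TrimStart('\n').Replace(" \n", "\n");

Console.WriteLine(result);
```

The extra Replace(" \n", "\n") is there because the space that precedes each heading is not part of the match, so it would otherwise stay at the end of every line.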

--EDIT---

If the first number is always 1 and the second number is in the range 1-50ish, you could use the following regex, which accepts 0-59 for the second number:

var regex = new Regex(@"Brief (?:Exercise|Problem) 1-[1-5]?\d\s");
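To see which headings a range-limited pattern of this kind accepts, here is a small check using an optional [1-5] digit followed by one more digit. The sample headings are made up for illustration, and the snippet assumes C# top-level statements.

```csharp
using System;
using System.Text.RegularExpressions;

var limited = new Regex(@"Brief (?:Exercise|Problem) 1-[1-5]?\d\s");

// Hypothetical sample headings, each followed by a space as the pattern requires.
Console.WriteLine(limited.IsMatch("Brief Exercise 1-7 "));   // True
Console.WriteLine(limited.IsMatch("Brief Problem 1-59 "));   // True
Console.WriteLine(limited.IsMatch("Brief Exercise 1-60 "));  // False: 6 is outside [1-5]
Console.WriteLine(limited.IsMatch("Brief Exercise 2-7 "));   // False: first number must be 1
```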

This regular expression will match on "Brief Exercise 1-" followed by a digit and an optional second digit:

@"Brief Exercise 1-\d\d?"

Update:

Since you might have "Problem" as well, an alternation between Exercise and Problem is also needed (using a non-capturing group):

@"Brief (?:Exercise|Problem) 1-\d\d?"
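A possible way to put this pattern to work is sketched below with Regex.Split on the string from the question. Making "Brief " optional is my own assumption, based on the question's remark that plain "Exercise 1-1" or "Problem 1-1" headings may appear later; the names input, pattern and titles are illustrative, and the snippet assumes C# top-level statements.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Sample input taken from the question.
string input = "Brief Exercise 1-1 Types of Businesses Brief Exercise 1-2 Forms of Organization Brief Exercise 1-3 Business Activities.";

// Assumption: "Brief " is optional, so "Exercise 1-1" and "Problem 1-1" also count as headings.
string pattern = @"(?:Brief )?(?:Exercise|Problem) 1-\d\d?";

string[] titles = Regex.Split(input, pattern)
    .Select(part => part.Trim())      // drop the spaces left around each title
    .Where(part => part.Length > 0)   // drop the empty entry before the first heading
    .ToArray();

foreach (string title in titles)
    Console.WriteLine(title);
```

Because both groups are non-capturing, Regex.Split returns only the text between the headings. One trade-off of the optional prefix is that body text containing something like "Exercise 1-2" would also be treated as a heading.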

Why don't you do it the easy way? I mean, if the regular part is "Brief Exercise #-#", replace it with some delimiter character and then split the resulting string to obtain what you want.

If you do it otherwise you will always have to take care of special cases.

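// Requires: using System; using System.Text.RegularExpressions;
string pattern = @"Brief Exercise \d+-\d+";
Regex reg = new Regex(pattern);
// Replace each heading with a delimiter that does not occur in the text, then split on it.
string replaced = reg.Replace(yourstring, "|");
// Entries keep their surrounding spaces; call Trim() on each one if needed.
string[] results = replaced.Split(new[] { '|' }, StringSplitOptions.RemoveEmptyEntries);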


 