I have the following text:
has helped discover and mentor such </br>
New York Times bestselling authors as Brandon Sanderson </br>
(Mistborn), James Dashner (The Maze Runner), and Stephenie
I am taking the last 3 words of the first line and the first 3 words of the last line, and using a regex to find the text between them. I am using the following regex in C# code:
string matchedText = "";
string RegexPattren = preLine + "[\\w\\W\\S\\s\\s\\D':;\"<>,.?]*" + postLine;
matchedText = Regex.Match(stBuilder.ToString(), RegexPattren).Value;
matchedText = preLine.Equals("") ? matchedText : matchedText.Replace(preLine, "");
matchedText = postLine.Equals("") ? matchedText : matchedText.Replace(postLine, "");
string[] MatchedLines = Regex.Split(matchedText, "</br>").Where(x => !string.IsNullOrEmpty(x.Trim())).ToArray();
Here RegexPattren has the following value:
and mentor such [\w\W\S\s\s\D':;"<>,.?]* James Dashner
The above code works fine, and the matched result is:
and mentor such </br>New York Times bestselling authors as Brandon Sanderson </br>(Mistborn), James Dashner
The problem occurs when the words contain brackets, as in the pattern below. In that case the regex does not match any text.
and mentor such [\w\W\S\s\s\D':;"<>,.?]* (Mistborn), James Dashner
How can I match a line that has text inside brackets before or after the regex pattern in C#?
You'll have to escape the parentheses, like this:
and mentor such [\w\W\S\s':;"<>,.?]*\(Mistborn\), James Dashner
That'll make it match the literal ( and ).
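To illustrate the difference, here is a minimal sketch that runs the unescaped and the escaped pattern against the sample text. The class name EscapedParenthesesDemo and its members are made up for this example, and the sample text is assumed to contain real newlines after each </br>.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EscapedParenthesesDemo
{
    // Sample text from the question (assumed to be joined with newlines).
    public const string Text =
        "has helped discover and mentor such </br>\n" +
        "New York Times bestselling authors as Brandon Sanderson </br>\n" +
        "(Mistborn), James Dashner (The Maze Runner), and Stephenie";

    // The character class from the corrected pattern above.
    const string AnyChars = "[\\w\\W\\S\\s':;\"<>,.?]*";

    // Unescaped: "(Mistborn)" is a capturing group, so the regex looks for
    // "Mistborn, James Dashner" without the parentheses.
    public static readonly string Unescaped =
        "and mentor such" + AnyChars + "(Mistborn), James Dashner";

    // Escaped: "\(" and "\)" match the literal parentheses.
    public static readonly string Escaped =
        "and mentor such" + AnyChars + "\\(Mistborn\\), James Dashner";

    public static void Main()
    {
        Console.WriteLine(Regex.IsMatch(Text, Unescaped)); // False
        Console.WriteLine(Regex.IsMatch(Text, Escaped));   // True
    }
}
```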
And note that your regex had a space before (Mistborn) which doesn't exist in the text. It's preceded by a newline. I removed the space, but you could also change it to a \\s, which matches both space and newline.
And lastly, \\D matches any non-digit character, which is already covered because \\w and \\W together match every character. Actually, several of the characters in the class could be removed. If you set RegexOptions.Singleline, so that . also matches newlines, you would probably be OK with
and mentor such .*\(Mistborn\), James Dashner
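As a rough sketch of what RegexOptions.Singleline changes, the snippet below tries the simplified pattern with and without the option. SinglelineDemo and its members are hypothetical names, and the sample text is assumed to contain real newlines after each </br>.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SinglelineDemo
{
    // Sample text from the question (assumed to be joined with newlines).
    public const string Text =
        "has helped discover and mentor such </br>\n" +
        "New York Times bestselling authors as Brandon Sanderson </br>\n" +
        "(Mistborn), James Dashner (The Maze Runner), and Stephenie";

    // The simplified pattern: "." replaces the long character class.
    public const string Pattern = @"and mentor such .*\(Mistborn\), James Dashner";

    public static bool Matches(RegexOptions options) =>
        Regex.IsMatch(Text, Pattern, options);

    public static void Main()
    {
        // By default "." stops at "\n", so the match cannot span lines.
        Console.WriteLine(Matches(RegexOptions.None));       // False
        // With Singleline, "." also matches "\n".
        Console.WriteLine(Matches(RegexOptions.Singleline)); // True
    }
}
```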
Check it out here at regex101.
PS. There's a .NET method to escape regexes, Regex.Escape, but that complicates having actual regex patterns in there.
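One possible way around that complication is to pass only the literal parts (preLine and postLine) through Regex.Escape and concatenate the actual pattern between them. The sketch below does this. The Between helper is a hypothetical name, and the lazy capturing group (.*?) with RegexOptions.Singleline is swapped in for the character class and the Replace calls in the question, so treat it as an alternative rather than the original approach.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EscapedBoundariesDemo
{
    // Returns the text between preLine and postLine, or "" if there is no match.
    public static string Between(string text, string preLine, string postLine)
    {
        // Only the literal ends are escaped; the middle stays a real pattern.
        string pattern = Regex.Escape(preLine) + "(.*?)" + Regex.Escape(postLine);
        Match m = Regex.Match(text, pattern, RegexOptions.Singleline);
        return m.Success ? m.Groups[1].Value : "";
    }

    public static void Main()
    {
        // Sample text from the question (assumed to be joined with newlines).
        string text =
            "has helped discover and mentor such </br>\n" +
            "New York Times bestselling authors as Brandon Sanderson </br>\n" +
            "(Mistborn), James Dashner (The Maze Runner), and Stephenie";

        string between = Between(text, "and mentor such", "(Mistborn), James Dashner");
        Console.WriteLine(between);
    }
}
```

Because the group captures only the middle part, there is no need to remove preLine and postLine from the match afterwards.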