var TextToFind = "The Term is Fixed";
var TexToSearch = "<head>The</head> Term is Fixed";
The expression currently used is (?mi)\bThe Term is Fixed\b.
How can we modify this pattern so that it also finds the text when it contains tags?
One option is to strip the tags first and then search the resulting plain text:
string str = "<head>The</head> Term is Fixed";
string textWithoutTags = Regex.Replace(str, "<[^>]*>", string.Empty);
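The tag-stripping idea can be combined with the original whole-phrase pattern. The following is only a minimal sketch: the class name TagStrippingSearch and the method ContainsPhrase are hypothetical names chosen for illustration, and it assumes the phrase starts and ends with a word character so that \b behaves as expected.

```csharp
using System.Text.RegularExpressions;

public static class TagStrippingSearch
{
    // Hypothetical helper (not from the original post): removes anything
    // that looks like a tag, then runs the whole-phrase pattern on the
    // remaining plain text.
    public static bool ContainsPhrase(string input, string phrase)
    {
        // Same replacement as above: drop every <...> run.
        string textWithoutTags = Regex.Replace(input, "<[^>]*>", string.Empty);

        // Escape the phrase so regex metacharacters in it are matched literally.
        string pattern = @"\b" + Regex.Escape(phrase) + @"\b";

        // IgnoreCase and Multiline correspond to the (?mi) flags in the question.
        return Regex.IsMatch(textWithoutTags, pattern,
            RegexOptions.IgnoreCase | RegexOptions.Multiline);
    }
}
```

Note that this only tells you whether the phrase occurs. Any match positions refer to the stripped text, not to the original string with tags.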
To match every occurrence in the original text, with either tags or whitespace between the words, you may dynamically construct a regex like
The(?>\s*<[^>]*>\s*|\s+)Term(?>\s*<[^>]*>\s*|\s+)is(?>\s*<[^>]*>\s*|\s+)Fixed
where each space is replaced with the (?>\s*<[^>]*>\s*|\s+) pattern, an atomic group that matches either
- \s*<[^>]*>\s* - a <, then 0 or more chars other than >, and then a >, enclosed with 0 or more whitespaces, or
- \s+ - 1 or more whitespaces.
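See the C# demo:
using System;
using System.Linq;
using System.Text.RegularExpressions;
var TextToFind = "The Term is Fixed";
var TexToSearch = "<head>The</head> Term is Fixed\n<head>The</head> Term <span>is</span> Fixed";
// Join the words with a separator that matches one tag (optionally surrounded by whitespace) or 1+ whitespaces
var regex = string.Join(@"(?>\s*<[^>]*>\s*|\s+)", TextToFind.Split());
var result = Regex.Matches(TexToSearch, regex).Cast<Match>().Select(x => x.Value);
foreach (var s in result)
    Console.WriteLine(s);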
Output:
The</head> Term is Fixed
The</head> Term <span>is</span> Fixed
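If the search phrase can contain regex metacharacters, or if the word boundaries and case-insensitivity of the original (?mi)\b...\b expression should be kept, the pattern construction could be hardened roughly as follows. This is a sketch under assumptions, not part of the original answer: TagTolerantSearch, BuildPattern and FindAll are hypothetical names, and the phrase is assumed to start and end with a word character.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class TagTolerantSearch
{
    // Separator from the answer above: one tag with optional whitespace
    // around it, or one or more whitespace characters.
    private const string Separator = @"(?>\s*<[^>]*>\s*|\s+)";

    // Hypothetical helper: builds the tag-tolerant pattern for a phrase.
    public static string BuildPattern(string phrase)
    {
        var words = phrase
            // A null separator array means "split on whitespace".
            .Split((char[])null, StringSplitOptions.RemoveEmptyEntries)
            // Escape each word so characters like '.' or '(' match literally.
            .Select(w => Regex.Escape(w));

        // \b assumes the phrase begins and ends with a word character.
        return @"\b" + string.Join(Separator, words) + @"\b";
    }

    // Hypothetical helper: returns every matched substring of the input.
    public static List<string> FindAll(string input, string phrase)
    {
        return Regex.Matches(input, BuildPattern(phrase), RegexOptions.IgnoreCase)
            .Cast<Match>()
            .Select(m => m.Value)
            .ToList();
    }
}
```

Calling TagTolerantSearch.FindAll(TexToSearch, TextToFind) on the demo input above should return the same two substrings as the demo. Like the original separator, this allows at most one tag between two consecutive words.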