Regex match expression to find text while excluding HTML tags

Question

var TextToFind  = "The Term is Fixed";
var TexToSearch = "<head>The</head> Term is Fixed";

The expression currently used is (?mi)\\bThe Term is Fixed\\b.

How can we modify this existing pattern so that it still finds the text when HTML tags appear between the words?

Answer 1

You can remove the tags first and then search the remaining plain text:

string str = "<head>The</head> Term is Fixed";
// Strip everything from "<" up to the next ">"
string textWithoutTags = Regex.Replace(str, "<[^>]*>", string.Empty);
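
To show how this combines with the pattern from the question, here is a minimal sketch. The StripTags helper name is hypothetical, and the sample input is the one from the question. It assumes the tags are simple enough for <[^>]*> to remove them (no ">" inside attribute values).

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: removes anything that looks like a tag.
string StripTags(string html) => Regex.Replace(html, "<[^>]*>", string.Empty);

var textToSearch = "<head>The</head> Term is Fixed";

// Step 1: drop the tags so only the plain text remains.
var plain = StripTags(textToSearch);

// Step 2: run the original pattern against the plain text.
var found = Regex.IsMatch(plain, @"(?mi)\bThe Term is Fixed\b");

Console.WriteLine(plain); // The Term is Fixed
Console.WriteLine(found); // True
```

One trade-off of this approach: any match index refers to the stripped text, not to the original markup, so it suits a yes/no check better than locating the phrase inside the HTML.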

Answer 2

To match all the substrings you have with tags or whitespace between the words, you may dynamically construct a regex like

The(?>\s*<[^>]*>\s*|\s+)Term(?>\s*<[^>]*>\s*|\s+)is(?>\s*<[^>]*>\s*|\s+)Fixed

where each space is replaced with the (?>\s*<[^>]*>\s*|\s+) pattern, an atomic group that matches either

\s*<[^>]*>\s* - a <, then 0 or more chars other than >, and then a >, surrounded by 0 or more whitespace chars
| - or
\s+ - 1 or more whitespace chars.

C# demo:

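using System;
using System.Linq;
using System.Text.RegularExpressions;

var TextToFind  = "The Term is Fixed";
var TexToSearch = "<head>The</head> Term is Fixed\n<head>The</head> Term <span>is</span> Fixed";
// Escape each word, then join the words with the "tag or whitespace" separator
var regex = string.Join(@"(?>\s*<[^>]*>\s*|\s+)", TextToFind.Split().Select(w => Regex.Escape(w)));
var result = Regex.Matches(TexToSearch, regex).Cast<Match>().Select(x => x.Value);
foreach (var s in result)
    Console.WriteLine(s);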

Output:

The</head> Term is Fixed
The</head> Term <span>is</span> Fixed
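
The dynamically built pattern can also keep the (?mi) options and the \b word boundaries from the question. The sketch below is one possible way to do that; BuildPattern is a hypothetical helper name, and the mixed-case input is made up for illustration. As in the pattern above, only a single tag is tolerated between two words.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper: turns a plain phrase into a pattern that accepts
// either one tag (with optional whitespace) or whitespace between words.
string BuildPattern(string phrase)
{
    const string gap = @"(?>\s*<[^>]*>\s*|\s+)";
    var words = phrase
        .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
        .Select(w => Regex.Escape(w)); // keep regex metacharacters literal
    // Reuse the options and word boundaries from the question's pattern.
    return @"(?mi)\b" + string.Join(gap, words) + @"\b";
}

var pattern = BuildPattern("The Term is Fixed");
var input = "<head>the</head> term <span>IS</span> fixed";
var match = Regex.Match(input, pattern);

Console.WriteLine(match.Success); // True
Console.WriteLine(match.Value);   // the</head> term <span>IS</span> fixed
```

The match begins at the first word, so a tag that precedes the phrase (here <head>) is not included in the matched value.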

solution1
0 2020-08-27 07:07:27

solution2
0 2020-08-27 09:14:13
