C# Regex find string between two different pairs of strings
Using C# Regex, I am trying to find text enclosed by two distinct pairs of words, say start1 ... end1 and start2 ... end2. In my example below I would like to get: text1, text2, text11, text22.
string str = "This start1 text1 end1. And start2 text2 end2 is a test. This start1 text11 end1. And start2 text22 end2 is a test.";
Regex oRegEx = new Regex(@"start1(.*?)end1|start2(.*?)end2", RegexOptions.IgnoreCase);
MatchCollection oMatches = oRegEx.Matches(str);
if (oMatches.Count > 0)
{
    foreach (Match mt in oMatches)
    {
        Console.WriteLine(mt.Value); // includes start1 and end1 (or start2 and end2)
        Console.WriteLine(mt.Groups[1].Value); // excludes the delimiters, or is an empty string, depending on the order of the alternatives in the pattern
    }
}
mt.Groups[1].Value in the above code correctly displays text1 and text11 if the pattern is @"start1(.*?)end1|start2(.*?)end2", but it displays empty strings for text2 and text22.
On the other hand, if I change the order in the pattern to @"start2(.*?)end2|start1(.*?)end1", it correctly displays text2 and text22 but displays empty strings for text1 and text11.
What needs to change in my code? This MSDN article explains something about when a group returns an empty string, but I am still not getting the desired results.
Give both groups the same name:
start1(?<val>.*?)end1|start2(?<val>.*?)end2
And get the value as:
mt.Groups["val"].Value
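Put together, the named-group approach might look like the sketch below. The class name NamedGroupDemo, the Extract helper, and the Trim() call are illustrative assumptions rather than part of the original answer; Trim() is only there because the capture also picks up the spaces next to the delimiters.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class NamedGroupDemo
{
    // Hypothetical helper: returns the text found between start1/end1 or
    // start2/end2, in order of appearance.
    public static List<string> Extract(string input)
    {
        // Both alternatives capture into the same named group "val",
        // so the lookup does not depend on which alternative matched.
        var regex = new Regex(@"start1(?<val>.*?)end1|start2(?<val>.*?)end2",
                              RegexOptions.IgnoreCase);
        var results = new List<string>();
        foreach (Match mt in regex.Matches(input))
        {
            // Trim() removes the spaces between the delimiters and the text.
            results.Add(mt.Groups["val"].Value.Trim());
        }
        return results;
    }

    public static void Main()
    {
        string str = "This start1 text1 end1. And start2 text2 end2 is a test. " +
                     "This start1 text11 end1. And start2 text22 end2 is a test.";
        foreach (string s in Extract(str))
        {
            Console.WriteLine(s);
        }
    }
}
```

Reusing one group name across alternatives is allowed in .NET regular expressions, which is what makes a single Groups["val"] lookup sufficient here.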
The original problem is that without names the group between start1 and end1 has index 1, and the group between start2 and end2 has index 2. So mt.Groups[1] is empty whenever the second alternative is the one that matched.
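To see that numbering in practice, one option is to keep the unnamed pattern and check which group actually took part in each match. This is a minimal sketch; the class GroupIndexDemo and the helper CapturedText are hypothetical names introduced for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupIndexDemo
{
    // Hypothetical helper: returns the value of whichever numbered group
    // (1 or 2) participated in the match.
    public static string CapturedText(Match mt)
    {
        // Group.Success is false for the group that belongs to the
        // alternative that did not match.
        Group g = mt.Groups[1].Success ? mt.Groups[1] : mt.Groups[2];
        return g.Value.Trim();
    }

    public static void Main()
    {
        string str = "This start1 text1 end1. And start2 text2 end2 is a test. " +
                     "This start1 text11 end1. And start2 text22 end2 is a test.";
        var regex = new Regex(@"start1(.*?)end1|start2(.*?)end2", RegexOptions.IgnoreCase);
        foreach (Match mt in regex.Matches(str))
        {
            // Shows, per match, which of the two groups was filled.
            Console.WriteLine("group 1 matched: {0}, group 2 matched: {1}, text: {2}",
                mt.Groups[1].Success, mt.Groups[2].Success, CapturedText(mt));
        }
    }
}
```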
Another solution is to use a regex with a lookbehind and a lookahead, so that the delimiters are not part of the match:
(?<=start([12])).*?(?=end\1)
And then in your code:
Console.WriteLine(mt.Value);
will display the required content.
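A runnable sketch of the lookaround variant follows. The names LookaroundDemo and Extract are assumptions made for this example, and Trim() is added because the match still contains the spaces around the text. Note that the backreference \1 ties end1 to start1 and end2 to start2, so the two pairs cannot be mixed.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LookaroundDemo
{
    // Hypothetical helper: returns the text between startN and endN (N = 1 or 2).
    public static List<string> Extract(string input)
    {
        // (?<=start([12])) requires "start1" or "start2" just before the match
        // and captures the digit; (?=end\1) requires "end" plus that same digit
        // just after it. Neither delimiter is included in mt.Value.
        var regex = new Regex(@"(?<=start([12])).*?(?=end\1)", RegexOptions.IgnoreCase);
        var results = new List<string>();
        foreach (Match mt in regex.Matches(input))
        {
            results.Add(mt.Value.Trim());
        }
        return results;
    }

    public static void Main()
    {
        string str = "This start1 text1 end1. And start2 text2 end2 is a test. " +
                     "This start1 text11 end1. And start2 text22 end2 is a test.";
        foreach (string s in Extract(str))
        {
            Console.WriteLine(s);
        }
    }
}
```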