Replace Nth regex match occurrence in string

I know there are quite a few of these questions on SO, but I can't find one that breaks down how the pattern was built to return the Nth match. All the answers I looked at just give the code to the OP with minimal explanation.

What I know is that you need to put {X} in the pattern, where X is the number of the occurrence you want to return.

So I am trying to match a string between two chars, and I seem to have gotten that working.

The string to be tested looks something like this, followed by the pattern I am using:

"=StringOne&=StringTwo&=StringThree&=StringFour&"

"[^/=]+(?=&)"

Again, after reading as much as I could, this pattern will also return all matches:

[^/=]+(?=&){1}

This is due to {1} being the default and therefore redundant in the above pattern. But I can't do this:

[^/=]+(?=&){2}

It does not return the 3rd match as I was expecting it to.

So could someone please shove me in the right direction and explain how to build the pattern needed to find a specific occurrence of the match?

A pure regex way is possible, but it is not very efficient if your pattern is complex.

var s = "=StringOne&=StringTwo&=StringThree&=StringFour&";
var idx = 2;     // Replace this occurrence
var result = Regex.Replace(s, $@"^(=(?:[^=&]+&=){{{idx-1}}})[^=&]+", "${1}REPLACED");
Console.WriteLine(result); // => =StringOne&=REPLACED&=StringThree&=StringFour&

See this C# demo and the regex demo.

Regex details

  • ^ - start of string
  • (=(?:[^=&]+&=){1}) - Group 1 capturing:
    • = - a = symbol
    • (?:[^=&]+&=){1} - 1 occurrence (this number is generated dynamically) of:
      • [^=&]+ - 1 or more chars other than = and & (NOTE: in case the string may contain = and &, it is safer to replace it with .*? and pass the RegexOptions.Singleline option to the regex compiler)
      • &= - a &= substring
  • [^=&]+ - 1 or more chars other than = and &

The ${1} in the replacement pattern inserts the contents of Group 1 back into the resulting string.
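To make the dynamic part of the pattern easier to see, here is a minimal sketch that wraps the same regex in a helper. The names NthFieldReplacer, BuildPattern, and ReplaceNth are hypothetical (they are not part of the original answer), and the sketch assumes the input always has the =value&=value& shape shown above. It uses a match evaluator instead of the ${1} replacement string so that a replacement containing $ is inserted literally.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NthFieldReplacer
{
    // Builds the pattern for a 1-based occurrence index.
    // occurrence = 2 yields ^(=(?:[^=&]+&=){1})[^=&]+
    public static string BuildPattern(int occurrence)
    {
        if (occurrence < 1)
            throw new ArgumentOutOfRangeException(nameof(occurrence));
        // {{ and }} are literal braces; {occurrence - 1} is interpolated.
        return $@"^(=(?:[^=&]+&=){{{occurrence - 1}}})[^=&]+";
    }

    // Replaces the Nth value; returns the input unchanged if there is no Nth value.
    public static string ReplaceNth(string input, int occurrence, string replacement)
    {
        return Regex.Replace(input, BuildPattern(occurrence),
            m => m.Groups[1].Value + replacement); // keep the prefix, swap the value
    }

    public static void Main()
    {
        var s = "=StringOne&=StringTwo&=StringThree&=StringFour&";
        Console.WriteLine(BuildPattern(3));
        Console.WriteLine(ReplaceNth(s, 3, "REPLACED"));
    }
}
```

The quantifier counts how many complete value&= pairs are skipped before the target, which is why the number placed in the braces is one less than the occurrence you want.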

As an alternative, I can suggest introducing a counter, incrementing it upon each match, and only replacing the match when the counter is equal to the occurrence you specify.

Use

var s = "=StringOne&=StringTwo&=StringThree&=StringFour&";
var idx_to_replace = 2; // Replace this occurrence
var cnt = 0;            // Counter
var result = Regex.Replace(s, "[^=]+(?=&)", m => {  // Match evaluator
        cnt++; return cnt == idx_to_replace ? "REPLACED" : m.Value; });
Console.WriteLine(result); 
// => =StringOne&=REPLACED&=StringThree&=StringFour&

See the C# demo.

The cnt is incremented inside the match evaluator passed to Regex.Replace, and m is assigned the current Match object. When cnt is equal to idx_to_replace, the replacement occurs; otherwise, the whole match is pasted back (with m.Value).
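The counter approach works with any pattern, so it can be packaged as a small reusable method. The following is only a sketch under that assumption; CountingReplacer and ReplaceNthMatch are hypothetical names chosen for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CountingReplacer
{
    // Replaces only the Nth (1-based) match of pattern; all other matches are kept as-is.
    public static string ReplaceNthMatch(string input, string pattern,
                                         int occurrence, string replacement)
    {
        var count = 0; // captured by the lambda and shared across all matches
        return Regex.Replace(input, pattern,
            m => ++count == occurrence ? replacement : m.Value);
    }

    public static void Main()
    {
        var s = "=StringOne&=StringTwo&=StringThree&=StringFour&";
        Console.WriteLine(ReplaceNthMatch(s, "[^=]+(?=&)", 2, "REPLACED"));
    }
}
```

Because the counter is a local variable of the method, each call starts counting from zero, which avoids stale state when the helper is called repeatedly.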

Another approach is to iterate through the matches and, once the Nth match is found, replace it by splitting the string into the parts before and after the match, breaking out of the loop once the replacement is done:

var s = "=StringOne&=StringTwo&=StringThree&=StringFour&";
var idx_to_replace = 2;     // Replace this occurrence
var cnt = 0;                // Counter
var result = string.Empty;  // Final result variable
var rx = "[^=]+(?=&)";      // Pattern
for (var m=Regex.Match(s, rx); m.Success; m = m.NextMatch())
{
    cnt++;
    if (cnt == idx_to_replace) {
        result = $"{s.Substring(0, m.Index)}REPLACED{s.Substring(m.Index+m.Length)}";
        break;
    }
}
Console.WriteLine(result); // => =StringOne&=REPLACED&=StringThree&=StringFour&

See another C# demo.

This might be quicker since the engine does not have to find all matches.
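One thing to watch with the loop version is that result stays empty when the string has fewer than N matches. A possible variation, sketched here with the hypothetical names EarlyExitReplacer and TryReplaceNth, reports whether a replacement happened and falls back to the original input otherwise.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EarlyExitReplacer
{
    // Walks matches one at a time and stops as soon as the Nth (1-based) one is found.
    public static bool TryReplaceNth(string input, string pattern, int occurrence,
                                     string replacement, out string result)
    {
        var count = 0;
        for (var m = Regex.Match(input, pattern); m.Success; m = m.NextMatch())
        {
            if (++count != occurrence) continue;
            // Text before the match + replacement + text after the match.
            result = input.Substring(0, m.Index) + replacement
                   + input.Substring(m.Index + m.Length);
            return true;
        }
        result = input; // fewer than `occurrence` matches: keep the input as-is
        return false;
    }

    public static void Main()
    {
        var s = "=StringOne&=StringTwo&=StringThree&=StringFour&";
        var replaced = TryReplaceNth(s, "[^=]+(?=&)", 2, "REPLACED", out var result);
        Console.WriteLine($"{replaced}: {result}");
    }
}
```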
