C# Regular Expressions

I have a string that contains multiple regular expression groups, plus some parts that aren't in any group. I need to replace a character (in this case ^) only within the groups, and not in the parts of the string that fall outside a regex group.

Here's the input string:

STARTDONTREPLACEME^ENDDONTREPLACEME~STARTREPLACEME^ENDREPLACEME~STARTREPLACEME^BLAH^ENDREPLACEME~STARTDONTREPLACEME^BLAH^ENDDONTREPLACEME~

Here's what the output string should look like:

STARTDONTREPLACEME^ENDDONTREPLACEME~STARTREPLACEMEENDREPLACEME~STARTREPLACEMEBLAHENDREPLACEME~STARTDONTREPLACEME^BLAH^ENDDONTREPLACEME~

I need to do it using C# and can use regular expressions.

I can match the string into groups of those that should and shouldn't be replaced, but am struggling with how to return the final output string.

I'm not sure I get exactly what you're having trouble with, but it didn't take long to come up with this result:

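// Requires: using System.Text.RegularExpressions;
// Lazy (.+?) keeps each match inside a single STARTREPLACEME...ENDREPLACEME group.
string strRegex = @"STARTREPLACEME(.+?)ENDREPLACEME";
RegexOptions myRegexOptions = RegexOptions.None;
Regex myRegex = new Regex(strRegex, myRegexOptions);
string strTargetString = @"STARTDONTREPLACEME^ENDDONTREPLACEME~STARTREPLACEME^ENDREPLACEME~STARTREPLACEME^BLAH^ENDREPLACEME~STARTDONTREPLACEME^BLAH^ENDDONTREPLACEME~";

// Rebuild each group with every ^ stripped from the captured text.
return myRegex.Replace(strTargetString,
    m => "STARTREPLACEME" + m.Groups[1].Value.Replace("^", "") + "ENDREPLACEME");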

I used my favorite online regex tool: http://regexhero.net/tester/

Is that helpful?

Regex rgx = new Regex(
  @"\^(?=(?>(?:(?!(?:START|END)(?:DONT)?REPLACEME).)*)ENDREPLACEME)");

string s1 = rgx.Replace(s0, String.Empty);

Explanation: Each time a ^ is found, the lookahead scans ahead for an ending delimiter (ENDREPLACEME). If it finds one without seeing any of the other delimiters first, the match must have occurred inside a REPLACEME group. If the lookahead reports failure, the ^ was found either between groups or within a DONTREPLACEME group.

Because lookaheads are zero-width assertions, only the ^ is actually consumed when a match succeeds.

Be aware that this only works if delimiters are always properly balanced and groups are never nested within other groups.
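To check the explanation above against the question's data, here is a minimal sketch that runs the lookahead pattern on the sample input. It assumes a C# 9+ project with top-level statements; the variable `expected` is my own addition and simply holds the output string from the question.

```csharp
using System;
using System.Text.RegularExpressions;

// Sample input and desired output, copied from the question.
string s0 = "STARTDONTREPLACEME^ENDDONTREPLACEME~STARTREPLACEME^ENDREPLACEME~"
          + "STARTREPLACEME^BLAH^ENDREPLACEME~STARTDONTREPLACEME^BLAH^ENDDONTREPLACEME~";
string expected = "STARTDONTREPLACEME^ENDDONTREPLACEME~STARTREPLACEMEENDREPLACEME~"
                + "STARTREPLACEMEBLAHENDREPLACEME~STARTDONTREPLACEME^BLAH^ENDDONTREPLACEME~";

// Match a ^ only when ENDREPLACEME is the next delimiter ahead of it.
// The inner negative lookahead stops the scan at any START/END delimiter.
Regex rgx = new Regex(
    @"\^(?=(?>(?:(?!(?:START|END)(?:DONT)?REPLACEME).)*)ENDREPLACEME)");

// Only the ^ itself is consumed, so replacing with an empty string deletes it.
string s1 = rgx.Replace(s0, String.Empty);

Console.WriteLine(s1);
Console.WriteLine(s1 == expected ? "matches expected output" : "differs from expected output");
```

Note that `.` does not match a newline by default, so input spanning several lines would probably need `RegexOptions.Singleline`.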

If you are able to separate the string into groups that should be replaced and groups that shouldn't, then instead of providing a single replacement string you should be able to use a MatchEvaluator (a delegate that takes a Match and returns a string) to decide which case it is currently dealing with and return the replacement string for that group alone.

You may also use an additional regex inside the MatchEvaluator. This solution produces the expected output:
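As a rough sketch of that idea (not the only way to do it), the snippet below matches both kinds of group with one pattern and lets the MatchEvaluator pick the case. The group name `dont` and the variable names are assumptions of mine, and it relies on delimiters being balanced and never nested. It assumes C# 9+ top-level statements.

```csharp
using System;
using System.Text.RegularExpressions;

// Sample input and desired output, copied from the question.
string input = "STARTDONTREPLACEME^ENDDONTREPLACEME~STARTREPLACEME^ENDREPLACEME~"
             + "STARTREPLACEME^BLAH^ENDREPLACEME~STARTDONTREPLACEME^BLAH^ENDDONTREPLACEME~";
string expected = "STARTDONTREPLACEME^ENDDONTREPLACEME~STARTREPLACEMEENDREPLACEME~"
                + "STARTREPLACEMEBLAHENDREPLACEME~STARTDONTREPLACEME^BLAH^ENDDONTREPLACEME~";

// One pattern covers both group types. The optional named group "dont"
// (a hypothetical name) records whether the match is a DONTREPLACEME group.
// The lazy .*? stops at the first closing delimiter.
string pattern = @"START(?<dont>DONT)?REPLACEME.*?END(?:DONT)?REPLACEME";

// The evaluator is called once per match and returns that match's replacement.
MatchEvaluator stripCarets = m =>
    m.Groups["dont"].Success
        ? m.Value                   // DONTREPLACEME group: return it unchanged
        : m.Value.Replace("^", ""); // REPLACEME group: drop every ^

string result = Regex.Replace(input, pattern, stripCarets);

Console.WriteLine(result);
```

Text between groups (the ~ separators here) is never matched, so it passes through untouched.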

// Lazy .+? stops at the first ENDREPLACEME, so each match covers a single group.
Regex outer = new Regex(@"STARTREPLACEME.+?ENDREPLACEME", RegexOptions.Compiled);
Regex inner = new Regex(@"\^", RegexOptions.Compiled);

string replaced = outer.Replace(start, m =>
{
    return inner.Replace(m.Value, String.Empty);
});


 