
How to replace a partially unknown string


I need to replace (or better, delete) a string where I know the beginning and the end.
Some characters are unknown, as is the length of the string.
Of course I could work with Substring and other C# string operations, but isn't there a simple wildcard option for Replace?

mystring.Replace("O(*)", "");

would be a nice option.
I know that the string begins with O( and ends with ) .
It's possible that the string looks like O(something);QG(anything else)
Here the result should be ;QG(anything else)

Is this possible with a simple replace?
And what about the advanced case, where the string occurs more than once, like here:
O(something);O(someone);QG(anything else)

Take a look at regular expressions.

The following will handle this case:

var result = Regex.Replace(originalString, @"O\(.*?\)", "");
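As a quick check, here is a minimal sketch that runs that same pattern over the two inputs from the question. RemoveOBlocks is just a hypothetical helper name I made up for the example; the only real API used is Regex.Replace.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper (not from the original answer): wraps the Regex.Replace call.
string RemoveOBlocks(string input) => Regex.Replace(input, @"O\(.*?\)", "");

// Single occurrence, as in the first example of the question.
string single = RemoveOBlocks("O(something);QG(anything else)");

// Several occurrences: Regex.Replace replaces every match, not just the first.
string multiple = RemoveOBlocks("O(something);O(someone);QG(anything else)");

Console.WriteLine(single);   // ;QG(anything else)
Console.WriteLine(multiple); // ;;QG(anything else)
```

Note that only the O(...) parts are removed, so the separating semicolons stay behind.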

What it means:

  • @ - switch off C# interpreting \ as an escape character. Without it, the compiler would see our \( and try to replace it with another character, as it does when \n becomes a newline (and there is no \( escape, so it's a compiler error). Regex also uses \ as its escape character, so without the @ every backslash meant for the regex has to be written as a double backslash in C#, which makes regex patterns harder to read
  • " start of the C# string
  • O\( literal character O followed by literal character ( (brackets have special meaning in regex, so the backslash disables the special meaning)
  • .*? match zero or more of any character (lazy/pessimistic)
  • \) literal )
  • " end of the string
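To illustrate the point about @, here is a small sketch comparing the verbatim literal with the equivalent regular literal, where each backslash has to be doubled. The variable names are only for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

string verbatim = @"O\(.*?\)";  // verbatim literal: backslashes are kept as written
string escaped = "O\\(.*?\\)";  // regular literal: each backslash must be doubled

// Both literals produce exactly the same pattern text for the regex engine.
bool samePattern = verbatim == escaped;

string cleaned = Regex.Replace("O(something);QG(anything else)", escaped, "");

Console.WriteLine(samePattern); // True
Console.WriteLine(cleaned);     // ;QG(anything else)
```

Either form works; the verbatim one is usually easier to read once the pattern contains several escapes.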

.*? is a complex thing that warrants a bit more explanation:

In regex, . means "match any single character" (by default, anything except a newline), and * means "zero or more of the previous element". In this way .* means "zero or more of any character".

So what's the ? for?

By default regex * is "greedy": a .* will eat the entire input string and then start working backwards, spitting characters back out and checking for a match. If you had two in succession, like you put:

K(hello);O(mystring);O(otherstring);L(byebye)

If you match it greedily, O\(.*\) will match the initial O(, then consume all the remaining input, then spit the trailing ) back out and declare that it has found a match, so the .* matches mystring);O(otherstring);L(byebye

We don't want this. Instead we want it to work forwards a character at a time, looking for a matching ) . Putting the ? after the * changes it from greedy mode to pessimistic (lazy) mode, and the input is scanned forwards rather than zipping to the end and scanning backwards. This means that the .*? in O\(.*?\) matches mystring and then later otherstring , leaving a result of K(hello);;;L(byebye) , rather than K(hello);
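The difference is easy to see by running both patterns side by side. This is only a sketch using the sample input from above; the variable names are mine.

```csharp
using System;
using System.Text.RegularExpressions;

string input = "K(hello);O(mystring);O(otherstring);L(byebye)";

// Greedy: one match that runs from the first "O(" to the last ")" in the input.
string greedyMatch = Regex.Match(input, @"O\(.*\)").Value;
string greedy = Regex.Replace(input, @"O\(.*\)", "");

// Lazy: each match stops at the first ")" after its "O(".
string lazy = Regex.Replace(input, @"O\(.*?\)", "");

Console.WriteLine(greedyMatch); // O(mystring);O(otherstring);L(byebye)
Console.WriteLine(greedy);      // K(hello);
Console.WriteLine(lazy);        // K(hello);;;L(byebye)
```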
