Is there a regex double-quote matching syntax that can be used without modification in C#?
The following simple regex includes four double quotes that must be matched. I'm not attempting to come up with a solution for this particular regex; I'm merely using it as a general example:
\s*"Hello"\s*"world"\s*
The problem I've always encountered when writing C# code containing regexes that must match double quotes is the cumbersome syntax I've had to use, because string literals in C# are double-quote delimited. I've used the two different techniques below, neither of which I like. Aside from the additional complexity required to butcher the original regex into acceptable C# syntax, converting that syntax back into the original regex for further development is a real pain. Is there any form that would be equally acceptable to both the regex engine and the C# language parser?
The first hack uses escape characters to escape the backslashes and double quotes that must appear literally in the regex. I view this as the most error-prone approach because you get buried in backslashes for more complex regexes:
"\\s*\"Hello\"\\s*\"world\"\\s*"
The second hack breaks the original regex into multiple pieces and concatenates them. Pieces that are string literals and contain regex backslashes are preceded by an @ character, which causes the backslashes to be taken literally rather than as escape characters. I view this as more verbose but less error-prone than the previous approach:
@"\s*" + '"' + "Hello" + '"' + @"\s*" + '"' + "world" + '"' + @"\s*"
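As a quick sanity check, the two hacks can be compared directly. The sketch below assumes a C# 9 or later compiler (for top-level statements); the variable names and the sample input are made up for illustration and are not from the original regex work.

```csharp
using System;
using System.Text.RegularExpressions;

// First hack: regular string literal with every backslash and double quote escaped.
string escaped = "\\s*\"Hello\"\\s*\"world\"\\s*";

// Second hack: verbatim pieces concatenated with '"' character literals.
string concatenated = @"\s*" + '"' + "Hello" + '"' + @"\s*" + '"' + "world" + '"' + @"\s*";

// Both spellings should yield the same pattern text for the regex engine.
bool samePattern = escaped == concatenated;
Console.WriteLine(escaped);
Console.WriteLine(samePattern);

// Hypothetical sample input (an assumption for this sketch).
string sample = "  \"Hello\" \"world\"  ";
bool matches = Regex.IsMatch(sample, escaped);
Console.WriteLine(matches);
```

Printing the escaped string is a cheap way to recover the original regex text when it has to go back into a regex tool.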
I have discovered that by using the hex escape sequence \x22 to represent a double quote, the same regex string can be used unaltered both in my regex development application (RegexBuddy) and in a C# string literal. That is, in my development application
\s*"Hello"\s*"world"\s*
can be represented directly as
\s*\x22Hello\x22\s*\x22world\x22\s*
and in a C# string literal the same regex string can be represented as
@"\s*\x22Hello\x22\s*\x22world\x22\s*"
The string is still cluttered, but at least no changes are required.
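A minimal sketch of the \x22 form in use, assuming a C# 9 or later compiler for top-level statements; the sample input and variable names are hypothetical. Because the literal is verbatim, the C# compiler leaves \x22 untouched, and it is the .NET regex engine that interprets it as the double-quote character.

```csharp
using System;
using System.Text.RegularExpressions;

// The pattern text is identical to what would be pasted into a regex tool.
string pattern = @"\s*\x22Hello\x22\s*\x22world\x22\s*";

// The C# string itself contains no double-quote character, only the \x22 escapes.
bool hasLiteralQuote = pattern.IndexOf('"') >= 0;
Console.WriteLine(hasLiteralQuote);

// Hypothetical sample input (an assumption for this sketch).
string sample = "say \"Hello\" \"world\" now";
Match m = Regex.Match(sample, pattern);
Console.WriteLine(m.Success);
```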