
Is there a regex double-quote matching syntax that can be used without modification in C#?

The following simple regex includes four double quotes that must be matched. I'm not looking for a solution to this particular regex; I'm using it only as a general example:

\s*"Hello"\s*"world"\s*

The problem I've always encountered when writing C# code containing regexes that must match double quotes is the cumbersome syntax required, because string literals in C# are themselves delimited by double quotes. I've used the two techniques below, and I like neither. Beyond the extra work of butchering the original regex into acceptable C# syntax, converting that syntax back into the original regex for further development is a real pain. Is there any form that would be equally acceptable to both the regex engine and the C# language parser?

The first hack uses escape characters to escape the backslashes and double quotes that must appear literally in the regex. I view this as the most error-prone approach because you get buried in backslashes in more complex regexes:

"\\s*\"Hello\"\\s*\"world\"\\s*"

The second hack breaks the original regex into multiple pieces and concatenates them. Pieces that are string literals and contain regex backslashes are preceded by an @ character so that the backslashes are taken literally rather than as escape characters. I view this as more verbose but less error-prone than the previous approach:

@"\s*" + '"' + "Hello" + '"' + @"\s*" + '"' + "world" + '"' + @"\s*"
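
As a quick check that the two hacks above really describe the same regex, here is a minimal sketch that builds both forms and compares them. The class name QuoteHacks, the member names, and the sample input are made up for illustration; only the two string expressions come from the question.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class; not part of the original question.
public static class QuoteHacks
{
    // Hack 1: regular string literal, every backslash and double quote escaped.
    public const string Escaped = "\\s*\"Hello\"\\s*\"world\"\\s*";

    // Hack 2: verbatim pieces for the backslashes, char literals for the quotes.
    public static readonly string Concatenated =
        @"\s*" + '"' + "Hello" + '"' + @"\s*" + '"' + "world" + '"' + @"\s*";

    public static void Main()
    {
        // Both expressions produce the same characters at run time.
        Console.WriteLine(Escaped == Concatenated);

        // The resulting string is the original regex text.
        Console.WriteLine(Escaped);

        // Sample input (an assumption) containing the two quoted words.
        Console.WriteLine(Regex.IsMatch("  \"Hello\" \"world\"  ", Escaped));
    }
}
```

Either form works; the difference is only in how much editing is needed to get from the raw regex to the C# source and back.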

@"\s*""Hello""\s*""world""\s*" gives the string \s*"Hello"\s*"world"\s*. Simply double each double quote inside an @-prefixed string (a.k.a. a verbatim string) to produce a single literal double quote.
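
A minimal sketch of the doubled-quote form in use, assuming a made-up class name (VerbatimQuotes) and sample input; the pattern literal itself is the one given in this answer.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class; not part of the original answer.
public static class VerbatimQuotes
{
    // In a verbatim string, "" stands for one literal double quote,
    // and backslashes need no escaping.
    public const string Pattern = @"\s*""Hello""\s*""world""\s*";

    public static void Main()
    {
        // Shows the text the regex engine actually receives.
        Console.WriteLine(Pattern);

        // Sample input (an assumption) with the quoted words inside a sentence.
        Match m = Regex.Match("say \"Hello\"   \"world\" now", Pattern);
        Console.WriteLine(m.Success);
    }
}
```

Only the quotes change relative to the raw regex, so converting back for further development is a matter of replacing each "" with ".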


I have discovered that by using the hex escape sequence \x22 to represent a double quote, the same regex string can be used unaltered both in my regex development application (RegexBuddy) and in a C# string literal. That is, in my development application

\s*"Hello"\s*"world"\s*

can be represented directly as

\s*\x22Hello\x22\s*\x22world\x22\s*

and in a C# string literal the same regex string can be represented as

@"\s*\x22Hello\x22\s*\x22world\x22\s*"

The string is still cluttered, but at least no changes are required.
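
To illustrate that the \x22 form matches the same text as the form with literal quotes, here is a small sketch. The class name HexQuote, the helper SameResult, and the sample input are hypothetical. Note that the two C# strings are not equal: in the verbatim literal the characters \x22 reach the regex engine untouched, and it is the engine that interprets them as a double quote.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class; not part of the original answer.
public static class HexQuote
{
    // Verbatim literal: the backslashes and "x22" are passed to the regex engine as-is.
    public const string WithHex = @"\s*\x22Hello\x22\s*\x22world\x22\s*";

    // Doubled-quote form, used here only for comparison.
    public const string WithQuotes = @"\s*""Hello""\s*""world""\s*";

    // True when both patterns match exactly the same text in the input.
    public static bool SameResult(string input)
    {
        Match hex = Regex.Match(input, WithHex);
        Match quoted = Regex.Match(input, WithQuotes);
        return hex.Success == quoted.Success && hex.Value == quoted.Value;
    }

    public static void Main()
    {
        // The pattern text contains no double-quote character at all.
        Console.WriteLine(WithHex.Contains("\""));

        // Sample input (an assumption) with the two quoted words.
        Console.WriteLine(SameResult("  \"Hello\" \"world\"  "));
    }
}
```

The @ prefix matters here. In a regular (non-verbatim) C# literal the compiler would process \x itself, and because the C# \x escape accepts a variable number of hex digits, a following hex-digit character could be absorbed into the escape.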
