
Why doesn't $ in .NET multiline regular expressions match CRLF?

I have noticed the following:

var b1 = Regex.IsMatch("Line1\nLine2", "Line1$", RegexOptions.Multiline);   // true
var b2 = Regex.IsMatch("Line1\r\nLine2", "Line1$", RegexOptions.Multiline); // false

I'm confused. The documentation of RegexOptions says:

Multiline: Multiline mode. Changes the meaning of ^ and $ so they match at the beginning and end, respectively, of any line, and not just the beginning and end of the entire string.

Since C# and VB.NET are mainly used in the Windows world, I would guess that most files processed by .NET applications use CRLF linebreaks (\r\n) rather than LF linebreaks (\n). Still, it seems that the .NET regular expression parser does not recognize a CRLF linebreak as an end of line.

I know that I could work around this, for example, by matching Line1\r?$, but it still strikes me as strange. Is this really the intended behaviour of the .NET regexp parser, or did I miss some hidden UseWindowsLinebreaks option?

From MSDN:

By default, $ matches only the end of the input string. If you specify the RegexOptions.Multiline option, it matches either the newline character (\n) or the end of the input string. It does not, however, match the carriage return/line feed character combination. To successfully match them, use the subexpression \r?$ instead of just $.

http://msdn.microsoft.com/en-us/library/yd1hzczs.aspx#Multiline

So I can't say why (compatibility with regular expressions from other languages?), but at the very least it's intended.
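For illustration, here is a minimal sketch of the \r?$ workaround from the MSDN quote. The class and method names (CrlfAwareRegex, EndsALine, MatchAtLineEnd) are hypothetical helpers made up for this example and are not part of .NET. The second helper is one possible variation that uses a lookahead, so the optional \r is checked but not included in the matched text.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CrlfAwareRegex
{
    // Hypothetical helper (not part of .NET): true if `literal` sits at the end
    // of a line, whether that line ends in LF, CRLF, or the end of the input.
    public static bool EndsALine(string input, string literal)
    {
        // \r? consumes an optional carriage return so that $ can then
        // match just before the \n (or at the end of the string).
        string pattern = Regex.Escape(literal) + @"\r?$";
        return Regex.IsMatch(input, pattern, RegexOptions.Multiline);
    }

    // Hypothetical helper: same check, but the lookahead keeps the \r
    // out of the matched text.
    public static string MatchAtLineEnd(string input, string literal)
    {
        string pattern = Regex.Escape(literal) + @"(?=\r?$)";
        return Regex.Match(input, pattern, RegexOptions.Multiline).Value;
    }

    public static void Main()
    {
        Console.WriteLine(EndsALine("Line1\nLine2", "Line1"));               // True
        Console.WriteLine(EndsALine("Line1\r\nLine2", "Line1"));             // True
        Console.WriteLine(MatchAtLineEnd("Line1\r\nLine2", "Line1").Length); // 5
    }
}
```

With a plain Line1\r?$ pattern the match value would include the trailing \r on CRLF input, which is why the lookahead form may be preferable when the matched text is used afterwards.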
