Regex multiline strange behaviour

I have a string like this:

string text = "ext_bus      0  0/0/3/0.0      side         CLAIMED     INTERFACE    IDE Primary Channel\r\ntarget       0  0/0/3/0.0.0    tgt          CLAIMED     DEVICE       \r\ndisk         0  0/0/3/0.0.0.0  sdisk";

When I do a multiline regex search to get the text in the third column of the ext_bus line (0/0/3/0.0) and in its last column (IDE Primary Channel):

Regex regExp = new Regex(@"^ext_bus\s*[0-9]+\s*(?<HWPath>\S+).*\s{2,}(?<BusName>.*?)\r?$", RegexOptions.Multiline);

The first group is OK: "0/0/3/0.0"

But the second group is the next line: "target 0 0/0/3/0.0.0 tgt CLAIMED DEVICE "

How can this be possible with Multiline (I expected the match to stay on one line), and how can I get the last column (the text at the end of the line after two or more whitespace characters)?

The short answer is that the first .* in your regex matches up to the end of the first line, then \s{2,} matches the newline characters (\r\n), and then (?<BusName>.*?) matches all of the second line.

Multiline mode means that ^ and $ match the start and end of each line, not just the start and end of the whole string. It does not confine the rest of the pattern to a single line: \s still matches \r and \n.
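To make the distinction concrete, here is a small sketch using the string from the question. The class and method names (MultilineDemo, CountLineStarts, WhitespaceCrossesLineBreak) are made up for illustration; only Regex, RegexOptions.Multiline and the sample text come from the thread.

```csharp
using System;
using System.Text.RegularExpressions;

public static class MultilineDemo
{
    // Sample text copied from the question.
    public const string Text =
        "ext_bus      0  0/0/3/0.0      side         CLAIMED     INTERFACE    IDE Primary Channel\r\n" +
        "target       0  0/0/3/0.0.0    tgt          CLAIMED     DEVICE       \r\n" +
        "disk         0  0/0/3/0.0.0.0  sdisk";

    // Number of positions where ^ matches under the given options.
    public static int CountLineStarts(string input, RegexOptions options)
    {
        return Regex.Matches(input, "^", options).Count;
    }

    // True if a run of \s can bridge the end of line 1 and the start of line 2.
    public static bool WhitespaceCrossesLineBreak(string input)
    {
        return Regex.IsMatch(input, @"Channel\s{2,}target", RegexOptions.Multiline);
    }

    public static void Main()
    {
        Console.WriteLine(CountLineStarts(Text, RegexOptions.None));      // 1
        Console.WriteLine(CountLineStarts(Text, RegexOptions.Multiline)); // 3
        Console.WriteLine(WhitespaceCrossesLineBreak(Text));              // True
    }
}
```

So Multiline changes where the anchors match (once versus three times here), but \s{2,} is still free to consume the "\r\n" between two lines.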

Remove the .* and then <BusName> will be the rest of the text on the line after the whitespace following 0/0/3/0.0.
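If you want only the last column rather than the rest of the line, one possible variation (not part of the answer above, so treat it as a sketch) is to keep the .* but replace \s{2,} with [^\S\r\n]{2,}, which means "whitespace other than CR or LF" and therefore cannot cross a line break. The ExtBusParser class and TryParse method are hypothetical names used only for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ExtBusParser
{
    // [^\S\r\n]{2,}: two or more whitespace characters, excluding CR and LF,
    // so the gap before the last column has to be on the same line.
    private static readonly Regex ExtBus = new Regex(
        @"^ext_bus\s*[0-9]+\s*(?<HWPath>\S+).*[^\S\r\n]{2,}(?<BusName>.*?)\r?$",
        RegexOptions.Multiline);

    // Returns true and fills both values when an ext_bus line is found.
    public static bool TryParse(string text, out string hwPath, out string busName)
    {
        Match m = ExtBus.Match(text);
        hwPath = m.Success ? m.Groups["HWPath"].Value : null;
        busName = m.Success ? m.Groups["BusName"].Value : null;
        return m.Success;
    }

    public static void Main()
    {
        string text = "ext_bus      0  0/0/3/0.0      side         CLAIMED     INTERFACE    IDE Primary Channel\r\ntarget       0  0/0/3/0.0.0    tgt          CLAIMED     DEVICE       \r\ndisk         0  0/0/3/0.0.0.0  sdisk";
        string hwPath, busName;
        if (TryParse(text, out hwPath, out busName))
        {
            Console.WriteLine(hwPath);   // 0/0/3/0.0
            Console.WriteLine(busName);  // IDE Primary Channel
        }
    }
}
```

One caveat with this sketch: if the ext_bus line ended in trailing spaces (as the target line does), the greedy .* would stop just before them and BusName would come back empty, so trim the input lines first if that can happen.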

Why do you use regex?

You can do it easily with Split:

string value = "ext_bus      0  0/0/3/0.0      side         CLAIMED     INTERFACE    IDE Primary Channel\r\ntarget       0  0/0/3/0.0.0    tgt          CLAIMED     DEVICE       \r\ndisk         0  0/0/3/0.0.0.0  sdisk";
char[] delimiters = new char[] { ' ', '\r', '\n' }; // add more separators here if needed
string[] parts = value.Split(delimiters, StringSplitOptions.RemoveEmptyEntries);
for (int i = 0; i < parts.Length; i++)
{
    Console.WriteLine(parts[i]);
}
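Splitting on single spaces also breaks "IDE Primary Channel" into three tokens. A hedged variation, assuming columns are always separated by two or more spaces as in the sample: split the text into lines first, then split each line on runs of two or more spaces. ColumnSplitter and SplitColumns are illustrative names, not something from the answer above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ColumnSplitter
{
    // Splits one line into columns separated by two or more spaces.
    // Single spaces inside a column ("IDE Primary Channel") are preserved.
    public static string[] SplitColumns(string line)
    {
        return Regex.Split(line.Trim(), " {2,}");
    }

    public static void Main()
    {
        string value = "ext_bus      0  0/0/3/0.0      side         CLAIMED     INTERFACE    IDE Primary Channel\r\ntarget       0  0/0/3/0.0.0    tgt          CLAIMED     DEVICE       \r\ndisk         0  0/0/3/0.0.0.0  sdisk";

        // Break the text into lines before looking at columns.
        string[] lines = value.Split(new[] { "\r\n", "\n" }, StringSplitOptions.RemoveEmptyEntries);
        foreach (string line in lines)
        {
            string[] columns = SplitColumns(line);
            if (columns.Length > 2 && columns[0] == "ext_bus")
            {
                Console.WriteLine(columns[2]);                  // 0/0/3/0.0
                Console.WriteLine(columns[columns.Length - 1]); // IDE Primary Channel
            }
        }
    }
}
```

This relies on the two-space convention holding for every column; if a column value could itself contain a double space, a fixed-width parse would be safer.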
