
Multiline Group Regex in C#

I've searched and searched, but I can't work out what I'm doing wrong. I am trying to obtain the "image name" from each "block" in a text file using Regex in C#. Here's what the text looks like:

begin block Block_test
  LowFlight_005_001  strip_id 5
  LowFlight_005_002  strip_id 5
  LowFlight_006_005  strip_id 6
  LowFlight_006_004  strip_id 6
  LowFlight_006_003  strip_id 6
  LowFlight_006_002  strip_id 6
  LowFlight_006_001  strip_id 6
  LowFlight_007_001  strip_id 7
  LowFlight_007_002  strip_id 7
  LowFlight_007_003  strip_id 7
  LowFlight_007_004  strip_id 7
  LowFlight_007_005  strip_id 7
  LowFlight_007_011  strip_id 7
  LowFlight_007_012  strip_id 7
  LowFlight_007_013  strip_id 7
  LowFlight_007_014  strip_id 7
end block

I'm using this regex:

begin block Block_test\n(  (?<image>.*?)  (.*?\n))*end block

BUT! The named group image is always just the last image, i.e. LowFlight_007_014. How do I select the image from each line? I've tried using the multiline flag and inserting line begins and ends, like so:

begin block Block_test\n(^  (?<image>.*?)  (.*?$\n))*end block

It doesn't help. Help me, regex wizards! I created an account just for this. Of course I can grab the whole list of images, split on newline and then clean the string array, but I would love to do it all in regex, for SCIENCE!

If you don't need to worry about other lines in the file, or multiple blocks, the simplest regex I can think of is:

new Regex(@"  (?<image>\w*)  ");

which will produce multiple matches, each with one "image" group.
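As a minimal sketch of how that pattern might be used, the helper below walks every match returned by Regex.Matches and collects the "image" group. The class name ImageNameMatches and the method name Extract are made up for illustration; only the pattern comes from the answer.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class ImageNameMatches
{
    // Pattern from the answer: two spaces, a run of word characters, two spaces.
    private static readonly Regex ImagePattern = new Regex(@"  (?<image>\w*)  ");

    // Returns the "image" group of every match, in the order they appear.
    public static List<string> Extract(string text)
    {
        var names = new List<string>();
        foreach (Match m in ImagePattern.Matches(text))
        {
            names.Add(m.Groups["image"].Value);
        }
        return names;
    }
}
```

Calling ImageNameMatches.Extract on the file contents should give one entry per image line, provided nothing else in the file has a word surrounded by double spaces.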

If you have to handle multiple blocks in one input, I suspect you will need to use multiple regexes: one to split the input into blocks, and another to find the images.
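A rough sketch of that two-step idea, under some assumptions: every block starts with "begin block" followed by a name without whitespace, ends with "end block", and block names are unique (a repeated name would overwrite the earlier entry). The class name BlockImageReader, the method Read and the group names "name" and "body" are invented for this example.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper illustrating the two-regex approach.
public static class BlockImageReader
{
    // Step 1: one match per block. Singleline lets '.' cross line breaks,
    // and the lazy '.*?' stops at the first "end block".
    private static readonly Regex BlockPattern = new Regex(
        @"begin block (?<name>\S+)(?<body>.*?)end block",
        RegexOptions.Singleline);

    // Step 2: image names inside a single block body.
    private static readonly Regex ImagePattern = new Regex(@"  (?<image>\w+)  ");

    // Maps each block name to the image names found in that block.
    public static Dictionary<string, List<string>> Read(string text)
    {
        var result = new Dictionary<string, List<string>>();
        foreach (Match block in BlockPattern.Matches(text))
        {
            var images = new List<string>();
            foreach (Match m in ImagePattern.Matches(block.Groups["body"].Value))
            {
                images.Add(m.Groups["image"].Value);
            }
            result[block.Groups["name"].Value] = images;
        }
        return result;
    }
}
```

Keeping the block regex and the image regex separate means each one stays simple, at the cost of scanning each block body a second time.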

If you need to find only the images from lines within blocks, then your answer seems to be in your comment: 如果您只需要从块中的行中查找图像,那么您的答案似乎在您的评论中:

begin block Block_test\r\n(  (?<image>.*?) (.*\r\n))*end block 

although you might consider using \w:

begin block Block_test\r\n(  (?<image>\w*) (.*\r\n))*end block 
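In .NET, a group that sits inside a repeated construct keeps every iteration in Group.Captures, while Group.Value reports only the last one. That is why the named group in the question appears to hold just LowFlight_007_014. The sketch below reads all the captures from a single match. It adapts the pattern slightly, using \r?\n instead of \r\n so that both Windows and Unix line endings are accepted; the class name BlockCaptures and the method Extract are illustrative only.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper showing how to read every capture of a repeated group.
public static class BlockCaptures
{
    // Adapted from the pattern above: \r?\n accepts "\r\n" as well as "\n".
    private static readonly Regex BlockPattern = new Regex(
        @"begin block Block_test\r?\n(  (?<image>\w*) (.*\r?\n))*end block");

    // Returns every value the "image" group captured, one per line of the block.
    public static List<string> Extract(string text)
    {
        var names = new List<string>();
        Match match = BlockPattern.Match(text);
        if (!match.Success)
        {
            return names;
        }

        // Groups["image"].Value would be the last iteration only;
        // Captures holds all iterations in order.
        foreach (Capture c in match.Groups["image"].Captures)
        {
            names.Add(c.Value);
        }
        return names;
    }
}
```

This keeps everything in one regex, as the question asked, and moves the per-line work into a loop over the captures.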

Try this:

begin block Block_test(?'body'.*?)end block

This captures the block's contents in the named group 'body', but remember to specify RegexOptions.Singleline. Even with the Singleline option you can use:

begin block Block_test(\s+\S+\s+\S+\s\d)+

to get a capture for each line of the block.

I would split this task. What about this:

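using System;
using System.Text.RegularExpressions;

// Sample input; "begin" is lowercase to match the text in the question.
String Block = "begin block Block_test\n" +
" LowFlight_005_001  strip_id 5\n" +
" LowFlight_005_002  strip_id 5\n" +
" LowFlight_006_005  strip_id 6\n" +
" LowFlight_006_004  strip_id 6\n" +
" LowFlight_006_003  strip_id 6\n" +
" LowFlight_006_002  strip_id 6\n" +
" LowFlight_006_001  strip_id 6\n" +
"end block";

// Split into lines, accepting both \n and \r\n line endings.
String[] lines = Regex.Split(Block, @"[\r\n]+");
// Leading whitespace, the image name (lazy), whitespace, then the rest of the line.
Regex reg = new Regex(@"^\s*(?<image>.*?)\s+(.*?$)");

foreach (String item in lines) {
    // Skip the "begin block" and "end block" marker lines, whatever their case.
    if (!(item.StartsWith("begin", StringComparison.OrdinalIgnoreCase) ||
          item.StartsWith("end", StringComparison.OrdinalIgnoreCase))) {
        Console.WriteLine(item);
        Match result = reg.Match(item);
        Console.WriteLine(result.Groups["image"].Value);
    }
}
Console.ReadLine();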
