Multiline Group Regex c#

I've searched and searched, but I can't work out what I'm doing wrong. I am trying to obtain the "image name" from each line of a "block" in a text file using Regex in C#. Here's what the text looks like:

begin block Block_test
  LowFlight_005_001  strip_id 5
  LowFlight_005_002  strip_id 5
  LowFlight_006_005  strip_id 6
  LowFlight_006_004  strip_id 6
  LowFlight_006_003  strip_id 6
  LowFlight_006_002  strip_id 6
  LowFlight_006_001  strip_id 6
  LowFlight_007_001  strip_id 7
  LowFlight_007_002  strip_id 7
  LowFlight_007_003  strip_id 7
  LowFlight_007_004  strip_id 7
  LowFlight_007_005  strip_id 7
  LowFlight_007_011  strip_id 7
  LowFlight_007_012  strip_id 7
  LowFlight_007_013  strip_id 7
  LowFlight_007_014  strip_id 7
end block

I'm using this regex:

begin block Block_test\n(  (?<image>.*?)  (.*?\n))*end block

BUT! The named group image is always just the last image, i.e. LowFlight_007_014. How do I select the image from each line? I've tried using the multiline flag and inserting line starts and ends like so:

begin block Block_test\n(^  (?<image>.*?)  (.*?$\n))*end block

That doesn't help. Help me, regex wizards! I created an account just for this. Of course I can grab the whole list of images, split on newline and then clean the string array, but I would love to do it all in regex, for SCIENCE!

If you don't need to worry about other lines in the file, or multiple blocks, the simplest regex I can think of will be:

new Regex(@"  (?<image>\w*)  ");

which will produce multiple matches (for example via Regex.Matches), each with one "image" group.
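As a rough sketch of how that simple pattern could be used, the helper below runs it through Regex.Matches and collects the "image" group of each match. The class and method names (SimpleImageMatcher, FindImages) are made up for illustration, and it assumes image names are separated from the rest of the line by exactly two spaces on each side, as in the sample data.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class SimpleImageMatcher
{
    // Pattern from the answer: two spaces, the image name, two spaces.
    private static readonly Regex ImagePattern = new Regex(@"  (?<image>\w*)  ");

    // Returns the "image" group of every match, in input order.
    public static List<string> FindImages(string text)
    {
        var images = new List<string>();
        foreach (Match match in ImagePattern.Matches(text))
        {
            images.Add(match.Groups["image"].Value);
        }
        return images;
    }
}
```

Calling SimpleImageMatcher.FindImages on the text of a block should give one entry per image line. Note that it pays no attention to the begin/end lines, so it would also pick up any other text in the file that happens to sit between double spaces.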

If you have to think about multiple blocks in one input, I suspect you will need to use multiple regexes, to split into blocks, and then find images.
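A minimal sketch of that two-regex approach might look like the following. It assumes block names consist of word characters and that every image line contains "strip_id", as in the sample; the names BlockImageReader and ReadBlocks are invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class BlockImageReader
{
    // Stage 1: one match per block. Singleline lets "." cross line breaks,
    // and the lazy ".*?" stops at the first "end block".
    private static readonly Regex BlockPattern = new Regex(
        @"begin block (?<name>\w+)\r?\n(?<body>.*?)end block",
        RegexOptions.Singleline);

    // Stage 2: one match per image line inside a block body.
    // Multiline makes "^" match at the start of every line.
    private static readonly Regex ImagePattern = new Regex(
        @"^[ \t]+(?<image>\w+)[ \t]+strip_id",
        RegexOptions.Multiline);

    // Maps each block name to its image names, in file order.
    // If two blocks share a name, the later one replaces the earlier one.
    public static Dictionary<string, List<string>> ReadBlocks(string text)
    {
        var result = new Dictionary<string, List<string>>();
        foreach (Match block in BlockPattern.Matches(text))
        {
            var images = new List<string>();
            foreach (Match line in ImagePattern.Matches(block.Groups["body"].Value))
            {
                images.Add(line.Groups["image"].Value);
            }
            result[block.Groups["name"].Value] = images;
        }
        return result;
    }
}
```

Keeping the two stages separate means each pattern stays short, and an image is only reported if it sits between a "begin block" line and the next "end block".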

If you need to find only the images from lines within blocks, then your answer seems to be in your comment:

begin block Block_test\r\n(  (?<image>.*?) (.*\r\n))*end block 

although you might consider using \w instead of .*? for the image name:

begin block Block_test\r\n(  (?<image>\w*) (.*\r\n))*end block 
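One thing worth spelling out: in .NET, when a group sits inside a repeated part of the pattern, Groups["image"].Value only holds the last capture, which is why the question always sees LowFlight_007_014. The earlier captures are still available through the group's Captures collection. Here is a small sketch of reading them. The class and method names are illustrative, and the pattern is a slightly adjusted variant of the one above (\r?\n so it accepts either line ending, and a non-capturing outer group).

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class BlockCaptures
{
    // One match for the whole block; the "image" group is captured once
    // per line because it sits inside the repeated (?: ... )* group.
    private static readonly Regex BlockPattern = new Regex(
        @"begin block Block_test\r?\n(?:  (?<image>\w+)  .*\r?\n)*end block");

    // Returns every capture of "image", not just the last one.
    public static List<string> ImagesInBlock(string text)
    {
        var images = new List<string>();
        Match match = BlockPattern.Match(text);
        if (!match.Success)
        {
            return images; // no block found
        }
        foreach (Capture capture in match.Groups["image"].Captures)
        {
            images.Add(capture.Value);
        }
        return images;
    }
}
```

With this, the whole job stays in a single regex: one Match call, then a loop over Captures.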

Try this:

begin block Block_test(?'body'.*?)end block

That captures the text in the named group 'body', but remember to specify RegexOptions.Singleline so that . also matches newlines. With or without the Singleline option, you can also use:

begin block Block_test(\s+\S+\s+\S+\s\d)+

to get one capture per line of the block.

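I would split this task up. What about this:

using System;
using System.Text.RegularExpressions;

class Program {
    static void Main() {
        string block = "Begin block Block_test\n" +
            " LowFlight_005_001  strip_id 5\n" +
            " LowFlight_005_002  strip_id 5\n" +
            " LowFlight_006_005  strip_id 6\n" +
            " LowFlight_006_004  strip_id 6\n" +
            " LowFlight_006_003  strip_id 6\n" +
            " LowFlight_006_002  strip_id 6\n" +
            " LowFlight_006_001  strip_id 6\n" +
            "end block";

        // Split into lines, treating any run of CR/LF characters as one separator.
        string[] lines = Regex.Split(block, @"[\r\n]+");
        // The image name is the first token on the line; the rest goes into group 1.
        Regex reg = new Regex(@"^\s*(?<image>.*?)\s+(.*?$)");

        foreach (string item in lines) {
            // Skip the "Begin block ..." and "end block" delimiter lines.
            if (!(item.StartsWith("Begin") || item.StartsWith("end"))) {
                Console.WriteLine(item);
                Match result = reg.Match(item);
                Console.WriteLine(result.Groups["image"]);
            }
        }
        Console.ReadLine(); // keep the console window open
    }
}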


 