简体   繁体   中英

Extract matching lines between two other patterns

I am trying to use regexes in C++ to extracts lines that match a certain word—from within regions in a file bounded by two other patterns. I also want to print the line number of each match.

I am currently running a perl command using popen , but I would like to do it with C++:

perl -ne 'if ((/START/ .. /END/) && /test/) {print "line$.:$_"}' file

This command finds regions between START and END and then from those extracts lines containing the word test .

How do I do this with regexes in C++?

The semantics of Perl's .. are subtle. The code below emulates both .. and the while (<>) { ... } implied by the -n switch to perl .

#include <fstream>
#include <iostream>
#include <regex>
#include <vector>

// emulate Perl's .. operator
void flipflop(bool& inside, const std::regex& start, const std::regex& end, const std::string& str)
{
  if (!inside && std::regex_match(str, start))
    inside = true;
  else if (inside && std::regex_match(str, end))
    inside = false;
}

int main(int argc, char *argv[])
{
  // extra .* wrappers to use regex_match in order to work around
  // problems with regex_search in GNU libstdc++
  std::regex start(".*START.*"), end(".*END.*"), match(".*test.*");

  for (const auto& path : std::vector<std::string>(argv + 1, argv + argc)) {
    std::ifstream in(path);
    std::string str;
    bool inside = false;
    int line = 0;
    while (std::getline(in, str)) {
      ++line;
      flipflop(inside, start, end, str);
      if (inside && std::regex_match(str, match))
        std::cout << path << ':' << line << ": " << str << '\n';

      // Perl's .. becomes false AFTER the rhs goes false,
      // so keep this last to allow match to succeed on the
      // same line as end
      flipflop(inside, start, end, str);
    }
  }

  return 0;
}

For example, consider the following input.

test ERROR 1
START
test
END
test ERROR 2
START
foo ERROR 3
bar ERROR 4
test 1
baz ERROR 5
END
test ERROR 6
START sldkfjsdflkjsdflk
test 2
END
lksdjfdslkfj
START
dslfkjs
sdflksj
test 3
END dslkfjdsf

Sample runs:

$ ./extract.exe file
file:3: test
file:9: test 1
file:14: test 2
file:20: test 3

$ ./extract.exe file file
file:3: test
file:9: test 1
file:14: test 2
file:20: test 3
file:3: test
file:9: test 1
file:14: test 2
file:20: test 3

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM