
Match regex multi-line while reading the file line by line

I have a file that I read line by line and parse with a regex, which works fine. What I'm parsing (and capturing) is function names and their parameters. In some files the function parameters are spread over multiple lines, like this:

result = f(a, b,
c,
d
);

If it were written as result = f(a, b, c, d); my regex would work fine. How should I deal with the multi-line case?

While waiting for a minimal complete example, you might do the following:

  • read a line
  • if the line ends with a ; (or a ), depending on what you want), pass the line to the regular expression directly
  • if not, keep reading lines and append them together (without the newline characters, of course) until a ; or ) is found
  • pass the string resulting from all the appending to the regular expression
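The steps above could be sketched roughly as follows. This is only an illustration, not the asker's actual code: the Call struct, the parse_calls function, and the regex are all assumptions, and the pattern only covers simple statements of the form "lhs = name(args);" without nested parentheses.

```cpp
#include <istream>
#include <regex>
#include <string>
#include <vector>

// One parsed call: the function name and its raw argument text.
// (Hypothetical type, introduced only for this sketch.)
struct Call {
    std::string name;
    std::string args;
};

// Read `in` line by line, appending lines until the accumulated statement
// ends in ';', then match the whole statement in one go.
std::vector<Call> parse_calls(std::istream& in) {
    // Assumed pattern: "<lhs> = <name>(<args>);" with no nested parentheses.
    static const std::regex call_re(R"(\w+\s*=\s*(\w+)\s*\(([^()]*)\)\s*;)");

    std::vector<Call> calls;
    std::string line;
    std::string statement;
    while (std::getline(in, line)) {
        statement += line;  // getline has already dropped the '\n'

        // The statement is still open unless its last non-blank char is ';'.
        const std::size_t last = statement.find_last_not_of(" \t\r");
        if (last == std::string::npos || statement[last] != ';') {
            continue;
        }

        std::smatch m;
        if (std::regex_search(statement, m, call_re)) {
            calls.push_back({m[1].str(), m[2].str()});
        }
        statement.clear();  // start collecting the next statement
    }
    return calls;
}
```

Calling parse_calls on a std::ifstream (or a std::istringstream for testing) returns one entry per statement, whether the call was written on one line or several.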

Currently you are reading until '\n'. Instead, read until ';':

#include <iostream>
#include <sstream>
#include <string>

int main() {
    std::stringstream file("result = f(a, b,\nc,\nd\n);\nresult = f2(a, b,\nc,\nd\n);");

    std::string function;
    while (std::getline(file, function, ';')) {
        std::cout << "function: " << (function[0] == '\n' ? "" : "\n") << function << std::endl;
    }
    return 0;
}

Additionally you can reduce each function to one line:

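#include <iostream>
#include <sstream>
#include <string>

// Strip leading and trailing line breaks; returns "" if s contains nothing else.
std::string trim(const std::string& s) {
    const std::size_t b = s.find_first_not_of('\n');
    if (b == std::string::npos) {
        return "";
    }
    const std::size_t e = s.find_last_not_of('\n');
    return s.substr(b, e - b + 1);
}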

std::string& join(std::string& s) {
    std::size_t pos(s.find('\n'));
    while (pos != s.npos) {
        s.replace(pos, 1, "");
        pos = s.find('\n', pos); // continue looking from current location
    }
    return s;
}

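int main() {
    std::stringstream file("result = f(a, b,\nc,\nd\n);\nresult = f2(a, b,\nc,\nd\n);");

    std::string function;
    while (std::getline(file, function, ';')) {
        if (file.eof()) {
            break; // trailing text without a terminating ';' is not a statement
        }
        function += ';'; // getline consumed the delimiter, so put it back
        std::cout << "function: " << trim(join(function)) << std::endl;
    }
    return 0;
}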

trim removes the line breaks before and after each function call; join removes every line break inside it.

Input:

result = f(a, b,
c,
d
);
result = f2(a, b,
c,
d
);

Output:

function: result = f(a, b,c,d);
function: result = f2(a, b,c,d);
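Once each statement is on a single line like this, a single-line regex can be applied to it. The following is a minimal sketch under assumptions: the ParsedCall struct, the parse_call function, and the pattern are hypothetical (the question does not show its regex), and parameters containing commas or nested parentheses are not handled.

```cpp
#include <regex>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical result type for this sketch.
struct ParsedCall {
    std::string name;
    std::vector<std::string> params;
};

// Returns true and fills `out` if `statement` looks like
// "<lhs> = <name>(<params>);" (assumed shape, no nested parentheses).
bool parse_call(const std::string& statement, ParsedCall& out) {
    static const std::regex call_re(R"(\w+\s*=\s*(\w+)\s*\(([^()]*)\)\s*;)");

    std::smatch m;
    if (!std::regex_search(statement, m, call_re)) {
        return false;
    }
    out.name = m[1].str();
    out.params.clear();

    // Split the captured argument text on ',' and trim blanks around each piece.
    std::istringstream args(m[2].str());
    std::string param;
    while (std::getline(args, param, ',')) {
        const std::size_t b = param.find_first_not_of(" \t");
        if (b == std::string::npos) {
            continue;  // skip empty pieces, e.g. for "f()"
        }
        const std::size_t e = param.find_last_not_of(" \t");
        out.params.push_back(param.substr(b, e - b + 1));
    }
    return true;
}
```

Splitting the parameter list reuses the same std::getline-with-delimiter idea as the answer, only with ',' instead of ';'.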
