
How to find everything except what a regular expression matches?

The bottom line is that I need to find all the comments in some Python code and cut them out, leaving only the code itself. But I can't do it the other way around. That is, I can find the comments themselves, but I cannot find everything except them.

I tried using "?!" and made up a regular expression like "(.*)(?!#.*)", but it does not work as I expected. In the code I attached there is also an "else" branch that I tried to use, that is, to write the two parts to different variables, but for some reason nothing ends up there.

#include <iostream>
#include <fstream>
#include <string>
#include <regex>

int main()
{
    std::string line;
    std::string new_line;
    std::string result;
    std::string result_re;
    std::string path;
    std::smatch match;
    std::regex re("(#.*)");
    std::cout << "Enter the path\n";
    std::cin >> path;
    std::ifstream in(path);
    if (in.is_open())
    {
        while (getline(in, line))
        {
            if (std::regex_search(line, match, re))
            {
                for (int i = 0; i < match.size(); i++)
                    result_re += match[i + 1];
                    result_re += "\n";
            } 
            else
            {
                for (int i = 0; i < match.size(); i++)
                    result += match[i];
                    //result += "\n";
            }
            std::cout << line << std::endl;
        }
    }
    in.close();


    std::cout << result_re << std::endl;
    std::cout << "End of program" << std::endl;
    std::cout << result << std::endl;
    system("pause");
    return 0;
}

As I said above, I want to get everything except the comments, and not the other way around. I also need to find multi-line comments, which are written as """Text""". But with this implementation I can't even imagine how to do it: the program reads the file line by line, so I don't see how a regular expression could capture a comment that spans several lines.

I would be grateful for your advice and help.

1. Don't try to parse your input file line by line. Instead, read in the whole text and let regex replace all the comments. That way your entire program would look like this:

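#include <iostream>
#include <string>
#include <fstream>
#include <iterator>
#include <regex>

using namespace std;    // for brevity

int main() {

 cout << "Enter the path: ";

 string filename;
 getline(cin, filename);

 ifstream in{filename};
 if (!in) {
  cerr << "Cannot open " << filename << endl;
  return 1;
 }

 // Read the whole file, whitespace included, into one string.
 string pprg(istreambuf_iterator<char>{in}, istreambuf_iterator<char>{});

 // '.' does not match a newline, so each match stops at the end of its line.
 // Note: a '#' inside a Python string literal is removed as well.
 pprg = regex_replace(pprg, regex{"#.*"}, "");
 cout << pprg << endl;
}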
2. Handling multi-line Python literals """...""" with C++ regex is much harder (unlike the example above), because there are a few mutually exclusive requirements (imho):
    • the regex should be extended POSIX, but
    • POSIX regex does not support empty regex matches, however
    • crafting an RE that matches a negated sequence of characters requires a negative look-ahead assertion, which is an empty match :(

So you would need to write some programming logic of your own to remove multi-line Python text literals.
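
One possible way around this, assuming the default ECMAScript grammar of std::regex is acceptable instead of extended POSIX, is a lazy quantifier rather than a look-ahead. The sketch below is not part of the original answer: the helper name strip_python_comments is hypothetical, and it also treats '''...''' blocks as comments, which is an added assumption.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not from the original answer): removes '#' comments
// and triple-quoted blocks from Python source held in a single string.
std::string strip_python_comments(const std::string& source)
{
    // Alternatives are tried left to right at each position:
    //   """ ... """ and ''' ... ''' : [\s\S]*? lazily matches any characters,
    //                                  newlines included, up to the nearest
    //                                  closing quotes
    //   #.*                         : a '#' and the rest of that line
    static const std::regex comment_re(
        R"re("""[\s\S]*?"""|'''[\s\S]*?'''|#.*)re");

    // Every match is replaced with an empty string; newlines that follow
    // a removed comment are kept.
    return std::regex_replace(source, comment_re, "");
}
```

Calling strip_python_comments(pprg) on the string read in the program above would take the place of the single regex_replace call. The sketch has known limits: it has no notion of ordinary string literals, so a '#' inside "..." is still cut, and a triple-quoted string that is used as a real value (for example assigned to a variable) is removed as well. Handling those cases reliably needs a small hand-written scanner rather than a single regex.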
