
Parsing a C++ source file after preprocessing

I am trying to analyze C++ files using a custom parser (also written in C++). Before parsing, I would like to get rid of all #define macros, and I want the source file to still be compilable after preprocessing. So the best way seems to be to run the C preprocessor on the file:

cpp myfile.cpp temp.cpp
// or
g++ -E myfile.cpp > temp.cpp

[New suggestions are welcome.]

But this loses the original lines and their line numbers, because the output also contains the contents of every included header, and I want to retain the line numbers. So the workaround I have settled on is:

  1. Add a special symbol before every line in the source file (except preprocessor directives)
  2. Run the preprocessor
  3. Extract the lines with that special symbol and analyze them
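
Step 1 of the list above could be sketched roughly as follows. This is only an illustration under some assumptions: the function names isDirective and tagLines are hypothetical, the marker has the form @N@ with N being the 1-based line number, and a directive continued with a trailing backslash is treated as part of the directive and left untagged.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper: true if the first non-blank character is '#',
// i.e. the line starts a preprocessor directive.
bool isDirective(const std::string& line) {
    std::size_t pos = line.find_first_not_of(" \t");
    return pos != std::string::npos && line[pos] == '#';
}

// Hypothetical helper: prefixes every non-directive line with "@N@",
// where N is the 1-based line number in the original source.
// Directive lines and their backslash-continued lines are copied as-is.
std::string tagLines(const std::string& source) {
    std::istringstream in(source);
    std::ostringstream out;
    std::string line;
    std::size_t number = 0;
    bool inContinuation = false;  // previous directive line ended with '\'
    while (std::getline(in, line)) {
        ++number;
        if (inContinuation || isDirective(line)) {
            out << line << '\n';
            inContinuation = !line.empty() && line.back() == '\\';
        } else {
            out << '@' << number << '@' << line << '\n';
        }
    }
    return out.str();
}
```

The sketch does not try to handle every corner case. For instance, markers placed on lines inside a block comment disappear together with the comment, and a marker inserted in the middle of a multi-line string literal or macro call would become part of that construct.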

For example, a typical source file will look like:

#include<iostream>
#include"xyz.h"
int x;    
#define SOME value
/*
**  This is a test file
*/
typedef char* cp;

void myFunc (int* i, ABC<int, X<double> > o)
{
  //...
}

class B {
};

After adding the symbol, it will look like this:

#include<iostream>
#include"xyz.h"
@3@int x;    
#define SOME value
@5@/*
@6@**  This is a test file
@7@*/
@8@typedef char* cp;
@9@
@10@void myFunc (int* i, ABC<int, X<double> > o)
@11@{
@12@  //...
@13@}
@14@
@15@class B {
@16@};

Once all the macros and comments are removed, I will be left with thousands of lines, of which a few hundred will be the original source code.
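
Picking the tagged lines back out of the preprocessed output might look something like the sketch below. The function name splitMarker is hypothetical, and the code assumes the preprocessor leaves the @N@ marker intact at the very start of the line, which should be checked against the actual output.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: if `text` starts with a marker of the form "@N@",
// stores N in `number` and the remainder of the line in `rest`, then
// returns true. Returns false (leaving the outputs untouched) otherwise.
bool splitMarker(const std::string& text, std::size_t& number, std::string& rest) {
    if (text.size() < 3 || text[0] != '@') return false;
    std::size_t pos = 1;
    std::size_t value = 0;
    while (pos < text.size() &&
           std::isdigit(static_cast<unsigned char>(text[pos]))) {
        value = value * 10 + static_cast<std::size_t>(text[pos] - '0');
        ++pos;
    }
    // Require at least one digit and a closing '@'.
    if (pos == 1 || pos >= text.size() || text[pos] != '@') return false;
    number = value;
    rest = text.substr(pos + 1);
    return true;
}
```

Lines for which splitMarker returns false would be header content or other preprocessor output and can be skipped.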

Is this approach correct? Am I missing any corner case?

You realize that g++ -E adds some of its own lines to its output which indicate line numbers in the original file? You'll find lines like

# 2 "foo.cc" 2

which indicate that you're looking at line 2 of file foo.cc. These lines are inserted whenever the regular sequence of lines is disrupted.
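
As a rough sketch of how those lines could be used instead of custom symbols, the following C++ reads the preprocessed text, tracks the current file and line from each marker, and keeps only the lines that belong to one file. The function names parseLinemarker and linesFromFile are hypothetical, and the parsing assumes the simple form # <number> "<file>" [flags] with no escaped quotes in the file name.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: parses a marker such as  # 2 "foo.cc" 2
// On success stores the line number and file name and returns true.
bool parseLinemarker(const std::string& text, std::size_t& line, std::string& file) {
    std::istringstream in(text);
    char hash = 0;
    std::size_t number = 0;
    if (!(in >> hash) || hash != '#') return false;
    if (!(in >> number)) return false;  // e.g. "#pragma" is not a marker
    std::string rest;
    std::getline(in, rest);
    std::size_t first = rest.find('"');
    std::size_t last = rest.rfind('"');
    if (first == std::string::npos || last == first) return false;
    line = number;
    file = rest.substr(first + 1, last - first - 1);
    return true;
}

// Hypothetical helper: returns (original line number, text) pairs for
// every output line that the markers attribute to `target`.
std::vector<std::pair<std::size_t, std::string>>
linesFromFile(const std::string& preprocessed, const std::string& target) {
    std::vector<std::pair<std::size_t, std::string>> result;
    std::istringstream in(preprocessed);
    std::string text, currentFile;
    std::size_t currentLine = 0;
    while (std::getline(in, text)) {
        std::size_t markedLine = 0;
        std::string markedFile;
        if (parseLinemarker(text, markedLine, markedFile)) {
            currentFile = markedFile;
            currentLine = markedLine;  // number of the line that follows
            continue;
        }
        if (currentFile == target) result.emplace_back(currentLine, text);
        ++currentLine;
    }
    return result;
}
```

With this approach the source file does not need to be modified before preprocessing; the file name passed as target has to match the name exactly as it appears in the markers.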

The imake program that used to come with X11 sources used a faintly similar system, marking the ends of lines with @@ so that it could post-process them properly.

The output from gcc -E usually includes line markers of the form # linenum "filename" (they play the same role as #line directives); you could perhaps use those instead of your symbols.
