I am trying to analyze C++ files using my custom-made parser (written in C++). Before parsing, I would like to get rid of all the #define directives. I want the source file to be compilable after preprocessing, so the best way seems to be to run the C preprocessor on the file:
cpp myfile.cpp temp.cpp
// or
g++ -E myfile.cpp > temp.cpp
[New suggestions are welcome.]
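However, the preprocessed file also contains everything pulled in from the headers, so the original lines and their line numbers are lost, and I want to retain the line numbers. The workaround I have decided on is to prefix every line that is not a preprocessor directive with a symbol carrying its line number, before running the preprocessor.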
For example, a typical source file will look like:
#include<iostream>
#include"xyz.h"
int x;
#define SOME value
/*
** This is a test file
*/
typedef char* cp;

void myFunc (int* i, ABC<int, X<double> > o)
{
//...
}

class B {
};
After adding the symbols it will look like this:
#include<iostream>
#include"xyz.h"
@3@int x;
#define SOME value
@5@/*
@6@** This is a test file
@7@*/
@8@typedef char* cp;
@9@
@10@void myFunc (int* i, ABC<int, X<double> > o)
@11@{
@12@ //...
@13@}
@14@
@15@class B {
@16@};
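The annotation step shown above could be sketched roughly as follows. This is only a minimal illustration, not a complete tool: the names `isDirective`, `annotate` and `annotateText` are made up for this example, a line is treated as a directive simply because its first non-blank character is `#` (or because it continues a directive that ended with a backslash), and no attempt is made to track comments or string literals that happen to contain such lines.

```cpp
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// True when the first non-blank character of the line is '#',
// i.e. the line looks like the start of a preprocessor directive.
bool isDirective(const std::string& line) {
    std::size_t pos = line.find_first_not_of(" \t");
    return pos != std::string::npos && line[pos] == '#';
}

// Copy `in` to `out`, prefixing every non-directive line with @N@,
// where N is the 1-based line number in the input.
void annotate(std::istream& in, std::ostream& out) {
    std::string line;
    std::size_t n = 0;
    bool continued = false;  // previous directive line ended with '\'
    while (std::getline(in, line)) {
        ++n;
        if (continued || isDirective(line)) {
            // Leave directives (and their continuation lines) untouched.
            out << line << '\n';
            continued = !line.empty() && line.back() == '\\';
        } else {
            out << '@' << n << '@' << line << '\n';
            continued = false;
        }
    }
}

// Convenience wrapper for working on in-memory text.
std::string annotateText(const std::string& source) {
    std::istringstream in(source);
    std::ostringstream out;
    annotate(in, out);
    return out.str();
}
```

Continuation lines of a multi-line #define are deliberately left unmarked here, because a marker inside a macro body would be copied into every expansion of that macro.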
Once all the macros and comments are removed, I will be left with thousands of lines, of which only a few hundred will be the original source code.
Is this approach correct? Am I missing any corner case?
You realize that g++ -E adds some of its own lines to its output which indicate line numbers in the original file? You'll find lines like
# 2 "foo.cc" 2
which indicate that you're looking at line 2 of file foo.cc. These lines are inserted whenever the regular sequence of lines is disrupted.
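As a rough sketch of how such a line could be picked apart, the function below splits a marker of the form `# linenum "filename" flags` into its parts. The names `LineMarker` and `parseLineMarker` are invented for this example, and the parsing is deliberately simple: it assumes the file name contains no escaped quote characters. In GCC's output the trailing flags are small integers (1 for entering a new file, 2 for returning to a file, 3 for text from a system header, 4 for text to be treated as wrapped in extern "C").

```cpp
#include <sstream>
#include <string>
#include <vector>

// The pieces of one marker line such as:  # 2 "foo.cc" 2
struct LineMarker {
    int line = 0;            // line number in the original file
    std::string file;        // original file name
    std::vector<int> flags;  // optional trailing flags
};

// Returns true and fills `marker` if `text` looks like a line marker;
// returns false for ordinary code and for other '#' lines.
bool parseLineMarker(const std::string& text, LineMarker& marker) {
    std::istringstream in(text);
    LineMarker parsed;
    char hash = 0;
    char quote = 0;

    if (!(in >> hash) || hash != '#') return false;    // must start with '#'
    if (!(in >> parsed.line)) return false;            // then a line number
    if (!(in >> quote) || quote != '"') return false;  // then a quoted name
    if (!std::getline(in, parsed.file, '"') || in.eof()) return false;

    int flag = 0;
    while (in >> flag) parsed.flags.push_back(flag);   // any trailing flags

    marker = parsed;
    return true;
}
```

For the marker quoted above, this yields line 2, file foo.cc and a single flag 2, meaning the preprocessor has just returned to foo.cc after an included file.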
The imake program that used to come with the X11 sources used a faintly similar system, marking the ends of lines with @@ so that it could post-process them properly.
The output from gcc -E usually includes line markers, a compact form of #line directives written as # linenum "filename" flags; you could perhaps use those instead of your symbols.
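A minimal sketch of that idea: read the preprocessed text, follow the markers, and keep only the lines that came from one particular file together with their original line numbers. The names `SourceLine` and `linesFromFile` are hypothetical, and the sketch assumes the file name in the markers matches the name passed on the command line (for example myfile.cpp) and contains no quote characters. The pattern accepts both the `# N "file"` form and the spelled-out `#line N "file"` form.

```cpp
#include <istream>
#include <regex>
#include <sstream>
#include <string>
#include <vector>

// One line of the original file, recovered from preprocessor output.
struct SourceLine {
    int number;        // line number in the original file
    std::string text;  // text of the line after preprocessing
};

// Collect the lines of `preprocessed` that originate from file `wanted`.
std::vector<SourceLine> linesFromFile(std::istream& preprocessed,
                                      const std::string& wanted) {
    // Matches:  # 12 "file" ...   and   #line 12 "file"
    static const std::regex marker(
        R"re(^\s*#\s*(?:line\s+)?(\d+)\s+"([^"]*)")re");

    std::vector<SourceLine> result;
    std::string line;
    std::string currentFile;
    int currentLine = 0;

    while (std::getline(preprocessed, line)) {
        std::smatch m;
        if (std::regex_search(line, m, marker)) {
            // A marker says where the NEXT line comes from.
            currentLine = std::stoi(m[1].str());
            currentFile = m[2].str();
            continue;
        }
        if (currentFile == wanted)
            result.push_back({currentLine, line});
        ++currentLine;  // ordinary lines advance the original line number
    }
    return result;
}
```

With this kind of filter the source file does not need to be modified before preprocessing, since the line numbers come from the preprocessor itself.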