How can I parse macros in C++ code, using Clang as the parser and Python as the scripting language?

If I have the following macro in some C++ code:

_Foo(arg1, arg2)

I would like to use Python to find me all the instances and extents of that macro using Clang and the Python bindings provided with cindex.py. I do not want to use a regular expression from Python on the code directly because that gets me 99% of the way there, but not 100%. It appears to me that to get to 100%, you need to use a real C++ parser like Clang to handle all the cases where people do silly things that are syntactically correct and compile, but don't make sense to a regular expression. I need to handle 100% of the cases and since we use Clang as one of our compilers, it makes sense to use it as the parser for this task as well.

Given the following Python code, I am able to find what appear to be predefined types that the Clang Python bindings know about, but not macros:

import sys
import clang.cindex

def find_typerefs(node):
    # Cursor_ref is the module-level helper in older cindex.py;
    # newer bindings expose the same lookup as node.referenced.
    ref_node = clang.cindex.Cursor_ref(node)
    if ref_node:
        print('Found %s Type %s DATA %s Extent %s [line=%s, col=%s]' % (
            ref_node.spelling, ref_node.kind, node.data, node.extent,
            node.location.line, node.location.column))

    # Recurse for children of this node
    for c in node.get_children():
        find_typerefs(c)

index = clang.cindex.Index.create()
tu = index.parse(sys.argv[1])
find_typerefs(tu.cursor)

What I think I am looking for is a way to search the raw AST for the name of my macro, _Foo(), but I am not sure. Can someone provide some code that will let me pass in the name of a macro and get back its extent or data from Clang?

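You need to pass the appropriate options flag to Index.parse. By default libclang does not keep macro definitions and expansions in the translation unit; PARSE_DETAILED_PROCESSING_RECORD tells it to record them so they show up as cursors: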

tu = index.parse(sys.argv[1], options=clang.cindex.TranslationUnit.PARSE_DETAILED_PROCESSING_RECORD)

The rest of the cursor visitor could look like this:

def visit(node):
    # Macro cursors only appear when parsing with PARSE_DETAILED_PROCESSING_RECORD.
    if node.kind in (clang.cindex.CursorKind.MACRO_INSTANTIATION,
                     clang.cindex.CursorKind.MACRO_DEFINITION):
        print('Found %s Type %s DATA %s Extent %s [line=%s, col=%s]' % (
            node.displayname, node.kind, node.data, node.extent,
            node.location.line, node.location.column))
    for c in node.get_children():
        visit(c)
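
To get closer to what the question asks for (pass in a macro name, get back the extents), the visitor can be narrowed to expansions of a single name. The following is only a minimal sketch, assuming the clang Python bindings and a matching libclang are installed. find_macro_uses and parse_and_find are hypothetical helper names, not part of cindex.py. The walker compares node.kind.name against a string so that it only depends on the cursor attributes it reads (kind, spelling, extent, get_children()).

```python
def find_macro_uses(node, macro_name, found=None):
    """Collect (start_line, start_col, end_line, end_col) for each expansion of macro_name."""
    if found is None:
        found = []
    # Compare by kind name so the walker relies on duck typing only:
    # any object with .kind.name, .spelling, .extent and get_children() works.
    if node.kind.name == 'MACRO_INSTANTIATION' and node.spelling == macro_name:
        ext = node.extent
        found.append((ext.start.line, ext.start.column,
                      ext.end.line, ext.end.column))
    for child in node.get_children():
        find_macro_uses(child, macro_name, found)
    return found


def parse_and_find(path, macro_name):
    """Hypothetical wrapper: parse `path` with macro records enabled, then search."""
    import clang.cindex  # imported lazily; needs the clang bindings and libclang
    index = clang.cindex.Index.create()
    tu = index.parse(
        path,
        options=clang.cindex.TranslationUnit.PARSE_DETAILED_PROCESSING_RECORD)
    return find_macro_uses(tu.cursor, macro_name)
```

A call such as parse_and_find(sys.argv[1], '_Foo') would then return a list of tuples, one per expansion. Expansions that come from included headers are reported as well, so it may be worth checking node.location.file if only the main file matters.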

I once wrote a script to pretty-print the whole AST you get from libclang, in order to see which information can be found where.

Here it is: https://gist.github.com/2503232
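
The linked script is not reproduced here, but the general idea of dumping the tree can be sketched in a few lines. This is an illustration only, not the code from the gist: dump_ast is a hypothetical name, and the function assumes cursors with the usual kind, spelling, location and get_children() attributes.

```python
def dump_ast(node, depth=0, lines=None):
    """Return one indented 'KIND spelling [line:col]' string per cursor, depth first."""
    if lines is None:
        lines = []
    loc = node.location
    lines.append('%s%s %s [%s:%s]' % (
        '  ' * depth,                      # two spaces per nesting level
        node.kind.name,
        node.spelling or '<anonymous>',    # some cursors have an empty spelling
        loc.line, loc.column))
    for child in node.get_children():
        dump_ast(child, depth + 1, lines)
    return lines
```

With a parsed translation unit tu, print('\n'.join(dump_ast(tu.cursor))) would print the whole tree, which makes it easy to see where the macro cursors end up.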
