How can I parse macros in C++ code, using Clang as the parser and Python as the scripting language?

If I have the following macro in some C++ code:

_Foo(arg1, arg2)

I would like to use Python to find all the instances and extents of that macro, using Clang and the Python bindings provided by cindex.py. I do not want to run a regular expression over the code directly from Python, because that gets me 99% of the way there but not 100%. It appears to me that to get to 100% you need a real C++ parser like Clang to handle all the cases where people do silly things that are syntactically correct and compile, but that a regular expression cannot make sense of. I need to handle 100% of the cases, and since we already use Clang as one of our compilers, it makes sense to use it as the parser for this task as well.
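
To make the 99% versus 100% point concrete, here is a small sketch of a naive regular-expression search. The sample source string, the pattern, and the helper name `regex_hits` are all made up for illustration; the only point is that a textual match cannot tell a real expansion from a mention inside a comment or a string literal.

```python
import re

# Hypothetical C++ snippet: only the last line is a real expansion of _Foo.
SOURCE = '''\
// _Foo(a, b) is documented here
const char *s = "_Foo(x, y)";
int v = _Foo(1, 2);
'''

# Naive pattern: the macro name followed by a parenthesised argument list.
PATTERN = re.compile(r'\b_Foo\s*\([^)]*\)')

def regex_hits(text):
    """Return (line_number, matched_text) for every textual match."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for m in PATTERN.finditer(line):
            hits.append((lineno, m.group(0)))
    return hits

for lineno, match in regex_hits(SOURCE):
    print(lineno, match)
```

All three lines match, although only line 3 is something the preprocessor would expand. The same pattern would also cut a call such as `_Foo(f(1), 2)` short at the first closing parenthesis.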

Given the following Python code, I am able to find what appear to be predefined types that the Clang Python bindings know about, but not macros:

import sys
import clang.cindex

def find_typerefs(node):
    # Look up the entity this cursor refers to, if any
    # (older bindings exposed this as clang.cindex.Cursor_ref(node))
    ref_node = node.referenced
    if ref_node:
        print('Found %s Type %s DATA %s Extent %s [line=%s, col=%s]' % (
            ref_node.spelling, ref_node.kind, node.data, node.extent,
            node.location.line, node.location.column))

    # Recurse for children of this node
    for c in node.get_children():
        find_typerefs(c)

index = clang.cindex.Index.create()
tu = index.parse(sys.argv[1])
find_typerefs(tu.cursor)

What I think I am looking for is a way to search the raw AST for the name of my macro _Foo(), but I am not sure. Can someone provide some code that will allow me to pass in the name of a macro and get back the extent or data from Clang?

You need to pass the appropriate options flag to Index.parse:

tu = index.parse(sys.argv[1], options=clang.cindex.TranslationUnit.PARSE_DETAILED_PROCESSING_RECORD)

The rest of the cursor visitor could look like this:

def visit(node):
    # Report macro definitions and macro expansions
    if node.kind in (clang.cindex.CursorKind.MACRO_INSTANTIATION,
                     clang.cindex.CursorKind.MACRO_DEFINITION):
        print('Found %s Type %s DATA %s Extent %s [line=%s, col=%s]' % (
            node.displayname, node.kind, node.data, node.extent,
            node.location.line, node.location.column))
    # Recurse into the children of this node
    for c in node.get_children():
        visit(c)
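
Building on the visitor above, the question's actual request (pass in a macro name, get back the extents) could be sketched roughly as follows. `iter_cursors`, `find_macro_uses` and `parse_with_macros` are hypothetical helper names, not part of cindex.py. The cursor kind is compared by name so that the search works on any cursor-like object, and the sketch assumes that `displayname` holds the bare macro name for expansion cursors, which is worth checking against your libclang version.

```python
import sys

def iter_cursors(node):
    """Yield `node` and all of its descendants in pre-order."""
    yield node
    for child in node.get_children():
        yield from iter_cursors(child)

def find_macro_uses(root, name):
    """Return (line, column, extent) for each expansion of macro `name`.

    `root` is any cursor-like object with `kind`, `displayname`, `location`,
    `extent` and `get_children()`, for example `tu.cursor`.
    """
    return [
        (node.location.line, node.location.column, node.extent)
        for node in iter_cursors(root)
        if node.kind.name == 'MACRO_INSTANTIATION' and node.displayname == name
    ]

def parse_with_macros(path, args=None):
    """Parse `path` with the detailed preprocessing record enabled."""
    import clang.cindex  # imported lazily: needs the libclang bindings
    index = clang.cindex.Index.create()
    return index.parse(
        path, args=args or [],
        options=clang.cindex.TranslationUnit.PARSE_DETAILED_PROCESSING_RECORD)

# Assumed usage: python find_macro.py some_file.cpp _Foo
if __name__ == '__main__' and len(sys.argv) == 3:
    tu = parse_with_macros(sys.argv[1], args=['-x', 'c++'])
    for line, col, extent in find_macro_uses(tu.cursor, sys.argv[2]):
        print('%s:%s %s' % (line, col, extent))
```

As far as I know, the macro cursors also cover expansions that come from included headers, so you may want to filter on `node.location.file` if only the main file matters.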

I once wrote a script to pretty-print the whole AST you get from libclang, in order to see where to find which information.

Here it is: https://gist.github.com/2503232
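
For readers who only want the general idea, a minimal AST dump might look like the sketch below. This is not the contents of the gist, just an illustration; `dump_ast` is a hypothetical name, and it relies only on the `kind`, `spelling`, `location` and `get_children()` members of a cursor.

```python
def dump_ast(node, depth=0, lines=None):
    """Return a list of indented 'KIND spelling [line:col]' strings."""
    if lines is None:
        lines = []
    lines.append('%s%s %s [%s:%s]' % (
        '  ' * depth,                    # two spaces per nesting level
        node.kind.name,                  # e.g. FUNCTION_DECL
        node.spelling or '<unnamed>',    # some cursors have no spelling
        node.location.line, node.location.column))
    for child in node.get_children():
        dump_ast(child, depth + 1, lines)
    return lines

# Assumed usage with a parsed translation unit `tu`:
#   print('\n'.join(dump_ast(tu.cursor)))
```

Reading such a dump for a small test file is a quick way to see which cursor kinds carry the information you are after.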
