
Resolving typedefs in C and C++

I'm trying to automatically resolve typedefs in arbitrary C or C++ projects.

Because some of the typedefs are defined in system header files (for example uint32), I'm currently trying to achieve this by running the gcc preprocessor on my code files and then scanning the preprocessed files for typedefs. I should then be able to replace the typedefs in the project's code files.
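As a rough illustration of the scanning step, the following C++ sketch collects only the simplest one-line declarations of the form typedef <type> <alias>; from preprocessed text. The function name collectSimpleTypedefs is hypothetical, and function-pointer, array, struct-body and multi-declarator typedefs are deliberately skipped, so a real project would likely need a proper parser instead.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper: maps alias -> aliased type for simple one-line typedefs.
std::map<std::string, std::string> collectSimpleTypedefs(const std::string& preprocessed) {
    std::map<std::string, std::string> table;
    std::istringstream in(preprocessed);
    std::string line;
    while (std::getline(in, line)) {
        const std::size_t start = line.find_first_not_of(" \t");
        if (start == std::string::npos) continue;                 // blank line
        if (line.compare(start, 8, "typedef ") != 0) continue;    // not a typedef
        const std::size_t semi = line.find(';', start);
        if (semi == std::string::npos) continue;                  // spans several lines: skip
        const std::string decl = line.substr(start + 8, semi - (start + 8));
        // Skip function pointers, arrays, struct bodies and multiple declarators.
        if (decl.find_first_of("()[]{},") != std::string::npos) continue;

        const std::size_t end = decl.find_last_not_of(" \t");
        if (end == std::string::npos) continue;
        // The alias is the trailing identifier of the declaration.
        std::size_t begin = end + 1;
        while (begin > 0 &&
               (std::isalnum(static_cast<unsigned char>(decl[begin - 1])) ||
                decl[begin - 1] == '_')) {
            --begin;
        }
        if (begin == end + 1 || begin == 0) continue;             // no alias or no type
        const std::size_t typeEnd = decl.find_last_not_of(" \t", begin - 1);
        if (typeEnd == std::string::npos) continue;
        const std::size_t typeBegin = decl.find_first_not_of(" \t");
        table[decl.substr(begin, end - begin + 1)] =
            decl.substr(typeBegin, typeEnd - typeBegin + 1);
    }
    return table;
}
```

Feeding it the output of the preprocessor (for example from gcc -E) would yield a table such as uint32 -> unsigned int, which can then be applied to the method names reported by the metric tools.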

I'm wondering whether there is another, perhaps simpler, way that I'm missing. Can you think of one?

The reason I want to do this: I'm extracting code metrics from C/C++ projects with different tools. The metrics are method-based. After extracting the metrics, I have to merge the data produced by the different tools. The problem is that one of the tools resolves typedefs and the others don't. If typedefs are used for the parameter types of methods, I end up with metrics mapped to different method names that actually refer to the same method in the source code.

Think of this method in the source code: int test(uint32 par1, int par2)
After running my tools, some of my metrics are mapped to a method named int test(uint32 par1, int par2) and others are mapped to int test(unsigned int par1, int par2).
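One way to make the two names comparable is to normalize every reported signature against a table of known typedefs before merging. The sketch below is only an illustration under that assumption: TypedefTable, substituteOnce and normalizeSignature are hypothetical names, the table has to come from somewhere else, and the textual replacement ignores scoping entirely.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <map>
#include <string>

using TypedefTable = std::map<std::string, std::string>;  // alias -> aliased type

// One pass: replace every whole identifier that appears in the table.
std::string substituteOnce(const std::string& sig, const TypedefTable& table) {
    std::string out;
    std::size_t i = 0;
    while (i < sig.size()) {
        const unsigned char c = static_cast<unsigned char>(sig[i]);
        if (std::isalnum(c) || c == '_') {
            std::size_t j = i;
            while (j < sig.size() &&
                   (std::isalnum(static_cast<unsigned char>(sig[j])) || sig[j] == '_')) {
                ++j;
            }
            const std::string word = sig.substr(i, j - i);
            const auto it = table.find(word);
            // Runs starting with a digit are numbers, not identifiers.
            if (!std::isdigit(c) && it != table.end()) {
                out += it->second;
            } else {
                out += word;
            }
            i = j;
        } else {
            out += sig[i];
            ++i;
        }
    }
    return out;
}

// Repeat until nothing changes so that typedefs of typedefs are resolved too.
// The pass limit guards against a table that accidentally contains a cycle.
std::string normalizeSignature(std::string sig, const TypedefTable& table) {
    for (int pass = 0; pass < 16; ++pass) {
        std::string next = substituteOnce(sig, table);
        if (next == sig) break;
        sig = next;
    }
    return sig;
}
```

With a table containing uint32 -> unsigned int, both method names from the example normalize to int test(unsigned int par1, int par2), so the metrics can be joined on the normalized name.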

If you do not care about figuring out where they are defined, you can use objdump to dump the C++ symbol table, in which typedefs are already resolved.

lorien$ objdump --demangle --syms foo

foo:     file format mach-o-i386

SYMBOL TABLE:
00001a24 g       1e SECT   01 0000 .text dyld_stub_binding_helper
00001a38 g       1e SECT   01 0000 .text _dyld_func_lookup
...
00001c7c g       0f SECT   01 0080 .text foo::foo(char const*)
...

This output comes from the following structure definition:

#include <string>

typedef char const* c_string;
struct foo {
    typedef c_string ntcstring;
    foo(ntcstring s): buf(s) {}  // appears in the symbol table as foo::foo(char const*)
    std::string buf;
};

This does require that you compile everything, and it will only show symbols that end up in the resulting executable, so there are a few limitations. The other option is to have the linker dump a symbol map. For GNU tools add -Wl,-map and -Wl,name where name is the name of the file to generate (see note). This approach does not demangle the names, but with a little work you can reverse engineer the compiler's mangling conventions. The output for the previous snippet will include something like:

0x00001CBE  0x0000005E  [  2] __ZN3fooC2EPKc
0x00001D1C  0x0000001A  [  2] __ZN3fooC1EPKc

You can decode these using the C++ ABI specification. Once you get comfortable with how this works, the mangling table included with the ABI becomes priceless. (The extra leading underscore in the map output is added by the Mach-O object format and is not part of the mangled name.) The derivation in this case is:

<mangled-name>           ::= '_Z' <encoding>
<encoding>               ::= <name> <bare-function-type>
  <name>                 ::= <nested-name>
    <nested-name>        ::= 'N' <source-name> <ctor-dtor-name> 'E'
      <source-name>      ::= <number> <identifier>
      <ctor-dtor-name>   ::= 'C2' # base object constructor
    <bare-function-type> ::= <type>+
      <type>             ::= 'P' <type> # pointer to
        <type>           ::= <cv-qualifier> <type>
          <cv-qualifier> ::= 'K' # constant
            <type>       ::= 'c' # character
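If decoding by hand is not required, GCC and Clang runtimes expose a demangler through abi::__cxa_demangle in <cxxabi.h>. The sketch below wraps it in a hypothetical helper named demangle; stripping one leading underscore for names starting with __Z is an assumption made here to cope with Mach-O style symbols like the ones in the map output.

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>
#include <string>

// Hypothetical helper: returns the demangled name, or the input unchanged
// if the runtime does not recognize it as an Itanium ABI mangled name.
std::string demangle(const std::string& symbol) {
    const char* name = symbol.c_str();
    // Assumption: a "__Z" prefix means Mach-O added one extra underscore.
    if (symbol.compare(0, 3, "__Z") == 0) {
        ++name;
    }
    int status = 0;
    char* demangled = abi::__cxa_demangle(name, nullptr, nullptr, &status);
    if (status != 0 || demangled == nullptr) {
        return symbol;
    }
    std::string result(demangled);
    std::free(demangled);  // the buffer is allocated with malloc by the runtime
    return result;
}
```

Calling demangle("__ZN3fooC2EPKc") gives foo::foo(char const*), the same text objdump --demangle prints. Both the C1 and C2 constructor variants demangle to the same name.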

Note: it looks like GNU changes the arguments to ld, so you may want to check your local manual (man ld) to make sure that the map file generation option is -map filename in your version. In recent versions, use -Wl,-M and redirect stdout to a file.

You can use Clang (the LLVM C/C++ compiler front-end) to parse code in a way that preserves information on typedefs and even macros. It has a very nice C++ API for reading the data after the source code is read into the AST (abstract syntax tree). http://clang.llvm.org/

If you are instead looking for a simple program that already does the resolving for you (rather than the Clang programming API), I think you are out of luck, as I have never seen such a thing.

GCC-XML can help with resolving the typedefs: you'd have to follow the type ids of <Typedef> elements until you resolve them to a <FundamentalType>, <Struct> or <Class> element.
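The id-following step could look roughly like the C++ sketch below. It assumes the XML has already been loaded into an in-memory table keyed by element id; XmlNode, NodeTable and resolveTypeName are hypothetical names, and the XML parsing itself is left out because it depends on whichever XML library is available.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical in-memory form of one GCC-XML element.
struct XmlNode {
    std::string kind;  // element name, e.g. "Typedef" or "FundamentalType"
    std::string name;  // value of the name attribute
    std::string type;  // id of the referenced element (used for Typedef only)
};

using NodeTable = std::map<std::string, XmlNode>;  // element id -> element

// Follow Typedef elements until a non-Typedef element is reached.
// Returns an empty string for a dangling id or a cyclic chain.
std::string resolveTypeName(const NodeTable& nodes, const std::string& id) {
    std::set<std::string> seen;
    std::string current = id;
    while (seen.insert(current).second) {
        const auto it = nodes.find(current);
        if (it == nodes.end()) {
            return "";
        }
        if (it->second.kind != "Typedef") {
            return it->second.name;
        }
        current = it->second.type;
    }
    return "";
}
```

For a table in which a Typedef named uint32 points at a FundamentalType named unsigned int, resolving the id of the typedef returns unsigned int.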

For replacing the typedefs in your project you have a more fundamental problem though: you can't simply search and replace, as you'd have to respect the scope of names. Think of, for example, function-local typedefs, namespace aliases or using directives.
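A small C++ illustration of the scoping issue, using made-up names (counter_t and the two functions are hypothetical): the same alias name refers to different types depending on where it is used, so a global textual replacement would be wrong inside the first function.

```cpp
#include <cassert>
#include <type_traits>

typedef unsigned int counter_t;  // file-scope alias

bool localAliasMatchesGlobal() {
    typedef long counter_t;      // function-local alias hides the file-scope one
    // Inside this function, counter_t means long; ::counter_t is still unsigned int.
    return std::is_same<counter_t, ::counter_t>::value;
}

bool globalAliasIsUnsignedInt() {
    // Outside the function above, counter_t refers to the file-scope alias again.
    return std::is_same<counter_t, unsigned int>::value;
}
```

Here localAliasMatchesGlobal() returns false while globalAliasIsUnsignedInt() returns true, which is why a replacement tool needs scope information rather than plain text matching.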

Depending on what you're actually trying to achieve, there has to be a better way.

Update: Actually, in the given context of fixing metrics data, replacing the type names using GCC-XML should work fine if it supports your code base.
