
GCC preprocessor PLUS static analysis?

I am aware of how to run the gcc preprocessor. GCC evidently performs static analysis/optimization of the code, because if you, e.g., add two constants or shift a constant, the resulting assembly code is the same whether you keep "a = constant << 3" or perform that operation manually and write "a = shifted_constant" in the code.
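As a small illustration of that observation, here is a minimal sketch with two hypothetical functions (the names and the value 0x10 are made up for the example). With optimization enabled, GCC would be expected to emit the same constant load for both, which can be checked by comparing the output of gcc -O2 -S.

```c
#include <stdint.h>

/* Hypothetical constant, chosen only for illustration. */
#define CONSTANT 0x10u

/* The shift is written out; the compiler can fold it at compile time. */
uint32_t shifted_inline(void)
{
    uint32_t a = CONSTANT << 3;
    return a;
}

/* The same value, shifted by hand: 0x10 << 3 == 0x80. */
uint32_t shifted_manual(void)
{
    uint32_t a = 0x80u;
    return a;
}
```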

Given this slice from a SHA256 implementation,

  A=0x6a09e667;
  B=0xbb67ae85;
  C=0x3c6ef372;
  D=0xa54ff53a;
  E=0x510e527f;
  F=0x9b05688c;
  G=0x1f83d9ab;
  H=0x5be0cd19;

  P(A, B, C, D, E, F, G, H, W[ 0], 0x428a2f98);

after running gcc -E I get

 { tmp = ((((E) >> (6)) | ((E) << (32-(6)))) ^ (((E) >> (11))
       | ((E) << (32-(11)))) ^ (((E) >> (25)) | ((E) << (32-(25)))))
       + (G ^ (E & (F ^ G))) + 0x428a2f98 + W[ 0]; D += (tmp + H);
   H  += tmp + ((((A) >> (2)) | ((A) << (32-(2)))) ^ (((A) >> (13))
       | ((A) << (32-(13)))) ^ (((A) >> (22)) | ((A) << (32-(22)))))
       + ((A & B) | (C & (A | B))); };

for the P line, which is a macro built from nested macros.

That's nice, but I would like a more digested result, in which constants have been substituted, constant arithmetic has been performed, etc. So if, instead of the above, I got something like this:

{ tmp = hex_constant1 + W[0]; D += (tmp + 0x5be0cd19); H += tmp + hex_constant2; };

it would be much more to my liking. I didn't have the patience to compute the expressions manually, but basically they should fold to two hex constants.

Is there any tool or command-line option to perform this?

GCC does not optimize the preprocessed C text itself. After preprocessing, the code is translated to a high-level IR (GIMPLE), where most optimizations are applied, before being lowered to a low-level IR such as RTL.

What you want is an optimized GIMPLE dump in which you can look up these two constants. Request the optimized tree dump like this:

gcc -O2 -fdump-tree-optimized test.c -S

Then look for a file like test.c.211t.optimized (the number 211 may differ, depending on the gcc version). GIMPLE code is very similar to primitive C, so you should have no trouble reading your constants from it.
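For reference, a minimal test.c to feed to that command might look like the sketch below. The macro definitions are an assumption: the question does not show them, so they are reconstructed here from the gcc -E output, and the wrapper function round0 and its parameters are hypothetical names added only so the file compiles on its own.

```c
/* test.c: hypothetical reconstruction; macros inferred from the gcc -E output. */
#include <stdint.h>

#define ROTR(x, n) (((x) >> (n)) | ((x) << (32 - (n))))
#define S2(x) (ROTR(x, 2) ^ ROTR(x, 13) ^ ROTR(x, 22))
#define S3(x) (ROTR(x, 6) ^ ROTR(x, 11) ^ ROTR(x, 25))
#define F0(x, y, z) (((x) & (y)) | ((z) & ((x) | (y))))
#define F1(x, y, z) ((z) ^ ((x) & ((y) ^ (z))))
#define P(a, b, c, d, e, f, g, h, x, K)            \
    {                                              \
        tmp = S3(e) + F1(e, f, g) + (K) + (x);     \
        (d) += (tmp + (h));                        \
        (h) += tmp + S2(a) + F0(a, b, c);          \
    }

/* Runs the first round with the standard initial values; only w0 is unknown. */
void round0(uint32_t w0, uint32_t *d_out, uint32_t *h_out)
{
    uint32_t A = 0x6a09e667, B = 0xbb67ae85, C = 0x3c6ef372, D = 0xa54ff53a;
    uint32_t E = 0x510e527f, F = 0x9b05688c, G = 0x1f83d9ab, H = 0x5be0cd19;
    uint32_t tmp;
    uint32_t W[1] = { w0 };

    P(A, B, C, D, E, F, G, H, W[ 0], 0x428a2f98);

    *d_out = D;
    *h_out = H;
}
```

Since everything except w0 is a compile-time constant, the optimized dump for round0 should reduce to w0 plus one folded constant for each output, which are the values the question asks for (possibly already combined with the initial D and H).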

Most compilers run the preprocessor to get a text string for the program source, and then parse that into compiler-internal data structures. Once the program is parsed, you cannot see what the compiler does to the code (well, maybe the compiler has some kind of debugging dump, but you cannot count on that).

To do what you want, you need to do partial evaluation of the code at the location of the expanded macro. No standard compiler will help you do this.

You need a program transformation system (PTS), which is a tool that parses source code, applies code transformations, and then regenerates the modified source text. A good PTS will let you write sets of source-to-source transformation rules to be applied, each rule of the form:

 when you see *this*, then replace it by *that* if *condition* is true

To do simple constant folding, you need rules like:

  rule  simplify_times_zero(e: power): product -> product
       " \e * 0 " ->  " 0 ";

which handles the special (algebraic) case of multiplying something by zero. You obviously need a bunch of these rules; more below.

Your PTS must also be able to read the specific dialect of C or C++ you are using, and be able to resolve any identifier to its definition (otherwise it cannot know that E is constant where it is used).

The only PTS I know of that can parse many C or C++ dialects, and carry out macro expansions, is our DMS Software Reengineering Toolkit. It accepts rules using the syntax of the above example.

The additional rules you need are rules that actually do arithmetic:

rule fold_addition(c1: INT32CONSTANT, c2: INT32CONSTANT): product -> product =
     " \c1 + \c2 " ->  int32_add(c1,c2);

where int32_add is a function that does the math. You need one of these rules for each operator and operand type, each honoring the rules of the C and C++ languages.

You also need rules that substitute known values:

rule substitute_defined_constant(i: IDENTIFIER): primitive -> primitive
  " \i " ->  macro_definition_value(i) if  is_defined_constant(i);

where is_defined_constant looks up an identifier in the symbol table built by DMS's front end for macro definitions, and checks that i is a macro of the form "#define i" followed by a constant value at the point where i is referenced. Because this is written as a conditional, the macro_definition_value function isn't called unless the macro definition value actually exists. The raw symbol table support is provided by DMS's C and C++ front ends.

With this set of rules, and a rule strategy that applies them at the macro expansion point of interest, DMS should be able to "fold" the expression into the form you described. All the transforms are done on DMS's internal representation of the program, but DMS can prettyprint the result so you can actually see it.
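To make the idea concrete, here is a toy sketch in C of what those three kinds of rules do when applied bottom-up to an expression tree. This is not DMS and not its rule language; the Expr type, the K_* tags and the fold function are all hypothetical names invented for this illustration, and only 32-bit unsigned addition and multiplication are handled.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical mini-AST for illustration; not DMS's internal representation. */
typedef enum { K_CONST, K_IDENT, K_ADD, K_MUL } Kind;

typedef struct Expr {
    Kind kind;
    uint32_t value;          /* K_CONST: the value; K_IDENT: bound value if is_known */
    int is_known;            /* K_IDENT only: nonzero if bound to a constant */
    struct Expr *lhs, *rhs;  /* operands of K_ADD and K_MUL */
} Expr;

/* Rewrites e in place, children first, and returns e. */
Expr *fold(Expr *e)
{
    switch (e->kind) {
    case K_CONST:
        break;
    case K_IDENT:
        /* Counterpart of substitute_defined_constant. */
        if (e->is_known)
            e->kind = K_CONST;
        break;
    case K_ADD:
    case K_MUL:
        fold(e->lhs);
        fold(e->rhs);
        if (e->lhs->kind == K_CONST && e->rhs->kind == K_CONST) {
            /* Counterpart of fold_addition and its multiply sibling;
               the result wraps modulo 2^32 when stored in value. */
            e->value = (e->kind == K_ADD) ? e->lhs->value + e->rhs->value
                                          : e->lhs->value * e->rhs->value;
            e->kind = K_CONST;
        } else if (e->kind == K_MUL &&
                   e->rhs->kind == K_CONST && e->rhs->value == 0) {
            /* Counterpart of simplify_times_zero; safe here because
               this toy AST has no side effects. */
            e->value = 0;
            e->kind = K_CONST;
        }
        break;
    }
    return e;
}
```

A real PTS additionally has to parse the actual C dialect, resolve names, and regenerate source text, which is where most of the effort lies.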

As a general capability in a development UI, this might be pretty useful. To actually set all this up and make it work in practice would probably take a few days; DMS is a complex system, and C and C++ don't help.

If you are only going to do this once or twice, you are likely better off just biting the bullet and doing it by hand (or, as another answer suggests, using a debugging dump to inspect the output of the compiler, if that works out). If it is a daily task, the PTS solution is likely a real time (and accuracy) saver.

