
Static and dynamic code analysis

Original post: 2012-10-14 17:21:03. Tags: code-analysis, static-code-analysis, dynamic-analysis

I found several questions on this topic, all with plenty of references, but I still don't have a clear picture, because most of the references discuss concrete tools rather than the general concepts behind the analysis. So I have some questions:

About static analysis:
1. I would like a reference, or a summary, of which techniques are successful and most relevant nowadays.
2. What can they really do to discover bugs? Can this be summarized, or does it depend on the tool?

About symbolic execution:
1. Where does symbolic execution fit? I guess it depends on the approach, but I would like to know whether it counts as dynamic analysis or as a mix of static and dynamic analysis, if that can be determined.

I have trouble telling the two techniques apart in actual tools, even though I think I understand the theoretical difference.

I'm actually working with C. Thanks in advance.

2 answers

I'll try to give a short answer:

Static analysis looks at the syntactic structure of the code and draws conclusions about the program's behavior. These conclusions are not always correct.

A typical example of static analysis is data flow analysis, where you compute sets such as used, read, and write for every statement. This helps to find, for example, uninitialized values.
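To make the data flow idea concrete, here is a minimal sketch in C. Everything in it (the `Node` struct, the bit-set encoding of variables, the function `find_uninit_uses`) is a hypothetical illustration and not how any particular tool works. It runs a forward "definitely assigned" analysis over a tiny control-flow graph and returns the variables that some statement may read before they have been written on every path.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PRED 2

/* One statement (node) of a control-flow graph. Each variable is one bit. */
typedef struct {
    unsigned def;         /* variables written by this statement */
    unsigned use;         /* variables read by this statement */
    int pred[MAX_PRED];   /* indices of predecessor nodes */
    int npred;            /* number of predecessors (0 means entry node) */
} Node;

/* Variables assigned on every path reaching node i, given current out[]. */
static unsigned in_set(const Node *cfg, const unsigned *out, size_t i)
{
    assert(cfg[i].npred >= 0 && cfg[i].npred <= MAX_PRED);
    unsigned in = cfg[i].npred ? ~0u : 0u;   /* nothing is assigned at entry */
    for (int p = 0; p < cfg[i].npred; p++)
        in &= out[cfg[i].pred[p]];           /* join = intersection */
    return in;
}

/*
 * Forward "definitely assigned" analysis:
 *   in[i]  = intersection of out[p] over all predecessors p
 *   out[i] = in[i] | def[i]
 * Returns the set of variables that are read somewhere without being
 * assigned on every path leading to that read. out[] is scratch space
 * with one entry per node.
 */
unsigned find_uninit_uses(const Node *cfg, size_t n, unsigned *out)
{
    unsigned warnings = 0;
    int changed = 1;

    for (size_t i = 0; i < n; i++)
        out[i] = ~0u;                        /* optimistic starting point */

    while (changed) {                        /* iterate to a fixed point */
        changed = 0;
        for (size_t i = 0; i < n; i++) {
            unsigned new_out = in_set(cfg, out, i) | cfg[i].def;
            if (new_out != out[i]) {
                out[i] = new_out;
                changed = 1;
            }
        }
    }

    for (size_t i = 0; i < n; i++)
        warnings |= cfg[i].use & ~in_set(cfg, out, i);
    return warnings;
}
```

For a graph that models `y = 0; if (c) x = 1; else y = 2; return x + y;`, with `x` as bit 0 and `y` as bit 1, the function returns the bit for `x`, because `x` is not assigned on the else path. Note that the analysis never runs the program; it only looks at which statements read and write which variables.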

You can also analyze the code for code patterns. This way, these tools can check whether you comply with a specific coding standard. A prominent example is MISRA, a coding standard used for safety-critical systems that avoids problematic constructs in C. Compliance already tells you a lot about the robustness of your application against memory leaks, dangling pointers, etc.

Dynamic analysis does not look at the syntax only; it also takes state information into account. In symbolic execution, you add assumptions about the possible values of all variables to the statements.
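As a rough illustration of those assumptions, the hypothetical C function below is annotated with the path condition (PC) that a symbolic executor would attach to each branch, treating the inputs as symbols X and Y instead of concrete numbers. The function name `classify` and its return codes are made up for this sketch; the concrete inputs mentioned afterwards were solved by hand, which is the job a constraint solver would do in a real tool.

```c
#include <assert.h>
#include <limits.h>

/* Returns a code that identifies which path through the function was taken. */
int classify(int x, int y)
{
    /* Precondition: keep 2 * y free of signed overflow. */
    assert(y >= INT_MIN / 2 && y <= INT_MAX / 2);

    int z = 2 * y;        /* symbolically: z = 2*Y */

    if (x > 10) {         /* PC: X > 10 */
        if (z == x) {     /* PC: X > 10 && 2*Y == X */
            return 2;     /* deepest path, reached e.g. by X = 12, Y = 6 */
        }
        return 1;         /* PC: X > 10 && 2*Y != X */
    }
    return 0;             /* PC: X <= 10 */
}
```

Each of the three path conditions describes a whole class of inputs. Random testing is unlikely to hit the `return 2` path, whereas solving `X > 10 && 2*Y == X` yields an input such as `classify(12, 6)` directly.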

The most expensive and most powerful method of dynamic analysis is model checking, where you really look at all possible execution states of the system. You can think of a model-checked system as a system that has been tested with 100% coverage, but of course many practical problems prevent real systems from being checked that way.
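The idea of looking at all possible execution states can be sketched on a toy model. The C code below is a hypothetical, hand-written explorer and not a real model checker: it models two threads that each increment a shared counter in two steps (load, then store), enumerates every interleaving, and checks the property "the counter is 2 when both threads are done". The names `State`, `step` and `check`, and the `atomic` switch, are assumptions made for this example.

```c
#include <assert.h>
#include <stdbool.h>

/* Model of two threads that each run:  tmp = counter; counter = tmp + 1; */
typedef struct {
    int pc[2];     /* per-thread program counter: 0 = load, 1 = store, 2 = done */
    int tmp[2];    /* per-thread local copy of the counter */
    int counter;   /* shared variable */
} State;

/* Execute the next step of thread t and return the successor state. */
static State step(State s, int t, bool atomic)
{
    assert(t == 0 || t == 1);
    if (atomic) {                      /* increment as one indivisible step */
        s.counter = s.counter + 1;
        s.pc[t] = 2;
    } else if (s.pc[t] == 0) {
        s.tmp[t] = s.counter;          /* load */
        s.pc[t] = 1;
    } else {
        s.counter = s.tmp[t] + 1;      /* store */
        s.pc[t] = 2;
    }
    return s;
}

/*
 * Explore every interleaving reachable from s.
 * Returns true only if every final state satisfies counter == 2.
 * *explored counts the visited states. The state graph of this tiny
 * model is acyclic, so no visited set is needed.
 */
bool check(State s, bool atomic, int *explored)
{
    bool ok = true;

    (*explored)++;
    if (s.pc[0] == 2 && s.pc[1] == 2)
        return s.counter == 2;

    for (int t = 0; t < 2; t++)
        if (s.pc[t] < 2 && !check(step(s, t, atomic), atomic, explored))
            ok = false;
    return ok;
}
```

Starting from the all-zero state, `check` returns false for the two-step increment, because the interleaving load, load, store, store ends with the counter at 1, and true when `atomic` is set. Even this toy shows the practical problem: the number of states grows very quickly with the number of threads and steps.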

These methods are very powerful, and you can gain a lot from static code analysis tools, especially when they are combined with a good coding standard.

One feature my software team found really impressive is that the tool tells you, in C++, when a class with virtual methods does not have a virtual destructor. Easy to check, in fact, but really helpful.

The commercial tools are very expensive, but worth the money once you have learned how to use them. A typical problem in the beginning is that you get a lot of false alarms and don't know where to look for the real problems.

Note that nowadays g++ has some of this built in, and that standalone lint tools such as PC-lint exist. PC-lint itself is a commercial product rather than a free one, but free lint-style checkers are available as well.

Sorry, this is already getting quite long. I hope it's interesting.

The term "static analysis" means that the analysis does not actually run the code. "Dynamic analysis", on the other hand, runs the code and also requires some kind of real test input. That is the definition, nothing more.

Static analysis employs various formal methods such as abstract interpretation, model checking, and symbolic execution. In general, abstract interpretation and model checking are suited to software verification, while symbolic execution is more appropriate for bug finding.

Symbolic execution is categorized as static analysis. However, there is a hybrid method called concolic execution, which uses both symbolic execution and dynamic testing.

Added in response to Zane's comment:

Maybe my explanation was a little confusing.

The difference between software verification and bug finding is whether the analysis is sound or not. For example, when we say a buffer overrun analyzer is sound, we mean that it must report all possible buffer overruns. If the analyzer reports nothing, that proves the absence of buffer overruns in the target program. Because model checking guarantees soundness, it is mostly used for software verification.

On the other hand, symbolic execution, which is actively used by most of today's commercial static analyzers, does not guarantee soundness, since sound analysis inherently produces a great many false positives. For bug finding, it is more important to reduce false positives, even if some true positives are lost as well.

In summary:

soundness: there are no false negatives
completeness: there are no false positives
software verification: soundness is more important than completeness
bug finding: completeness is more important than soundness
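The trade-off in this summary can be illustrated with a small sketch of a sound but incomplete buffer overrun check. The C code below is a hypothetical toy based on interval arithmetic, one simple form of abstract interpretation; the names `Interval`, `iv_add`, `iv_sub` and `may_overrun` are invented for this example, and overflow of the bounds is ignored. Each variable is tracked as a range of possible values, and an access is flagged whenever the index range is not fully inside the buffer.

```c
#include <assert.h>
#include <stdbool.h>

/* All values a variable may take, over-approximated as [lo, hi]. */
typedef struct {
    int lo;
    int hi;
} Interval;

/* Abstract addition: every possible sum lies in the result. */
Interval iv_add(Interval a, Interval b)
{
    assert(a.lo <= a.hi && b.lo <= b.hi);
    Interval r = { a.lo + b.lo, a.hi + b.hi };
    return r;
}

/* Abstract subtraction: every possible difference lies in the result.
 * The operands are treated as independent, even if they are in fact
 * the same variable. This is where precision is lost. */
Interval iv_sub(Interval a, Interval b)
{
    assert(a.lo <= a.hi && b.lo <= b.hi);
    Interval r = { a.lo - b.hi, a.hi - b.lo };
    return r;
}

/* Sound check: true if some index in idx could fall outside [0, len). */
bool may_overrun(Interval idx, int len)
{
    return idx.lo < 0 || idx.hi >= len;
}
```

With `x` in `[0, 9]` and a buffer of length 10, the index `x + 5` is abstracted to `[5, 14]` and is correctly flagged, which is a true positive. The index `x - x` is always 0 at run time, yet it is abstracted to `[-9, 9]` and flagged too, which is a false positive. The check never misses a real overrun (no false negatives), and it pays for that with false alarms. A bug-finding tool would typically choose to suppress such doubtful reports.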