
How to build a static code analysis tool?

I am in the process of understanding and building a static code analysis tool for a proprietary language from a big company. The reason: I have to review a rather large code base, and static code analysis would help a lot, but they do not have such a tool for the language so far.

I would like to know how one goes about building a static code analysis tool, for example Lint or Splint for C.

Any books, articles, blogs, sites, etc. would help.

Thanks.

I know this is an old post, but the answers don't really seem that satisfactory. This article is a pretty good introduction to the technology behind static analysis tools, and has several links to examples.

A good book is "Secure Programming with Static Analysis" by Brian Chess and Jacob West.

You need good infrastructure, such as a parser, a tree builder, tree analyzers, symbol table builders, and flow analyzers. Then, to get on with your specific task, you need to code specific checks for the specific problems of interest to you, using all that infrastructure machinery.
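
To give a feel for one of those pieces, here is a minimal sketch of a symbol table builder with one check layered on top. It uses Python's standard `ast` module as a stand-in front end, since the proprietary language in question is unknown; the names `build_symbol_table` and `unused_variables` are made up for illustration, and the sketch deliberately ignores scopes, which a real tool could not do.

```python
import ast


def build_symbol_table(source):
    """Map each name to the line numbers where it is written and read.

    Simplification: one flat table for the whole file, no scoping.
    """
    table = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            entry = table.setdefault(node.id, {"writes": [], "reads": []})
            if isinstance(node.ctx, ast.Store):
                entry["writes"].append(node.lineno)
            elif isinstance(node.ctx, ast.Load):
                entry["reads"].append(node.lineno)
    return table


def unused_variables(source):
    """A check built on the table: names that are assigned but never read."""
    table = build_symbol_table(source)
    return sorted(
        name for name, entry in table.items()
        if entry["writes"] and not entry["reads"]
    )


SAMPLE = "total = 0\ncount = 1\nprint(total)\n"

if __name__ == "__main__":
    print(unused_variables(SAMPLE))
```

Even this toy version shows the layering: the check itself is three lines, but it only works because a parser and a symbol table already exist underneath it.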

Building all that foundation machinery is actually pretty hard, and it doesn't help you do your specific task. People don't write an operating system for every application they code; why should you build all the infrastructure? As with an OS, it is better if you simply acquire good infrastructure.

People will tell you to use lex and yacc. That's kind of like suggesting you use the real-time kernel part of the OS: useful, but far from all the infrastructure you really need.
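
As a rough illustration of how little a lexer alone gives you, here is a lex-style tokenizer sketched with Python's `re` module. The token set is a hypothetical toy syntax, not the proprietary language from the question, and `tokenize` is an illustrative name.

```python
import re

# Hypothetical token classes for a toy language.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[=+\-*/();]"),
    ("SKIP", r"\s+"),
]
TOKEN_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)


def tokenize(text):
    """Split text into (kind, lexeme) pairs, dropping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


if __name__ == "__main__":
    print(tokenize("x = y + 42;"))
```

The output is a flat list of tokens. There is no tree, no symbol table, and no notion of control or data flow yet, which is the gap the paragraph above is pointing at.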

Our DMS Software Reengineering Toolkit provides all the necessary infrastructure. It has been used to define many language front ends as well as many tools for such languages.

Such infrastructure would allow you to define your specific nonstandard language relatively quickly, and then get on with your task of coding your special checks.

  1. Obviously you need a parser for the language. A good high-level AST is useful.
  2. You need to enumerate a set of "mistakes" in the language. Without knowing more about the language in question, we can't help here. Examples: unallocated pointers in C, etc.
  3. Combine the AST with the mistakes in #2: walk the tree and report every place where one of them occurs.
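
The three steps above can be sketched roughly as follows. Python's built-in `ast` module stands in for the parser, because the real target language is not known here, and the two "mistakes" (a bare `except:` and comparing to `None` with `==`) are just example rules. `Finding`, `MistakeChecker`, and `analyze` are illustrative names, not part of any existing tool.

```python
import ast
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    line: int
    rule: str
    message: str


class MistakeChecker(ast.NodeVisitor):
    """Step 2: each visit_* method encodes one enumerated mistake."""

    def __init__(self):
        self.findings = []

    def visit_ExceptHandler(self, node):
        # A bare `except:` catches every exception type.
        if node.type is None:
            self.findings.append(
                Finding(node.lineno, "bare-except", "bare except clause"))
        self.generic_visit(node)

    def visit_Compare(self, node):
        # `x == None` / `x != None` should normally use `is` / `is not`.
        for op, right in zip(node.ops, node.comparators):
            if (isinstance(op, (ast.Eq, ast.NotEq))
                    and isinstance(right, ast.Constant)
                    and right.value is None):
                self.findings.append(
                    Finding(node.lineno, "none-compare",
                            "comparison to None with == or !="))
        self.generic_visit(node)


def analyze(source):
    tree = ast.parse(source)     # step 1: parse into an AST
    checker = MistakeChecker()
    checker.visit(tree)          # step 3: match the mistakes against the tree
    return sorted(checker.findings, key=lambda f: f.line)


SAMPLE = """\
def load(path):
    try:
        data = open(path).read()
    except:
        data = None
    if data == None:
        return ""
    return data
"""

if __name__ == "__main__":
    for finding in analyze(SAMPLE):
        print(f"line {finding.line}: [{finding.rule}] {finding.message}")
```

Keeping each rule in its own visitor method means adding a new "mistake" does not touch the parsing or the tree walk, only the list of checks.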
