Why do we count a string as a single token in lexical analysis of compiler design?

I am learning about compiler design. The task of the lexical analyser in a compiler is to convert the source code into a stream of tokens. But I am confused about why we consider a string to be a single token. For example, take the statement printf("%d is integer", x); The tokens in this statement are printf ( "%d is integer" , x ) ; (seven in total). Why is the %d inside the string not considered a separate token?

Because format specifiers like %d (or any other string contents) are not syntactically meaningful: no element of the language grammar depends on them. String contents (including format specifiers like %d) are data, not code, and thus not meaningful to the compiler. The character sequence %d is only meaningful at runtime, and only to the *printf / *scanf families of functions, and only as part of a format string.

To recognize %d as a distinct token, you would have to tokenize the entire string, splitting it into the pieces " then %d then is then integer then ". That opens up a whole can of worms on its own and makes parsing strings more difficult.
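To make this concrete, here is a minimal sketch of a lexer that scans a string literal as one token. It is not taken from any real compiler: the names TokenKind, Token, set_text and tokenize are hypothetical, and it only knows identifiers, string literals and one-character punctuators (numbers, comments and multi-character operators are left out on purpose).

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token kinds for this sketch; a real C lexer has many more. */
typedef enum { TOK_IDENT, TOK_STRING, TOK_PUNCT } TokenKind;

typedef struct {
    TokenKind kind;
    char text[64]; /* copy of the lexeme, truncated if it is too long */
} Token;

/* Copy len bytes of the lexeme beginning at start into tok->text. */
static void set_text(Token *tok, const char *start, size_t len) {
    if (len >= sizeof tok->text) len = sizeof tok->text - 1;
    memcpy(tok->text, start, len);
    tok->text[len] = '\0';
}

/* Split src into tokens, writing at most max of them to out.
   Returns the number of tokens produced. */
size_t tokenize(const char *src, Token *out, size_t max) {
    assert(src != NULL && out != NULL);
    size_t n = 0;
    const char *p = src;
    while (*p != '\0' && n < max) {
        if (isspace((unsigned char)*p)) { p++; continue; }
        const char *start = p;
        if (*p == '"') {
            /* String literal: consume everything up to the closing quote
               as one lexeme. The contents are not interpreted; the only
               special case is skipping the character after a backslash
               so that \" does not end the literal. */
            p++;
            while (*p != '\0' && *p != '"') {
                if (*p == '\\' && p[1] != '\0') p++;
                p++;
            }
            if (*p == '"') p++;
            out[n].kind = TOK_STRING;
        } else if (isalpha((unsigned char)*p) || *p == '_') {
            while (isalnum((unsigned char)*p) || *p == '_') p++;
            out[n].kind = TOK_IDENT;
        } else {
            p++; /* any other character is a one-character punctuator */
            out[n].kind = TOK_PUNCT;
        }
        set_text(&out[n], start, (size_t)(p - start));
        n++;
    }
    return n;
}
```

Calling tokenize on the line printf("%d is integer", x); with a large enough output array yields seven tokens, and the third one is the whole literal "%d is integer" including its quotes. The %d never reaches the parser as anything other than two bytes inside that lexeme.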

Some compilers do examine the format string arguments to printf and scanf calls to do some basic sanity checking, but that happens well after tokenization has taken place. At the tokenization stage, you don't know that this is a call to the printf library function. It's not until after syntax analysis that the compiler knows that this is a specific library call and can perform that kind of check.
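As a rough illustration of such a later-phase check, the sketch below counts the conversions in a format string and compares the count with the number of arguments that follow it. This is an assumption-laden toy, not how any particular compiler implements its diagnostics: count_conversions and format_matches_args are hypothetical names, and flags, field widths, the * width (which consumes an extra argument) and type checking are not modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Count the conversion specifications in a printf-style format string.
   Simplified: "%%" is a literal percent sign and every other '%' that is
   followed by at least one character counts as one conversion. */
size_t count_conversions(const char *fmt) {
    assert(fmt != NULL);
    size_t count = 0;
    for (const char *p = fmt; *p != '\0'; p++) {
        if (*p != '%') continue;
        if (p[1] == '%') { p++; continue; } /* "%%" prints a literal '%' */
        if (p[1] == '\0') break;            /* stray '%' at the end: ignored */
        count++;
    }
    return count;
}

/* Returns 1 when the number of conversions equals nargs, the number of
   arguments passed after the format string, and 0 otherwise. */
int format_matches_args(const char *fmt, size_t nargs) {
    return count_conversions(fmt) == nargs;
}
```

A check like this takes the contents of the string token as its input, and it can only be applied once the compiler has established that the callee is printf and has the argument list in hand, which is information the lexer does not have.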
