
Why was mixing declarations and code forbidden up until C99?

I have recently become a teaching assistant for a university course which primarily teaches C. The course standardized on C90, mostly due to widespread compiler support. One of the most confusing concepts for C newbies with previous Java experience is the rule that variable declarations and code may not be intermingled within a block (compound statement).

This limitation was finally lifted with C99, but I wonder: does anybody know why it was there in the first place? Does it simplify variable scope analysis? Does it allow the programmer to specify at which points of program execution the stack should grow for new variables?

I assume the language designers wouldn't have added such a limitation if it had absolutely no purpose at all.

In the very beginning of C, the available memory and CPU resources were really scarce, so the compiler had to run fast with minimal memory requirements.

Therefore the C language was designed to require only a very simple compiler which compiles fast. This in turn led to the "single-pass compiler" concept: the compiler reads the source file and translates everything into assembler code as soon as possible, usually while still reading the source file. For example, when the compiler reads the definition of a global variable, the appropriate code is emitted immediately.

This trait is still visible in C today:

  • C requires "forward declarations" of everything. A multi-pass compiler could look ahead and deduce the declarations of variables or functions in the same file by itself.
  • This in turn makes the *.h files necessary.
  • When compiling a function, the layout of the stack frame must be computed as soon as possible; otherwise the compiler would have to make several passes over the function body.
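The forward-declaration point can be seen in a minimal sketch with two mutually recursive functions. The names is_even and is_odd are made up for this illustration. Because the compiler reads top to bottom, is_even can only call is_odd with a known type if a prototype comes first (C89 would otherwise assume an implicit int function, and C99 removed that rule).

```c
/* Forward declaration (prototype): tells the compiler the type of
   is_odd before its definition appears further down the file. */
int is_odd(unsigned n);

/* Defined first, but calls is_odd, which is only defined below. */
int is_even(unsigned n)
{
    return n == 0 ? 1 : is_odd(n - 1);
}

/* The definition matching the prototype above. */
int is_odd(unsigned n)
{
    return n == 0 ? 0 : is_even(n - 1);
}
```

Whichever of the two functions is defined first, the other one has to be declared ahead of it; a header file is essentially a collection of such declarations shared between source files.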

Nowadays no serious C compiler is still "single pass", because many important optimizations cannot be done within one pass. A little more can be found on Wikipedia.

The standards body took quite some time to relax that "single-pass" restriction with regard to the function body. I assume that other things were more important.

It was that way because it had always been done that way, it made writing compilers a little easier, and nobody had really thought of doing it any other way. In time people realised that it was more important to make life easier for language users than for compiler writers.

"I assume the language designers wouldn't have added such a limitation if it had absolutely no purpose at all."

Don't assume that the language designers set out to restrict the language. Often restrictions like this arise by chance and circumstance.

I guess it should be easier for a non-optimising compiler to produce efficient code this way:

int a;
int b;
int c;
...

Although 3 separate variables are declared, the stack pointer can be adjusted once for all of them, without optimising strategies such as reordering.

Compare this to:

int a;
foo();
int b;
bar();
int c;

Adjusting the stack pointer just once here requires a kind of optimisation, although not a very advanced one.
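To make the comparison concrete, here is a rough model of the bookkeeping involved. It is only a sketch: layout_frame and align_up are hypothetical names invented for this illustration, not taken from any real compiler, and it assumes every variable is aligned to its own size and that sizes are powers of two. When all declarations are seen before the first statement, the total frame size is known up front and a single stack adjustment can be emitted.

```c
#include <stddef.h>

/* Hypothetical helper: round offset up to a multiple of align
   (align is assumed to be a power of two). */
static size_t align_up(size_t offset, size_t align)
{
    return (offset + align - 1) & ~(align - 1);
}

/* Hypothetical model of frame layout: given the sizes of the
   variables declared at the top of a block, store each variable's
   offset in offsets[] and return the total frame size. */
size_t layout_frame(const size_t *sizes, size_t count, size_t *offsets)
{
    size_t total = 0;
    size_t i;

    for (i = 0; i < count; i++) {
        total = align_up(total, sizes[i]); /* pad to natural alignment */
        offsets[i] = total;                /* this variable starts here */
        total += sizes[i];                 /* reserve its storage */
    }
    return total; /* one stack adjustment of this size covers them all */
}
```

With mid-block declarations, a compiler that emits code while reading would not know this total until it reached the end of the block, so it would either adjust the stack several times or go back and patch the first adjustment.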

Moreover, as a stylistic issue, the first approach encourages a more disciplined way of coding (no wonder Pascal enforces it too): all the local variables can be seen in one place and inspected together as a whole. This provides a clearer separation between code and data.

Requiring that variable declarations appear at the start of a compound statement did not impair the expressiveness of C89. Anything that one could legitimately do using a mid-block declaration could be done just as well by adding an open brace before the declaration and doubling up the closing brace of the enclosing block. While such a requirement may sometimes have cluttered source code with extra opening and closing braces, such braces would not have been just noise; they would have marked the beginning and end of variables' scopes.

Consider the following two code examples:

{
  do_something_1();
  {
    int foo;
    foo = something1();
    if (foo) do_something_1(foo);
  }
  {
    int bar;
    bar = something2();
    if (bar) do_something_2(bar);
  }
  {
    int boz;
    boz = something3();
    if (boz) do_something_3(boz);
  }
}

and

{
  do_something_1();

  int foo;
  foo = something1();
  if (foo) do_something_1(foo);

  int bar;
  bar = something2();
  if (bar) do_something_2(bar);

  int boz;
  boz = something3();
  if (boz) do_something_3(boz);
}

From a run-time perspective, most modern compilers probably wouldn't care whether foo is syntactically in scope during the execution of do_something_3(), since they could determine that any value it held before that statement would not be used afterwards. On the other hand, encouraging programmers to write declarations in a way which would generate sub-optimal code in the absence of an optimizing compiler is hardly an appealing concept.

Further, while handling the simpler cases involving intermixed variable declarations would not be difficult (even a 1970s compiler could have done it, if the authors had wanted to allow such constructs), things become more complicated if the block which contains intermixed declarations also contains any goto or case labels. The creators of C probably thought allowing intermixing of variable declarations and other statements would complicate the standard too much to be worth the benefit.
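A small sketch of the kind of complication meant here, written for C99 or later. The function name skip_initializer is invented for this illustration. A goto may jump forward past a mid-block declaration: the variable is then in scope and has storage, but its initializer never ran.

```c
/* Returns 10 on the normal path and 20 when the jump is taken. */
int skip_initializer(int skip)
{
    if (skip)
        goto after;   /* jumps forward, past the declaration below */

    int x = 10;       /* mid-block declaration (C99 and later) */
    return x;

after:
    /* x is in scope here, but "= 10" was never executed, so its
       value is indeterminate: assign before reading it. */
    x = 20;
    return x;
}
```

The language has to define what happens on such jumps, and a compiler has to track them. With all declarations at the top of the block, a forward goto within that block can never land between a declaration and its use in this way.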

Back in the days of C's youth, when Dennis Ritchie worked on it, computers (the PDP-11, for example) had very limited memory (e.g. 64K words), and the compiler had to be small, so it could optimize only a few things, and only in simple ways. And at that time (I coded in C on a Sun-4/110 in the 1986-89 era), declaring register variables was really useful for the compiler.

Today's compilers are much more complex. For example, a recent version of GCC (4.6) has more than 5 or 10 million lines of source code (depending upon how you measure it), and performs a great many optimizations which did not exist when the first C compilers appeared.

And today's processors are also very different (you cannot suppose that today's machines are just like machines from the 1980s, only thousands of times faster and with thousands of times more RAM and disk). Today, the memory hierarchy is very important: waiting on cache misses (for data from RAM) is what the processor spends most of its time doing. In the 1980s, access to memory was almost as fast (or as slow, by current standards) as execution of a single machine instruction. This is completely false today: to read from your RAM module, your processor may have to wait for several hundred nanoseconds, while for data in the L1 cache it can execute more than one instruction each nanosecond.

So don't think of C as a language very close to the hardware: this was true in the 1980s, but it is false today.

Oh, but you could (in a way) mix declarations and code; declaring new variables was just limited to the start of a block. For example, the following is valid C89 code:

void f()
{
  int a;
  do_something();
  {
    int b = do_something_else();
  }
}
