Expression evaluation in C vs Java

int y=3;
int z=(--y) + (y=10);

When executed in C, the value of z evaluates to 20, but when the same expression is executed in Java, z comes out as 12.

Can anyone explain why this is happening and what the difference is?

when executed in C language the value of z evaluates to 20

No, it does not. This is undefined behavior, so z could get any value, including 20. The program could also theoretically do anything, since the standard does not say what the program should do when encountering undefined behavior. Read more here: Undefined, unspecified and implementation-defined behavior

As a rule of thumb, never modify a variable twice in the same expression.

It's not a good duplicate, but this will explain things in more depth: Why are these constructs using pre and post-increment undefined behavior? The reason for the undefined behavior here is sequence points: y is modified twice with no sequence point in between.

In C, when it comes to arithmetic operators like + and /, the order of evaluation of the operands is not specified in the standard, so if evaluating them has side effects, your program becomes unpredictable. Here is an example:

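#include <stdio.h>

int foo(void)
{
    printf("foo()\n");
    return 0;
}

int bar(void)
{
    printf("bar()\n");
    return 0;
}

int main(void)
{
    /* The standard does not say which of the two calls runs first. */
    int x = foo() + bar();
    (void)x; /* silence the unused-variable warning */
}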

What will this program print? Well, we don't know: foo() may run before bar(), or the other way around. I'm not entirely sure whether this snippet invokes undefined behavior or not, but regardless, the output is not predictable. I asked a question about that, Is it undefined behavior to use functions with side effects in an unspecified order?, so I'll update this answer later.

Some other operators have a specified order of evaluation (left to right), like || and &&, and this feature is used for short circuiting. For instance, if we take the example functions above and write foo() && bar(), only the foo() function will be executed, because foo() returns 0 and the result is already known.
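Java's && and || short-circuit the same way, so the rule can be sketched in Java. This is only an illustration: the class name ShortCircuitDemo and the calls list are made up here, and foo() and bar() return boolean false in place of the C functions' int 0.

```java
import java.util.ArrayList;
import java.util.List;

class ShortCircuitDemo {
    // Records which operands were actually evaluated, in order.
    static final List<String> calls = new ArrayList<>();

    static boolean foo() {
        calls.add("foo");
        return false; // stands in for "return 0" in the C example
    }

    static boolean bar() {
        calls.add("bar");
        return false;
    }

    public static void main(String[] args) {
        // foo() is false, so && already knows the result and skips bar().
        boolean both = foo() && bar();
        System.out.println(both + " " + calls);

        calls.clear();
        // foo() is false, so || still has to evaluate bar().
        boolean either = foo() || bar();
        System.out.println(either + " " + calls);
    }
}
```

With &&, only foo is recorded; with ||, foo is recorded first and bar second.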

I'm not very proficient in Java, but for completeness, I want to mention that Java basically does not have undefined or unspecified behavior except in very special situations. Almost everything in Java is well defined. For more details, read rzwitserloot's answer.

There are 3 parts to this answer:

  1. How this works in C (undefined behaviour)
  2. How this works in Java (the spec is clear on how this should be evaluated)
  3. Why there is a difference.

For #1, you should read @klutt's fantastic answer.

For #2 and #3, you should read this answer.

How does it work in Java?

Unlike C's, Java's language specification pins down far more. For example, C only guarantees that the data type int has at least 16 bits and leaves the exact width to the implementation, whereas the Java language spec fixes it at 32 bits, even on 64-bit processors and a 64-bit Java implementation.
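As a small sketch of what that guarantee means in practice, the following checks the width of int and shows that int overflow is also fully defined in Java (it wraps around), whereas signed overflow in C is undefined behaviour. The class and method names are made up for illustration.

```java
class IntWidthDemo {
    // Integer.SIZE is the number of bits in an int, fixed by the language spec.
    static int bitsInInt() {
        return Integer.SIZE;
    }

    // Overflow wraps around in two's complement instead of being undefined.
    static int maxPlusOne() {
        return Integer.MAX_VALUE + 1;
    }

    public static void main(String[] args) {
        System.out.println(bitsInInt());                        // 32
        System.out.println(maxPlusOne() == Integer.MIN_VALUE);  // true
    }
}
```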

The Java spec clearly says that x+y is to be evaluated left to right (vs. C's 'in any order you please, compiler'). Thus, first --y is evaluated, which is clearly 2 (with the side effect of making y 2), then y=10 is evaluated, which is clearly 10 (with the side effect of making y 10), and then 2+10 is evaluated, which is clearly 12.
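The walk-through above can be checked with a short sketch. The class and method names are made up for illustration; singleExpression() is the expression from the question, and stepByStep() spells out the same left-to-right order as separate statements, which is also the form that avoids the problem entirely in C.

```java
class EvalOrderDemo {
    // The expression from the question, evaluated as one expression.
    static int singleExpression() {
        int y = 3;
        int z = (--y) + (y = 10);
        return z;
    }

    // The same computation, one side effect per statement.
    static int stepByStep() {
        int y = 3;
        int left = --y;       // y becomes 2, left is 2
        int right = (y = 10); // y becomes 10, right is 10
        return left + right;  // 2 + 10
    }

    public static void main(String[] args) {
        System.out.println(singleExpression()); // 12
        System.out.println(stepByStep());       // 12
    }
}
```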

Obviously, a language like Java is just better; after all, undefined behaviour is pretty much a bug by definition. What was wrong with the C language spec writers that they introduced this crazy stuff?

The answer is: performance.

In C, your source code is turned into machine code by the compiler, and the machine code is then executed by the CPU. A 2-step model.

In Java, your source code is turned into bytecode by the compiler, the bytecode is then turned into machine code by the runtime, and the machine code is then executed by the CPU. A 3-step model.

If you want to introduce optimizations, you don't control what the CPU does, so for C there is only 1 step where it can be done: compilation.

So C (the language) is designed to give C compilers lots of freedom to attempt to produce optimized machine code. This is a cost/benefit scenario: at the cost of having a ton of 'undefined behaviour' in the language spec, you get the benefit of better optimizing compilers.

In Java, you get a second step, and that's where Java does its optimizations: at runtime. java.exe does it to class files; javac.exe is quite 'stupid' and optimizes almost nothing. This is on purpose; at runtime you can do a better job (for example, you can use some bookkeeping to track which of two branches is more commonly taken and thus branch predict better than a C app ever could). It also means that the cost/benefit analysis now results in: the language spec should be clear as day.

So Java code is never undefined behaviour?

Not so. Java has a memory model which includes a ton of undefined behaviour:

class X { int a, b; }
X instance = new X();

new Thread() { public void run() {
    int a = instance.a;
    int b = instance.b;
    instance.a = 5;
    instance.b = 6;
    System.out.print(a);
    System.out.print(b);
}}.start();

new Thread() { public void run() {
    int a = instance.a;
    int b = instance.b;
    instance.a = 1;
    instance.b = 2;
    System.out.print(a);
    System.out.print(b);
}}.start();

is undefined in Java. It may print 0056, 0012, 0010, 0002, 5600, 0600, and many, many more possibilities. Something like 5000 (which it could legally print) is hard to imagine: how can the read of a 'work' but the read of b then fail?

For the exact same reason your C code produces arbitrary answers:

Optimization.

Hardcoding in the spec exactly how this code must behave would carry a large cost: you'd take away most of the room for optimization. So Java made the same trade-off, and now has a language spec that is ambiguous whenever you modify and read the same fields from different threads without establishing so-called 'happens-before' relationships, using e.g. synchronized.
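As a rough sketch of what establishing that ordering can look like, the version below wraps each thread's reads and writes in a synchronized block on the shared object. The names SyncDemo, runOnce and seen are made up for illustration, the results are collected in an array instead of printed, and locking on instance is just one possible choice of lock.

```java
class SyncDemo {
    static class X { int a, b; }

    // Runs both threads once and returns what each one read, e.g. {"00", "56"}.
    static String[] runOnce() {
        final X instance = new X();
        final String[] seen = new String[2];

        Thread t1 = new Thread(() -> {
            synchronized (instance) { // reads and writes happen as one unit
                seen[0] = "" + instance.a + instance.b;
                instance.a = 5;
                instance.b = 6;
            }
        });
        Thread t2 = new Thread(() -> {
            synchronized (instance) {
                seen[1] = "" + instance.a + instance.b;
                instance.a = 1;
                instance.b = 2;
            }
        });

        t1.start();
        t2.start();
        try {
            t1.join(); // join() makes the threads' writes to seen visible here
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen;
    }

    public static void main(String[] args) {
        String[] seen = runOnce();
        System.out.println(seen[0] + " " + seen[1]);
    }
}
```

Which thread gets the lock first is still up to the scheduler, but only two outcomes remain: the first thread reads 00 and the second reads 12, or the first reads 56's predecessor state 00 in the second slot, that is, either "00 56" or "12 00". A torn read such as 50 is no longer possible.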

When executed in C language the value of z evaluates to 20

That is not true. The compiler you use evaluates it to 20. Another one can evaluate it in a completely different way: https://godbolt.org/z/GcPsKh

This kind of behaviour is called Undefined Behaviour.

In your expression you have two problems.

  1. The order of evaluation of operands (except for the logical operators) is not specified in C (this is Unspecified Behaviour).
  2. The expression also has a sequence point problem, because y is modified twice with no sequence point in between (this is Undefined Behaviour).
