
C++ and PHP vs C# and Java - unequal results

I found something a little strange in C# and Java. Let's look at this C++ code:

#include <cstdio>
using namespace std;

class Simple
{
public:
    static int f()
    {
        X = X + 10;
        return 1;
    }

    static int X;
};
int Simple::X = 0;

int main() {
    Simple::X += Simple::f();
    printf("X = %d", Simple::X);
    return 0;
}

In a console you will see X = 11 (look at the result here - IdeOne C++).

Now let's look at the same code in C#:

class Program
{
    static int x = 0;

    static int f()
    {
        x = x + 10;
        return 1;
    }

    public static void Main()
    {
        x += f();
        System.Console.WriteLine(x);
    }
}

In a console you will see 1 (not 11!) (look at the result here - IdeOne C#). I know what you're thinking now - "How is that possible?" - but let's go on to the following code.

Java code:

import java.util.*;
import java.lang.*;
import java.io.*;

/* Name of the class has to be "Main" only if the class is public. */
class Ideone
{
    static int X = 0;
    static int f()
    {
        X = X + 10;
        return 1;
    }
    public static void main (String[] args) throws java.lang.Exception
    {
        Formatter f = new Formatter();
        f.format("X = %d", X += f());
        System.out.println(f.toString());
    }
}

The result is the same as in C# (X = 1, look at the result here).

And finally, let's look at the PHP code:

<?php
class Simple
{
    public static $X = 0;

    public static function f()
    {
        self::$X = self::$X + 10;
        return 1;
    }
}

$simple = new Simple();
echo "X = " . $simple::$X += $simple::f();
?>

The result is 11 (look at the result here).

I have a little theory - these languages (C# and Java) are making a local copy of the static variable X on the stack (are they ignoring the static keyword?). And that is the reason why the result in those languages is 1.

Does anybody here have another explanation?

The C++ standard states:

With respect to an indeterminately-sequenced function call, the operation of a compound assignment is a single evaluation. [ Note: Therefore, a function call shall not intervene between the lvalue-to-rvalue conversion and the side effect associated with any single compound assignment operator. —end note ]

§5.17 [expr.ass]

Hence, since in the same evaluation you use X and a function with a side effect on X, the result is undefined, because:

If a side effect on a scalar object is unsequenced relative to either another side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined.

§1.9 [intro.execution]

It happens to be 11 on many compilers, but there is no guarantee that a C++ compiler won't give you 1, as the other languages do.

If you're still skeptical, another analysis of the standard leads to the same conclusion. The standard also says, in the same section as above:

The behavior of an expression of the form E1 op= E2 is equivalent to E1 = E1 op E2 except that E1 is evaluated only once.

In your case, that is X = X + f(), except that X is evaluated only once.
As there is no guarantee on the order of evaluation in X + f(), you cannot take for granted that f is evaluated first and then X.

Addendum

I'm not a Java expert, but the Java rules clearly specify the order of evaluation in an expression, which is guaranteed to be from left to right in section 15.7 of the Java Language Specification. In section 15.26.2, Compound Assignment Operators, the Java specification also says that E1 op= E2 is equivalent to E1 = (T) ((E1) op (E2)).

In your Java program this again means that your expression is equivalent to X = X + f(), where X is evaluated first and then f(). So the side effect of f() is not taken into account in the result.

So your Java compiler doesn't have a bug. It just complies with the specification.
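
To make the expansion concrete, here is a minimal sketch that writes out by hand what the two JLS sections describe for X += f(). The class and method names (CompoundAssignDemo, viaCompound, viaExpansion) are made up for illustration and do not come from the specification; the temporaries only mimic the order the JLS prescribes.

```java
// Hypothetical demo class; names are illustrative, not taken from the JLS.
class CompoundAssignDemo {
    static int x = 0;

    // Same shape as f() in the question: adds 10 to x, returns 1.
    static int f() {
        x = x + 10;
        return 1;
    }

    // The compound assignment exactly as written in the question.
    static int viaCompound() {
        x = 0;
        x += f();
        return x;
    }

    // Hand-written expansion: the left operand's value is saved
    // before the right operand is evaluated (JLS 15.7 and 15.26.2).
    static int viaExpansion() {
        x = 0;
        int saved = x;    // left-hand operand is read first: 0
        int rhs = f();    // x becomes 10 here, f() returns 1
        x = saved + rhs;  // 0 + 1 overwrites the 10
        return x;
    }

    public static void main(String[] args) {
        System.out.println(viaCompound());  // prints 1
        System.out.println(viaExpansion()); // prints 1
    }
}
```

Both methods end with x equal to 1: the 10 written by f() is overwritten by the sum computed from the saved value.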

Thanks to comments by Deduplicator and user694733, here is a modified version of my original answer.


The C++ version has unspecified (not undefined) behaviour.

There is a subtle difference between "undefined" and "unspecified", in that the former allows a program to do anything (including crashing) whereas the latter allows it to choose from a set of particular allowed behaviours without dictating which choice is correct.

Except in very rare cases, you will always want to avoid both.


A good starting point for understanding the whole issue is the C++ FAQ entries "Why do some people think x = ++y + y++ is bad?", "What's the value of i++ + i++?" and "What's the deal with “sequence points”?":

Between the previous and next sequence point a scalar object shall have its stored value modified at most once by the evaluation of an expression.

(...)

Basically, in C and C++, if you read a variable twice in an expression where you also write it, the result is undefined.

(...)

At certain specified points in the execution sequence called sequence points, all side effects of previous evaluations shall be complete and no side effects of subsequent evaluations shall have taken place. (...) The “certain specified points” that are called sequence points are (...) after evaluation of all a function's parameters but before the first expression within the function is executed.

In short, modifying a variable twice between two consecutive sequence points yields undefined behaviour, but a function call introduces an intermediate sequence point (actually, two intermediate sequence points, because the return statement creates another one).

This means that having a function call in your expression "saves" your Simple::X += Simple::f(); line from being undefined and turns it into "only" unspecified.

Both 1 and 11 are possible and correct outcomes, whereas printing 123, crashing or sending an insulting e-mail to your boss are not allowed behaviours; you'll just never get a guarantee whether 1 or 11 will be printed.


The following example is slightly different. It's seemingly a simplification of the original code but really serves to highlight the difference between undefined and unspecified behaviour:

#include <iostream>

int main() {
    int x = 0;
    x += (x += 10, 1);
    std::cout << x << "\n";
}

Here the behaviour is indeed undefined, because the function call has gone away, so both modifications of x occur between two consecutive sequence points. The compiler is allowed by the C++ language specification to create a program which prints 123, crashes or sends an insulting e-mail to your boss.

(The e-mail thing is of course just a very common humorous attempt at explaining how undefined really means anything goes. Crashes are often a more realistic result of undefined behaviour.)

In fact, the , 1 (just like the return statement in your original code) is a red herring. The following yields undefined behaviour, too:

#include <iostream>

int main() {
    int x = 0;
    x += (x += 10);
    std::cout << x << "\n";
}

This may print 20 (it does so on my machine with VC++ 2013), but the behaviour is still undefined.

(Note: this applies to built-in operators. Operator overloading changes the behaviour back to specified, because overloaded operators copy the syntax from the built-in ones but have the semantics of functions, which means that an overloaded += operator of a custom type that appears in an expression is actually a function call. Therefore, not only are sequence points introduced but the entire ambiguity goes away, the expression becoming equivalent to x.operator+=(x.operator+=(10));, which has a guaranteed order of argument evaluation. This is probably irrelevant to your question but should be mentioned anyway.)

In contrast, the Java version

import java.io.*;

class Ideone
{
    public static void main(String[] args)
    {
        int x = 0;
        x += (x += 10);
        System.out.println(x);
    }
}

must print 10. This is because Java has neither undefined nor unspecified behaviour with regard to evaluation order. There are no sequence points to be concerned about. See Java Language Specification 15.7, Evaluation Order:

The Java programming language guarantees that the operands of operators appear to be evaluated in a specific evaluation order, namely, from left to right.

So in the Java case, x += (x += 10), interpreted from left to right, means that first something is added to 0, and that something is 0 + 10. Hence 0 + (0 + 10) = 10.

See also example 15.7.1-2 in the Java specification.
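
The left-to-right reading can be spelled out step by step. The sketch below is an illustration only: NestedCompoundDemo and its method names are hypothetical, and the last method is written in the spirit of the specification's example rather than copied from it.

```java
// Hypothetical demo class; the temporaries only mimic the left-to-right rule.
class NestedCompoundDemo {
    // The expression from the text.
    static int nested() {
        int x = 0;
        x += (x += 10);
        return x;
    }

    // The same computation with every read and write made explicit.
    static int nestedExpanded() {
        int x = 0;
        int outerLeft = x;           // outer += saves its left operand: 0
        int innerLeft = x;           // inner += saves its left operand: 0
        x = innerLeft + 10;          // inner assignment stores 10
        int innerValue = x;          // value of the parenthesised expression: 10
        x = outerLeft + innerValue;  // 0 + 10
        return x;
    }

    // A plain assignment inside a compound assignment follows the same rule.
    static int assignInside() {
        int a = 9;
        a += (a = 3);  // saved 9, then 3 is assigned and returned: 9 + 3
        return a;
    }

    public static void main(String[] args) {
        System.out.println(nested());         // prints 10
        System.out.println(nestedExpanded()); // prints 10
        System.out.println(assignInside());   // prints 12
    }
}
```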

Going back to your original example, this also means that the more complex example with the static variable has defined and specified behaviour in Java.


Honestly, I don't know about C# and PHP, but I would guess that both of them have some guaranteed evaluation order as well. C++, unlike most other programming languages (but like C), tends to allow much more undefined and unspecified behaviour. That's neither good nor bad. It's a tradeoff between robustness and efficiency. Choosing the right programming language for a particular task or project is always a matter of analysing tradeoffs.

In any case, expressions with such side effects are bad programming style in all four languages.
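
One way to sidestep the problem, sketched here in Java under the assumption that the intended result is 11: keep the update of the variable in its own statement and make the helper side-effect free. The names NoHiddenSideEffectDemo, addTen and compute are hypothetical.

```java
// Hypothetical rewrite: no expression both reads x and triggers a hidden write to it.
class NoHiddenSideEffectDemo {
    // Pure helper: returns a new value instead of mutating shared state.
    static int addTen(int value) {
        return value + 10;
    }

    static int compute() {
        int x = 0;
        x = addTen(x);  // the "+ 10" step is its own statement
        x += 1;         // the "+ 1" step no longer depends on operand order
        return x;
    }

    public static void main(String[] args) {
        System.out.println(compute()); // prints 11
    }
}
```

Written this way, the result does not depend on any language's evaluation-order rules.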

One final word:

I found a little bug in C# and Java.

You should not assume you have found bugs in language specifications or compilers if you don't have many years of professional experience as a software engineer.

As Christophe has already written, this is basically an undefined operation.

So why do C++ and PHP do it one way, and C# and Java the other way?

In this case (which may be different for different compilers and platforms), the order of evaluation of arguments in C++ is inverted compared to C# - C# evaluates arguments in the order they are written, while the C++ sample does it the other way around. This boils down to the default calling conventions each uses, but again - for C++, this is an undefined operation, so it may differ based on other conditions.

To illustrate, this C# code:

class Program
{
    static int x = 0;

    static int f()
    {
        x = x + 10;
        return 1;
    }

    public static void Main()
    {
        x = f() + x;
        System.Console.WriteLine(x);
    }
}

will produce 11 on output, rather than 1.

That's simply because C# evaluates "in order", so in your example, it first reads x and then calls f(), while in mine, it first calls f() and then reads x.
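
The same pair of orderings can be put side by side. The sketch below uses Java as a stand-in, on the assumption (supported by JLS 15.7) that Java evaluates operands left to right just as described for C# here; OperandOrderDemo and its method names are hypothetical.

```java
// Hypothetical demo: only the position of f() relative to x changes.
class OperandOrderDemo {
    static int x = 0;

    // Same shape as f() in the question: adds 10 to x, returns 1.
    static int f() {
        x = x + 10;
        return 1;
    }

    // x is read before f() runs, so f()'s write to x is lost.
    static int readThenCall() {
        x = 0;
        x = x + f();  // 0 + 1
        return x;
    }

    // f() runs before x is read, so the read sees f()'s write.
    static int callThenRead() {
        x = 0;
        x = f() + x;  // 1 + 10
        return x;
    }

    public static void main(String[] args) {
        System.out.println(readThenCall()); // prints 1
        System.out.println(callThenRead()); // prints 11
    }
}
```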

Now, this still might be unreliable. IL (.NET's bytecode) treats + pretty much like any other method, but optimizations by the JIT compiler might result in a different order of evaluation. On the other hand, since C# (and .NET) does define the order of evaluation / execution, I guess a compliant compiler should always produce this result.

In any case, that's a lovely unexpected outcome you've found, and a cautionary tale - side effects in methods can be a problem even in imperative languages :)

Oh, and of course - static means something different in C# vs. C++. I've seen that mistake made by C++ers coming to C# before.

EDIT:

Let me just expand a bit on the "different languages" issue. You've automatically assumed that C++'s result is the correct one, because when you're doing the calculation manually, you're doing the evaluation in a certain order - and you've determined this order to comply with the results from C++. However, neither C++ nor C# does any analysis of the expression - it's simply a bunch of operations over some values.

C++ does store x in a register, just like C#. It's just that C# stores it before evaluating the method call, while C++ does it after. If you change the C++ code to do x = f() + x instead, just like I've done in C#, I expect you'll get 1 on output.

The most important part is that C++ (and C) simply didn't specify an explicit order of operations, probably because it wanted to exploit architectures and platforms that use either one of those orders. Since C# and Java were developed at a time when this no longer really mattered, and since they could learn from all those failures of C/C++, they specified an explicit order of evaluation.

According to the Java language specification:

JLS 15.26.2, Compound Assignment Operators

A compound assignment expression of the form E1 op= E2 is equivalent to E1 = (T) ((E1) op (E2)), where T is the type of E1, except that E1 is evaluated only once.

This small program demonstrates the difference, and exhibits the expected behavior based on this standard.

public class Start
{
    int X = 0;
    int f()
    {
        X = X + 10;
        return 1;
    }
    public static void main (String[] args) throws java.lang.Exception
    {
        Start actualStart = new Start();
        Start expectedStart = new Start();
        Start diffStart = new Start();  // separate object, so diff starts from X == 0
        int actual = actualStart.X += actualStart.f();
        int expected = (int)(expectedStart.X + expectedStart.f());
        int diff = (int)(diffStart.f() + diffStart.X);
        System.out.println(actual == expected);
        System.out.println(actual == diff);
    }
}

In order,

  1. actual is assigned the value of actualStart.X += actualStart.f().
  2. expected is assigned the value of
  3. the result of retrieving expectedStart.X, which is 0, and
  4. applying the addition operator to expectedStart.X with
  5. the return value of invoking expectedStart.f(), which is 1,
  6. and assigning the result of 0 + 1 to expected.

I also declared diff to show how changing the order of invocation changes the result.

  1. diff is assigned the value of
  2. the return value of invoking diffStart.f(), which is 1, and
  3. applying the addition operator to that value with
  4. the value of diffStart.X (which is 10, a side effect of diffStart.f()),
  5. and assigning the result of 1 + 10 to diff.

In Java, this is not undefined behavior.

Edit:

To address your point regarding local copies of variables: that is correct, but it has nothing to do with static. Java saves the result of evaluating each side (left side first), then evaluates the result of applying the operator to the saved values.
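
As a rough check that static plays no role, the sketch below repeats the experiment with an instance field and with an array element. NotAboutStaticDemo and its members are hypothetical names chosen for illustration.

```java
// Hypothetical demo: the saved-left-operand rule applies to any variable kind.
class NotAboutStaticDemo {
    int field = 0;  // instance field, not static

    // Adds 10 to the instance field, returns 1.
    int bumpField() {
        field = field + 10;
        return 1;
    }

    // Adds 10 to the first array element, returns 1.
    static int bumpElement(int[] cells) {
        cells[0] = cells[0] + 10;
        return 1;
    }

    static int withInstanceField() {
        NotAboutStaticDemo d = new NotAboutStaticDemo();
        d.field += d.bumpField();  // saved 0, then + 1
        return d.field;
    }

    static int withArrayElement() {
        int[] cells = {0};
        cells[0] += bumpElement(cells);  // saved 0, then + 1
        return cells[0];
    }

    public static void main(String[] args) {
        System.out.println(withInstanceField()); // prints 1
        System.out.println(withArrayElement());  // prints 1
    }
}
```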
