
valarray in-place operation gives a different result than assignment via a temporary

The following program:

#include<iostream>
#include<valarray>

using namespace std;

int main() {
  int init[] = {1, 1};

  // Example 1
  valarray<int> a(init, 2);
  // In-place assignment
  a[slice(0, 2, 1)] = a[slice(0, 2, 1)] + valarray<int>(a[slice(0, 2, 1)]) * a[0];

  for (int k = 0; k < 2; ++ k) {
    cout << a[k] << ' ';  // Outputs 2 3
  }
  cout << endl;

  // Example 2
  valarray<int> b(init, 2);
  // Temporary assignment
  valarray<int> r = b[slice(0, 2, 1)] + valarray<int>(b[slice(0, 2, 1)]) * b[0];
  b[slice(0, 2, 1)] = r;

  for (int k = 0; k < 2; ++ k) {
    cout << b[k] << ' '; // Outputs 2 2
  }
  cout << endl;
  return 0;
}

outputs:

2 3
2 2

The correct answer is 2 2 (<1 1> + <1 1> * 1 = <2 2>). Why does the in-place version output something different?

In case it matters, I'm compiling this way:

g++ myprogram.cpp -o myprogram

And the output of g++ -v is:

Using built-in specs.
COLLECT_GCC=g++
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/5/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 5.4.0-6ubuntu1~16.04.5' --with-bugurl=file:///usr/share/doc/gcc-5/README.Bugs --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-5 --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-libmpx --enable-plugin --with-system-zlib --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-5-amd64/jre --enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-5-amd64 --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-5-amd64 --with-arch-directory=amd64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --enable-objc-gc --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.5) 

This looks like a missing overload in the older compiler's standard library. The move assignment operator

valarray<T>& operator=( valarray<T>&& other ) noexcept;

exists in VS2017 and in gcc-7.1 but is absent in older versions. The right-hand expression seems to be re-evaluated and assigned on every iteration, element by element. The simplest example demonstrating this:

#include <iostream>
#include <valarray>

int main() 
{

    std::valarray<int> a{2, 4, 8};
    a = a + a[0];

    for (auto n : a) 
        std::cout << n << " ";

    std::cout << std::endl << std::endl;

    return 0;
}

The correct output is:

4 6 10

However, older compilers produce:

4 8 12

The solution is to use an updated compiler or to force a copy:

a = std::valarray<int>(a + a[0]);

Hope this helps

First, a[slice(0, 2, 1)] has type slice_array<T>, and there is no overload of operator+ taking a slice_array<T> object or reference as a parameter.

Note that the overload that might otherwise work, operator+(const valarray<T>&, const valarray<T>&), is a function template. Although slice_array<T> can be implicitly converted to valarray<T>, the template argument T cannot be deduced from a slice_array<T> argument, because implicit conversions are not considered during template argument deduction.

So strictly speaking, your code should cause a compile error. In fact, Clang does reject it.
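To make the deduction point concrete, here is a minimal sketch that uses made-up stand-in types rather than the real library ones. SliceRef, Vec, add, can_add and add_after_conversion are all hypothetical names invented for this example: SliceRef plays the role of slice_array<T>, Vec plays the role of valarray<T>, and add plays the role of the non-member operator+ template. It only illustrates the language rule, not how any particular standard library declares its operators.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical stand-in for slice_array<T>.
template <class T>
struct SliceRef {
    T x;
};

// Hypothetical stand-in for valarray<T>; implicitly constructible from SliceRef<T>.
template <class T>
struct Vec {
    T x;
    constexpr explicit Vec(T v) : x(v) {}
    constexpr Vec(const SliceRef<T>& s) : x(s.x) {}  // implicit conversion
};

// Stand-in for operator+(const valarray<T>&, const valarray<T>&).
template <class T>
constexpr Vec<T> add(const Vec<T>& a, const Vec<T>& b) {
    return Vec<T>(a.x + b.x);
}

// Detects at compile time whether the call add(L, R) is well-formed.
template <class L, class R, class = void>
struct can_add : std::false_type {};

template <class L, class R>
struct can_add<L, R,
               decltype(void(add(std::declval<L>(), std::declval<R>())))>
    : std::true_type {};

// T is deduced as int from both arguments.
static_assert(can_add<Vec<int>, Vec<int> >::value, "Vec + Vec works");
// Deduction fails: SliceRef<int> does not match Vec<T>, and the implicit
// conversion is never tried because T must be deduced first.
static_assert(!can_add<SliceRef<int>, Vec<int> >::value,
              "SliceRef + Vec is rejected");

// Converting explicitly first makes both arguments Vec<int>, so deduction succeeds.
constexpr int add_after_conversion(const SliceRef<int>& s, const Vec<int>& v) {
    return add(Vec<int>(s), v).x;
}
```

The same idea applies to the question's code: wrapping each slice in an explicit valarray<int>(...) before applying + avoids relying on whichever extra overloads a given library happens to provide.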


Second, you should know that there are some optimization techniques for valarray operations. One well-known technique is expression templates, which is what causes your unexpected results. To see how it works, let's consider a simpler example that reproduces the problem:

valarray<int> a{1, 1};
a = a + a[0];
// now a is {2, 3} while {2, 2} is expected

The key idea of expression templates is to postpone the evaluation of an expression until its value is really needed, so that extra temporaries are avoided.

In the example above, the library implementation may choose to make the result of a + a[0] a proxy object instead of a valarray<int> temporary. The proxy object just stores the action (not the resulting value) of "adding a[0] to a".

When the proxy object is then assigned to a, the actual evaluation occurs. From the stored action, the implementation assigns a[i] + a[0] to a[i] for each i. Different evaluation orders in this assignment now give different results. For example, if a[0] + a[0] is assigned to a[0] first, and a[1] + a[0] (where a[0] has already changed to 2) is then assigned to a[1], the unexpected result {2, 3} is produced.
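The mechanism described above can be reproduced with a hand-rolled proxy. This is only a sketch under simplifying assumptions: AddScalarExpr, assign_lazy, assign_eager, run_lazy and run_eager are hypothetical names invented here, std::vector stands in for valarray, and the code is not how libstdc++ actually implements its closures. It just shows why holding the scalar operand by reference and evaluating element by element gives {2, 3}.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical proxy for "v[i] + s": it stores references and computes nothing yet.
struct AddScalarExpr {
    const std::vector<int>& v;  // array operand, held by reference
    const int& s;               // scalar operand, held by reference
    int operator[](std::size_t i) const { return v[i] + s; }
    std::size_t size() const { return v.size(); }
};

// Lazy assignment: each element is computed and stored in index order,
// directly into the destination.
void assign_lazy(std::vector<int>& dst, const AddScalarExpr& e) {
    assert(dst.size() == e.size());
    for (std::size_t i = 0; i < e.size(); ++i) dst[i] = e[i];
}

// Eager assignment: the whole result is computed into a temporary first.
void assign_eager(std::vector<int>& dst, const AddScalarExpr& e) {
    assert(dst.size() == e.size());
    std::vector<int> tmp(e.size());
    for (std::size_t i = 0; i < e.size(); ++i) tmp[i] = e[i];
    dst = tmp;
}

std::vector<int> run_lazy() {
    std::vector<int> a{1, 1};
    // e.s aliases a[0]; after i == 0 it reads 2, so a[1] becomes 1 + 2.
    assign_lazy(a, AddScalarExpr{a, a[0]});
    return a;  // {2, 3}
}

std::vector<int> run_eager() {
    std::vector<int> a{1, 1};
    // Every element is computed while a[0] is still 1.
    assign_eager(a, AddScalarExpr{a, a[0]});
    return a;  // {2, 2}
}
```

Calling run_lazy() mirrors the in-place assignment from the question, while run_eager() mirrors the version that goes through the temporary r.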

The standard allows such proxy objects to exist, but it does not seem to specify clearly how they should behave. I personally think this is a compiler bug, because simply evaluating a[0] and storing its value before the assignment would solve the problem with little performance loss.
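On the user side, the same idea can be applied by hand. The sketch below shows two possible workarounds with the real std::valarray; the function names add_first_element and add_first_element_via_copy are made up for this example. Both avoid having the right-hand side read an element that the assignment is overwriting at the same time, so they should not depend on whether the library uses proxy objects.

```cpp
#include <cassert>
#include <valarray>

// Option 1: copy a[0] into a local before building the expression, so the
// scalar operand cannot change while the elements of a are being overwritten.
std::valarray<int> add_first_element(std::valarray<int> a) {
    assert(a.size() > 0);
    const int first = a[0];
    a = a + first;
    return a;
}

// Option 2: materialise the right-hand side into a separate valarray first,
// then assign that finished result back to a.
std::valarray<int> add_first_element_via_copy(std::valarray<int> a) {
    assert(a.size() > 0);
    const std::valarray<int> result(a + a[0]);  // evaluated before a is modified
    a = result;
    return a;
}
```

For the inputs used in this thread, add_first_element(std::valarray<int>{1, 1}) yields {2, 2} and add_first_element_via_copy(std::valarray<int>{2, 4, 8}) yields {4, 6, 10}. Option 1 avoids the extra array allocation, while option 2 also covers expressions where more than one element of a is read on the right-hand side.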


 