
Fastest way to copy one dimension of 2D vector in c++

I have a 2D vector which I'm using for complex numbers. For example:

vector<vector<double>> Complex;
vector<double> ComplexNumber;
ComplexNumber.push_back(5);  // real part
ComplexNumber.push_back(-4); // imag part
Complex.push_back(ComplexNumber); // Complex[i][0] - real part, [i][1] - imag

Deep in my code I need to pull part of my Complex vector out into other vectors: for example, copy the real parts from index 10 to 18 into one 1D vector and the imaginary parts from index 10 to 18 into another 1D vector. Currently I'm doing this with a for loop:

for (int j=0; j<=Samples; j++)
{
  refRealSignal[j] = ReferenseComplexSignalsSampled[(i*SignalSampleIndex)+j][0] ;
  refImagSignal[j] = ReferenseComplexSignalsSampled[(i*SignalSampleIndex)+j][1] ;
}

The profiler shows that this code is the bottleneck of the entire program. Is there any way to improve it?

Small update: the "Samples" variable is an int from 8 to 20, usually 8. The variable i comes from an outer for loop.

Big update: I dropped the 2D vector and rewrote everything with the complex class. I also rewrote my mul operation in the for loop. I do not know why, but copying from complex.imag takes about twice as long as copying from complex.real. After all of this, the performance of the code improved from ~5 ms per sample to ~1.8 ms per sample. (2.5 ms after I rewrote the mul operation and the entire loop; this was very helpful advice, thanks a lot.)

If Samples is big, you could save some multiplications involving i. So change this:

for (int j=0; j<=Samples; j++)
{
  refRealSignal[j] = ReferenseComplexSignalsSampled[(i*SignalSampleIndex)+j][0] ;
  refImagSignal[j] = ReferenseComplexSignalsSampled[(i*SignalSampleIndex)+j][1] ;
}

to this:

int index;
for(i = ..) {                    // assuming your code has a for loop for i
  index = i*SignalSampleIndex;
  for (int j=0; j<=Samples; ++j) // change the ++ as pre-fix
  {
    refRealSignal[j] = ReferenseComplexSignalsSampled[index+j][0] ;
    refImagSignal[j] = ReferenseComplexSignalsSampled[index+j][1] ;
  }
}

That way you do 1 multiplication per value of i, instead of 2 * Samples, as luk32 noticed.

Another approach, as discussed in the comments, is to use a class to represent your complex numbers. The standard library provides one: std::complex.

Then you would have a vector of std::complex<double>, which stores your data compactly in one contiguous block. That improves locality, which the cache can take advantage of.

You could do something like this:

#include <iostream>     // std::cout
#include <complex>      // std::complex, std::real
#include <vector>       // std::vector

int main ()
{
  std::vector<std::complex<double> > complex;

  // If you know how many numbers you will insert,
  // call reserve() first to avoid reallocations.
  // Assuming you will insert 100000 numbers:
  complex.reserve(100000);

  // reserve() only allocates capacity; it does not create elements,
  // so append with emplace_back() instead of writing to complex[i].
  for(int i = 0; i < 100000; ++i)
      complex.emplace_back(0.1, 0.2);

  std::cout << "Real part of 1st element: " << std::real(complex[0]) << '\n';

  return 0;
}
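Applied to the copy from the question, a minimal sketch could look like the following. The helper name splitSlice and its parameters are hypothetical and not part of the original code; it assumes the samples are stored in a std::vector<std::complex<double>> and that the caller wants `count` samples starting at `first`.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical helper: copies `count` samples starting at index `first`
// from `src` into separate real and imaginary vectors.
void splitSlice(const std::vector<std::complex<double>>& src,
                std::size_t first, std::size_t count,
                std::vector<double>& re, std::vector<double>& im)
{
    assert(first + count <= src.size());  // the slice must lie inside src
    re.resize(count);
    im.resize(count);

    // The elements are contiguous, so the start offset is computed once
    // and the loop walks linearly through memory.
    const std::complex<double>* p = src.data() + first;
    for (std::size_t j = 0; j < count; ++j) {
        re[j] = p[j].real();
        im[j] = p[j].imag();
    }
}
```

For the range in the question (index 10 to 18 inclusive), the call would be splitSlice(signal, 10, 9, re, im). If re and im are reused across calls, resize() does not reallocate once they have reached their final size.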

[EDIT]

The compiler may already hoist the multiplication out of the loop for you when optimizations are enabled. Make sure that you profile code that was compiled with an optimization flag.

Tip :

Usually if a section is slowing your program down, there are two approaches: (1) make that section faster, or (2) find a way to do that section less often.

(credits to Psyduck, aka Mooling duck)

In your case, you can try what I suggested above to make your code faster, but if you rethink your logic and avoid or reduce the number of copies, you will be rewarded with a boost in performance.
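As an illustration of option (2), here is a sketch that skips the copy entirely and reads the window straight from the source vector. What the original program does with the copied samples is not shown in the question, so the multiply-accumulate below is purely an assumption, and the name correlateWindow is hypothetical.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical example: multiply-accumulates a window of `reference`
// (starting at index `first`) against `input` without copying the window
// into temporary real/imag vectors first.
std::complex<double> correlateWindow(
    const std::vector<std::complex<double>>& reference,
    std::size_t first,
    const std::vector<std::complex<double>>& input)
{
    assert(first + input.size() <= reference.size());  // window must fit

    const std::complex<double>* ref = reference.data() + first;
    std::complex<double> acc(0.0, 0.0);
    for (std::size_t j = 0; j < input.size(); ++j) {
        acc += ref[j] * std::conj(input[j]);  // read in place, no copy
    }
    return acc;
}
```

Whether this is applicable depends on what the surrounding code needs; if a library call really requires separate real and imaginary arrays, the copy cannot be avoided this way.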

Using a std::vector<double> for complex numbers is a huge mistake with respect to performance. Why? For several reasons:

  • Allocation takes forever. Typical values are somewhere upwards from 200 ns.

  • Memory is allocated on the heap. The overhead in terms of space is huge.

    • Typical overhead within the memory allocator: two pointers, i.e. 8 or 16 bytes, depending on your architecture.

    • Overhead of the std::vector<> itself: typically three pointers (begin, end, and end of capacity), i.e. another 12 or 24 bytes.

    • Overallocation of the std::vector<>: implementations may allocate room for more than the two elements you store. I would estimate this overhead at up to six elements (an eight-element minimum allocation). That would cause an overhead of 48 bytes.

    So, on a 64-bit system you can end up using something like 80 to 90 bytes to store something that would fit into 16. The exact figures depend on the implementation.

    This matters, because it means your caches and memory bus have to do about five times the work!

  • Memory is allocated on the heap. That means your complex numbers are likely scattered. This is another blow to cache efficiency.
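The sizes discussed above can be checked on a given platform with a small sketch like this one. The constant names and the helper vectorPairFootprint are hypothetical; the helper only computes a lower bound, because the allocator's own bookkeeping is not visible from inside the program.

```cpp
#include <array>
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

// Size of each object itself; for the vector this excludes the heap block.
constexpr std::size_t kVectorObjectBytes = sizeof(std::vector<double>);
constexpr std::size_t kComplexBytes      = sizeof(std::complex<double>);
constexpr std::size_t kArrayBytes        = sizeof(std::array<double, 2>);

// Hypothetical helper: lower bound on the bytes used by one complex number
// stored as a two-element std::vector<double>. It counts the vector object
// plus its heap payload; allocator bookkeeping comes on top of this.
std::size_t vectorPairFootprint()
{
    std::vector<double> v{5.0, -4.0};
    assert(v.capacity() >= 2);  // capacity may exceed the two elements stored
    return sizeof(v) + v.capacity() * sizeof(double);
}
```

Printing these values shows how far the per-number footprint of the vector-of-vectors layout is from the 2 * sizeof(double) that std::complex<double> occupies.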

If you want to be fast, use either arrays with two elements (doesn't matter if you use C-style arrays or C++ std::array<> ) or define your complex type as a plain old data struct . All three options have the same memory layout, and thus should be equivalent in performance. But I would prefer the struct approach since it allows you to overload the operators which is nice for mathematical types like complex numbers, vectors, quaternions, and such.
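A minimal sketch of the struct approach, assuming double components. The type name Complex and the particular set of operators are illustrative choices, not something prescribed by the answer.

```cpp
#include <cassert>
#include <type_traits>

// Plain struct holding one complex number: two doubles, no heap allocation.
struct Complex {
    double re;
    double im;
};

// Compile-time checks that the layout is as compact as two doubles.
static_assert(std::is_trivially_copyable<Complex>::value, "must be trivially copyable");
static_assert(std::is_standard_layout<Complex>::value, "must be standard layout");
static_assert(sizeof(Complex) == 2 * sizeof(double), "no padding expected");

inline Complex operator+(Complex a, Complex b)
{
    return {a.re + b.re, a.im + b.im};
}

inline Complex operator*(Complex a, Complex b)
{
    // (a.re + i*a.im) * (b.re + i*b.im)
    return {a.re * b.re - a.im * b.im, a.re * b.im + a.im * b.re};
}

inline Complex operator/(Complex a, double s)
{
    assert(s != 0.0);  // dividing by a zero scalar is a caller error here
    return {a.re / s, a.im / s};
}
```

A std::vector<Complex> then stores all numbers back to back, exactly like an array of double pairs. In practice std::complex<double> already gives the same layout and operators, so a hand-written struct is mainly useful when you want full control over the interface.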

