Why is a for loop slower in C++ when the iterator is a class member

I asked this question before, though not very well. I have reworded and expanded it, and I hope I have clarified a couple of points. The question is very much from a beginner's perspective.

Below is a short code example that runs the same loop twice: once written inline in main and once as a member function of a class.

The code adds 1 to each element of an array and then divides each element by two. This is arbitrary for the moment and is just a placeholder for something more complex but broadly equivalent.

As background, this stems from audio processing for real-time applications. This is the reason for

int N = 44.1e3 * 60;

So essentially this will process 1 minute of audio sampled at 44.1 kHz.

The explicit code written inline runs far faster than the function code. A timer is included, but the difference is not subtle: one method runs within 1 second and the other takes around 8 seconds (ymmv).

Function calls incur some overhead. Speed can be improved by changing optimisation flags, but nowhere near the point where both methods run in a similar timeframe.

My questions are:

  1. Can the function call method be altered, while still keeping it as a function in a class, so that it runs at a similar, or the exact same, speed as the inline method?
  2. Are there any compiler flags that will result in both methods running in a similar timeframe? If so, how would they be used within Xcode and, preferably, is there a recommended resource detailing their usage?

The code below is simplistic and stylistically not very professional. This is to reduce clutter and hopefully focus on the main problem.

If there are any changes to the class definition that bear on the above questions, I am all ears. Otherwise, I understand your grievance, but it may not be relevant here.

The clock printout provided is not a thorough benchmarking tool. It is included as a basic illustration of the large time difference between the two methods.

#include <iostream>
#include <ctime>   // std::clock, std::clock_t, CLOCKS_PER_SEC

class MyClass
{
    static const int I = 1000;
    int n, i;
    double foo[I] = {0};

public:
    void myFunction(int N)
    {
        for(n = 0; n < N; n++ )
        {
            for(i = 0; i < I; i++ )
            {
                foo[i]+= 1;
                foo[i]*= .5;
            }
        }
    }
};

int main()
{
    
    int N = 44.1e3 * 60; // number of samples
    
    //~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    // The Explicit Approach
    //~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    
    const int I = 1000;
    double bar[I] = {0};
    
    std::clock_t startTime = std::clock();
    
    for(int n = 0; n < N; n++)
    {
        for(int i = 0; i < I; i++)
        {
            bar[i] += 1;
            bar[i] *=.5;
        }
    }
    
    double duration = ( std::clock() - startTime ) / (double) CLOCKS_PER_SEC;
    
    std::cout << "Explicit Approach Time (seconds): " << duration << '\n';
    
    //~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    //  The Class Approach
    //~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    MyClass testClass;
    
    startTime = std::clock();
    
    testClass.myFunction(N);
    
    duration = (std::clock() - startTime) / (double) CLOCKS_PER_SEC;
    
    std::cout << "Class Approach Time (seconds): " << duration << '\n';
    
    return 0;
}

I slightly rewrote your code like this:

#include <iostream>
#include <chrono>
#include <cmath>

class someClass {
   int n, i;           // internal class loop indices
   double foo[1000];
   int I = ::std::floor(sizeof(foo)/sizeof(foo[0])); // number of elements in foo
 public:
   void someFunction(int);
};

void someClass::someFunction(int N)
{
   for(n=N; n--; ) {
      for(i=I; i--; ) {
         foo[i] += 1;
         foo[i] *= .5;
      }
   }
}


int main()
{

    // this was initially an audio processing problem
    // Essentially, process a minute of audio

   int N = 44.1e3 * 60;

   {
      //~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      /// The Explicit Method
      //~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      int n, i;             // loop indices
      double bar[1000];
      int I = ::std::floor(sizeof(bar)/sizeof(bar[0]));

      // START CLOCK
      auto start = ::std::chrono::high_resolution_clock::now();


      // Exactly what is defined in the function 'someFunction' above
      for(n=N;n--; ){
         for(i=I;i--; ){
            bar[i]+= 1;
            bar[i]*=.5;
         }
      }

      // END CLOCK
      auto end = ::std::chrono::high_resolution_clock::now();

      std::chrono::duration<double> duration_secs = end - start;

      std::cout<<"Explicit Method Time (seconds): "<< duration_secs.count() <<'\n';
   }

   {
      //~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      //  The Function Method
      //~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      someClass testClass;

      // START CLOCK
      auto start = ::std::chrono::high_resolution_clock::now();

      testClass.someFunction(N);

      // END CLOCK
      auto end = ::std::chrono::high_resolution_clock::now();

      std::chrono::duration<double> duration_secs = end - start;
      std::cout<<"Member Function Time (seconds): "<< duration_secs.count() <<'\n';
   }

    return 0;
}

And I get this output now:

$ /tmp/a.out 
Explicit Method Time (seconds): 3.89e-07
Member Function Time (seconds): 1.46e-07

So, for me the explicit method is a lot slower, though both take so little time that it's barely worth measuring: both are on the order of hundreds of nanoseconds. It's impressive that the timer has high enough resolution to measure it.

So, the next question is, which compiler are you using? What platform are you compiling for?

There are still a lot of things about your code that are sub-optimal. I just cleaned up the includes, used ::std::chrono from C++11 for timing, and made explicit scopes for timing your hand-inlined version and the member function version.

I compiled with -O3 on gcc 7.2. It still contains undefined behavior since it uses an uninitialized array. Looking at the generated code, gcc realized the arrays were never even used and generated no code for the loops. So, basically, the program makes back-to-back calls to now. The time difference between the hand-inlined version and the member function can be attributed entirely to what is basically roundoff error.

So, the answer is, your code still doesn't show the problem you're talking about, and you still haven't given enough detail. :-) Either that, or the answer lies in the compiler you're using and the options you're giving it.

