How to optimize a simple numeric type wrapper class in C++?

I am trying to implement a fixed-point class in C++, but I am running into performance problems. I have reduced the problem to a simple wrapper around the float type, and it is still slow. My question is: why is the compiler unable to optimize it fully?

The 'float' version is 50% faster than the 'Float' version. Why?

(I am using Visual C++ 2008 in the Release configuration, and I have tried every compiler option I could find.)

See the code below:

#include <cstdio>
#include <cstdlib>
#include "Clock.h"      // just for measuring time

#define real Float      // Option 1
//#define real float        // Option 2

struct Float
{
private:
    float value;

public:
    Float(float value) : value(value) {}
    operator float() { return value; }

    Float& operator=(const Float& rhs)
    {
        value = rhs.value;
        return *this;
    }

    Float operator+ (const Float& rhs) const
    {
        return Float( value + rhs.value );
    }

    Float operator- (const Float& rhs) const
    {
        return Float( value - rhs.value );
    }

    Float operator* (const Float& rhs) const
    {
        return Float( value * rhs.value );
    }

    bool operator< (const Float& rhs) const
    {
        return value < rhs.value;
    }
};

struct Point
{
    Point() : x(0), y(0) {}
    Point(real x, real y) : x(x), y(y) {}

    real x;
    real y;
};

int main()
{
    // Generate data
    const int N = 30000;
    Point points[N];
    for (int i = 0; i < N; ++i)
    {
        points[i].x = (real)(640.0f * rand() / RAND_MAX);
        points[i].y = (real)(640.0f * rand() / RAND_MAX);
    }

    real limit( 20 * 20 );

    // Check how many pairs of points are closer than 20
    Clock clk;

    int count = 0;
    for (int i = 0; i < N; ++i)
    {
        for (int j = i + 1; j < N; ++j)
        {
            real dx = points[i].x - points[j].x;
            real dy = points[i].y - points[j].y;
            real d2 = dx * dx + dy * dy;
            if ( d2 < limit )
            {
                count++;
            }
        }
    }

    double time = clk.time();

    printf("%d\n", count);
    printf("TIME: %lf\n", time);

    return 0;
}

IMO, it has to do with optimization flags. I checked your program with g++ on a 64-bit Linux machine. Without any optimization, I see the same result you describe: the Float version is about 50% slower.

With maximum optimization turned on (i.e. -O4), both versions perform the same. Turn on optimization and check.
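To repeat this comparison with both versions in a single build, one option is to template the loop on the number type instead of switching a #define. The following is only a sketch under some assumptions: it needs a C++11 compiler (so not Visual C++ 2008), it uses std::chrono in place of the asker's Clock.h, and the names make_points, count_close_pairs and time_run are made up for illustration.

```cpp
#include <cassert>
#include <chrono>
#include <cstdlib>
#include <vector>

// Minimal wrapper mirroring the Float struct from the question.
struct Float
{
    float value;
    Float(float v = 0.0f) : value(v) {}
    Float operator+(const Float& rhs) const { return Float(value + rhs.value); }
    Float operator-(const Float& rhs) const { return Float(value - rhs.value); }
    Float operator*(const Float& rhs) const { return Float(value * rhs.value); }
    bool operator<(const Float& rhs) const { return value < rhs.value; }
};

template <typename Real>
struct Point
{
    Real x;
    Real y;
};

// Hypothetical helper: n pseudo-random points in [0, 640], reproducible via the seed.
template <typename Real>
std::vector<Point<Real>> make_points(int n, unsigned seed)
{
    assert(n >= 0);
    std::srand(seed);
    std::vector<Point<Real>> points(static_cast<std::size_t>(n));
    for (std::size_t i = 0; i < points.size(); ++i)
    {
        points[i].x = Real(640.0f * std::rand() / RAND_MAX);
        points[i].y = Real(640.0f * std::rand() / RAND_MAX);
    }
    return points;
}

// Same pair-counting loop as in the question, parameterized on the number type.
template <typename Real>
int count_close_pairs(const std::vector<Point<Real>>& points, float distance)
{
    const Real limit(distance * distance);
    int count = 0;
    for (std::size_t i = 0; i < points.size(); ++i)
    {
        for (std::size_t j = i + 1; j < points.size(); ++j)
        {
            Real dx = points[i].x - points[j].x;
            Real dy = points[i].y - points[j].y;
            Real d2 = dx * dx + dy * dy;
            if (d2 < limit)
            {
                count++;
            }
        }
    }
    return count;
}

// Hypothetical helper: runs the loop once and returns the elapsed seconds.
template <typename Real>
double time_run(const std::vector<Point<Real>>& points, float distance, int& out_count)
{
    auto start = std::chrono::steady_clock::now();
    out_count = count_close_pairs(points, distance);
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Calling time_run on make_points<float>(30000, 1) and on make_points<Float>(30000, 1), once in a build without optimization and once in an optimized build, should show whether the gap closes on a given compiler. The two counts should match, which is a quick check that both runs did the same work.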

Try not passing by reference. Your class is small enough that the overhead of passing it by reference (and there is overhead if the compiler doesn't optimize it away) might be higher than simply copying the class. So this...

Float operator+ (const Float& rhs) const
{
   return Float( value + rhs.value );
}

becomes something like this...

Float operator+ (Float rhs) const
{
   rhs.value+=value;
   return rhs;
}

which avoids a temporary object and may avoid the indirection of a pointer dereference.
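Applied to the whole class, the same idea might look like the sketch below. The name FloatByValue and the helpers squared_distance and sanity_check_by_value are made up for illustration. Note that subtraction is not commutative, so the "modify rhs in place" trick has to be written with the operands in the right order. Whether this is actually faster depends on the compiler, so it needs to be measured.

```cpp
#include <cassert>

// Variant of the question's wrapper with every operand passed by value.
struct FloatByValue
{
    float value;

    explicit FloatByValue(float v) : value(v) {}

    FloatByValue operator+(FloatByValue rhs) const
    {
        rhs.value += value;             // addition is commutative
        return rhs;
    }

    FloatByValue operator-(FloatByValue rhs) const
    {
        rhs.value = value - rhs.value;  // keep the order: this - rhs
        return rhs;
    }

    FloatByValue operator*(FloatByValue rhs) const
    {
        rhs.value *= value;             // multiplication is commutative
        return rhs;
    }

    bool operator<(FloatByValue rhs) const
    {
        return value < rhs.value;
    }
};

// Hypothetical helper: the inner-loop computation from the question.
inline FloatByValue squared_distance(FloatByValue ax, FloatByValue ay,
                                     FloatByValue bx, FloatByValue by)
{
    FloatByValue dx = ax - bx;
    FloatByValue dy = ay - by;
    return dx * dx + dy * dy;
}

// Small usage check with values that are exactly representable as float.
inline void sanity_check_by_value()
{
    assert((FloatByValue(3.0f) - FloatByValue(5.0f)).value == -2.0f);
    assert((FloatByValue(1.5f) + FloatByValue(2.25f)).value == 3.75f);
    assert(FloatByValue(1.0f) < FloatByValue(2.0f));
}
```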

After further investigation, I am thoroughly convinced that this is an issue with the compiler's optimization pipeline. The code generated in this instance is significantly worse than the code generated for a non-encapsulated float. My suggestion is to report this potential issue to Microsoft and see what they have to say about it. I also suggest that you move on to implementing your planned fixed-point version of this class, as the code generated for integers appears to be optimal.
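For reference, a fixed-point version along those lines might be sketched as follows. The question does not say which format the asker had in mind, so everything here is an assumption: the name Fixed, the 24.8 layout (8 fractional bits in a 32-bit integer), and truncation toward zero on multiplication. Overflow is not checked apart from an assert in from_int.

```cpp
#include <cassert>
#include <cstdint>

// Assumed format: 8 fractional bits, so raw == real_value * 256.
const int kFractionBits = 8;
const std::int32_t kOne = 1 << kFractionBits;

struct Fixed
{
    std::int32_t raw;

    Fixed() : raw(0) {}

    static Fixed from_raw(std::int32_t r)
    {
        Fixed f;
        f.raw = r;
        return f;
    }

    static Fixed from_int(int v)
    {
        // 23 integer bits plus sign are available in this layout.
        assert(v >= -8388608 && v <= 8388607);
        return from_raw(static_cast<std::int32_t>(v) * kOne);
    }

    double to_double() const { return static_cast<double>(raw) / kOne; }

    Fixed operator+(Fixed rhs) const { return from_raw(raw + rhs.raw); }
    Fixed operator-(Fixed rhs) const { return from_raw(raw - rhs.raw); }

    Fixed operator*(Fixed rhs) const
    {
        // Widen to 64 bits so the intermediate product cannot overflow,
        // then rescale; integer division truncates toward zero.
        std::int64_t wide = static_cast<std::int64_t>(raw) * rhs.raw;
        return from_raw(static_cast<std::int32_t>(wide / kOne));
    }

    bool operator<(Fixed rhs) const { return raw < rhs.raw; }
};
```

The 8 fractional bits are chosen with the question's data in mind: coordinates go up to 640, so a squared distance can reach 2 * 640 * 640 = 819200, which would not fit in the integer part of a 16.16 layout but does fit here.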
