Performance difference between float and double in x86 and x86_64

A while ago I heard that some compilers use SSE2 extensions for floating-point operations on the x86_64 architecture, so I used this simple code to measure the performance difference between float and double on x86 and x86_64.

I disabled Intel SpeedStep technology in the BIOS, and the system load was approximately equal across my tests. I am using GCC 4.8 on 64-bit OpenSuSE.

I am writing a program with a lot of FPU operations, and I would like to know whether this test is valid.

Any information about the performance difference between float and double under each architecture is also appreciated.

Code:

#include <iostream>
#include <sys/time.h>                
#include <vector>
#include <cstdlib>

using namespace std;

int main()
{
    timeval t1, t2;
    double elapsedTime;

    double TotalTime = 0;


    for(int j=0 ; j < 100 ; j++)
    {
        // start timer
        gettimeofday(&t1, NULL);

        vector<float> RealVec;
        float temp;

        for (int i = 0; i < 1000000; i++)
        {
            temp = static_cast <float> (rand()) / (static_cast <float> (RAND_MAX));
            RealVec.push_back(temp);
        }

        for (int i = 0; i < 1000000; i++)
        {
            RealVec[i] = (RealVec[i]*2-435.345345)/15.75;
        }

        // stop timer
        gettimeofday(&t2, NULL);
        elapsedTime = (t2.tv_sec - t1.tv_sec) * 1000.0;      // sec to ms
        elapsedTime += (t2.tv_usec - t1.tv_usec) / 1000.0;   // us to ms

        TotalTime = TotalTime + elapsedTime;
    }


    cout << TotalTime/100 << " ms.\n";

    return 0;
}

Results:

32-bit double
157.781 ms, 151.994 ms, 152.244 ms

32-bit float
149.896 ms, 148.489 ms, 161.086 ms

64-bit double
110.125 ms, 111.612 ms, 113.818 ms

64-bit float
110.393 ms, 106.778 ms, 107.833 ms

Not really valid. You're basically testing the performance of the random number generator.
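One way to check this claim is to time the two phases of the question's loop separately. The following is only a sketch: `PhaseTimes` and `time_phases` are names made up for illustration, `std::chrono::steady_clock` (C++11) replaces the question's `gettimeofday`, and the checksum is there so the compiler cannot discard the arithmetic.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical helper type: wall-clock time of each phase, in milliseconds.
struct PhaseTimes {
    double fill_ms;        // rand() + push_back
    double arithmetic_ms;  // the floating-point loop only
    double checksum;       // sum of the results, so the work stays observable
};

PhaseTimes time_phases(std::size_t n)
{
    assert(n > 0);
    typedef std::chrono::steady_clock Clock;
    typedef std::chrono::duration<double, std::milli> Millis;

    // Phase 1: fill the vector exactly as the question does.
    const Clock::time_point t0 = Clock::now();
    std::vector<float> v;
    for (std::size_t i = 0; i < n; ++i)
        v.push_back(static_cast<float>(std::rand()) / static_cast<float>(RAND_MAX));

    // Phase 2: the arithmetic that the test was meant to measure.
    const Clock::time_point t1 = Clock::now();
    for (std::size_t i = 0; i < n; ++i)
        v[i] = (v[i] * 2 - 435.345345) / 15.75;
    const Clock::time_point t2 = Clock::now();

    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i)
        sum += v[i];

    PhaseTimes out;
    out.fill_ms = Millis(t1 - t0).count();
    out.arithmetic_ms = Millis(t2 - t1).count();
    out.checksum = sum;
    return out;
}
```

Calling `time_phases(1000000)` and printing both times shows how much of the total is spent in `rand()` and `push_back` rather than in the floating-point loop.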

Also, you're not trying to enforce SSE2 SIMD operation, so you can't really claim this compares anything SSE-related.
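To see which floating-point code generation a given build actually uses, the program can report it. This sketch relies on the `__SSE2__` macro that GCC and Clang predefine when SSE2 is enabled, and on `FLT_EVAL_METHOD` from `<cfloat>`. The names `kSse2Enabled`, `kFltEvalMethod` and `fp_codegen_summary` are invented for illustration.

```cpp
#include <cfloat>
#include <string>

// True when the compiler was allowed to emit SSE2 instructions.
// GCC and Clang predefine __SSE2__ in that case.
#if defined(__SSE2__)
constexpr bool kSse2Enabled = true;
#else
constexpr bool kSse2Enabled = false;
#endif

// FLT_EVAL_METHOD: 0 means expressions are evaluated in their own type,
// 2 means everything is evaluated in long double (typical of x87 code),
// and -1 means indeterminable.
constexpr int kFltEvalMethod = FLT_EVAL_METHOD;

// Builds a one-line description suitable for printing next to the timings.
std::string fp_codegen_summary()
{
    return std::string(kSse2Enabled ? "SSE2 enabled" : "SSE2 not enabled")
         + ", FLT_EVAL_METHOD=" + std::to_string(kFltEvalMethod);
}
```

With GCC, a 32-bit build uses the x87 unit by default, and `-msse2 -mfpmath=sse` asks for scalar SSE2 math instead. Printing `fp_codegen_summary()` in both builds makes it clear which case was measured. Note that this only reports scalar code generation; whether the loop was vectorized still has to be checked in the generated assembly.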

Valid in what sense?

Measure actual usage, with your actual code.

Some artificial test suite probably won't help you assess the performance characteristics.

You can use a typedef, then change the actual underlying type with the flick of a switch.
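A minimal sketch of the typedef approach, assuming a hypothetical `USE_DOUBLE` macro as the switch. `Real`, `transform_one` and `transform_all` are illustrative names, not anything from the question.

```cpp
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical build switch: compile with -DUSE_DOUBLE to select double.
#ifdef USE_DOUBLE
typedef double Real;
#else
typedef float Real;
#endif

static_assert(std::is_floating_point<Real>::value,
              "Real must be a floating-point type");

// The arithmetic from the question, written against Real. The constants are
// converted explicitly: a bare literal such as 435.345345 has type double and
// would promote a float computation to double.
constexpr Real transform_one(Real x)
{
    return (x * Real(2) - Real(435.345345)) / Real(15.75);
}

// Applies the same arithmetic to every element in place.
void transform_all(std::vector<Real>& v)
{
    for (std::size_t i = 0; i < v.size(); ++i)
        v[i] = transform_one(v[i]);
}
```

The rest of the program then uses `Real` everywhere, so the same source can be built once per type and measured under real workloads.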

You're really not measuring much; perhaps just the degree of compiler optimization. For the measurements to be valid, you really have to do something with the results, or the compiler can optimize out all, or the major part, of your tests. What I would do is: 1) initialize the vector; 2) get the start time (probably using clock, since that only takes CPU time into account); 3) execute the second loop 100 or more times (enough to last at least a couple of seconds); 4) get the end time; and finally 5) output the sum of the elements in the vector.
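The five steps above could be sketched roughly as follows. `BenchResult` and `run_benchmark` are hypothetical names. The constants are converted to `Real` so that the float run stays in float, which is an assumption about what the test is meant to compare (the question's code uses double literals).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <ctime>
#include <vector>

// Hypothetical result type for one benchmark run.
struct BenchResult {
    double cpu_seconds;  // CPU time spent in the timed loop only
    double checksum;     // sum of the elements; printing it keeps the work observable
};

template <typename Real>
BenchResult run_benchmark(std::size_t n, int repetitions)
{
    assert(n > 0 && repetitions > 0);

    // 1) Initialize the vector outside the timed region.
    std::vector<Real> v(n);
    for (std::size_t i = 0; i < n; ++i)
        v[i] = static_cast<Real>(std::rand()) / static_cast<Real>(RAND_MAX);

    // 2) Get the start time.
    const std::clock_t start = std::clock();

    // 3) Execute the arithmetic loop many times.
    for (int r = 0; r < repetitions; ++r)
        for (std::size_t i = 0; i < n; ++i)
            v[i] = (v[i] * Real(2) - Real(435.345345)) / Real(15.75);

    // 4) Get the end time.
    const std::clock_t stop = std::clock();

    // 5) Sum the elements so the results are actually used.
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i)
        sum += v[i];

    BenchResult result;
    result.cpu_seconds = static_cast<double>(stop - start) / CLOCKS_PER_SEC;
    result.checksum = sum;
    return result;
}
```

A caller would run `run_benchmark<float>(1000000, 100)` and `run_benchmark<double>(1000000, 100)`, print both fields, and raise the repetition count until each run lasts a couple of seconds. On POSIX systems `std::clock` reports CPU time; on other platforms it may behave differently.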

With regard to the differences you may find: independently of the floating-point hardware, a 64-bit machine has more general-purpose registers for the compiler to play with. This could have an enormous impact. Unless you look at the generated assembler, you just can't know.
