Why is gcc's output much slower than Visual Studio's (for this code)?

When I compile the following code on a regularly updated 64-bit Ubuntu 16.04 using gcc with

gcc source.c -O3 --fast-math

the executable takes about 45 seconds of CPU time to run. On the same machine under 64-bit Windows 7, a release-mode build from Visual Studio 2012 takes less than 10 seconds of CPU time. What is the main cause of this difference? Am I missing some gcc optimization options? Is Visual Studio's compiler simply better? Or is it something else?

#include <stdio.h>
#include <math.h>
#include <time.h>

#define Nx 1000

int main()
{
    double d = 0.015e-2;        // meter
    double V0 = 400;            // volt
    double De = 1800e-4;        // m^2 per sec
    double mu_e = 2.9e1 / 760;  // m^2 per volt sec
    double n0 = 1e19;           // per m^3
    double e_eps = 1.602e-19 / 8.854e-12;
    double ne[Nx], je[Nx], E[Nx];
    double dx = d / (Nx - 1);
    double dt = 1e-14;          // s
    const int Nt = 500000;
    int i, k;
    double sum;
    FILE *fp_ne, *fp_E;
    double alpha, exp_alpha, R;
    int ESign = -1;
    clock_t start_t, end_t;

    start_t = clock();
    // initialization
    for (i = 1; i < Nx; i++)
        ne[i] = n0;
    ne[0] = 1e-4 * n0;

    for (i = 0; i < Nx; i++)
        E[i] = -V0 / d;

    // time loop
    for (k = 0; k < Nt; k++)
    {
        if (k%1000==0) printf("k = %d\n", k);
        for (i = 0; i < (Nx-1); i++)
        {
            alpha = mu_e*dx*E[i]/De;
            exp_alpha = exp(alpha);
            R = (exp_alpha-1)/alpha;
            je[i] = (De/(dx*R))*(ne[i]-exp_alpha*ne[i+1]);
        }

        for (i = 1; i < (Nx - 1); i++)
            ne[i] += -dt/dx*(je[i] - je[i-1]);
        ne[Nx - 1] = ne[Nx - 2];

        sum = 0;
        for (i = 0; i < (Nx - 1); i++)
            sum += dx*je[i];
        for (i = 0; i < (Nx - 1); i++)
        {
            E[i] += -dt*e_eps*(sum / d - je[i]);
            if (E[i]>=0) ESign=+1;
        }
        if (ESign==1) break;
    }

    // output
    printf("time=%e\n",k*dt);
    fp_ne = fopen("ne.txt", "w");
    fp_E = fopen("E.txt", "w");
    fprintf(fp_ne, "# x (cm)\tne(per cm^3)\n");
    fprintf(fp_E,  "# x (cm)\tE(V/cm)\n");
    for (i = 0; i < Nx; i++)
        fprintf(fp_ne, "%f\t%e\n", i*dx*100,ne[i]/1e6);
    for (i = 0; i < Nx-1; i++)
        fprintf(fp_E, "%f\t%e\n", i*dx*100, fabs(E[i])/1e2);
    fclose(fp_ne);
    fclose(fp_E);
    end_t = clock();
    printf("CPU time = %f\n", (double)(end_t - start_t) / CLOCKS_PER_SEC);
}

The first thing I did was comment out the in-loop I/O:

//if (k%1000==0) printf("k = %d\n", k);

I obtained the timings below with only that change. The fprintf calls at the end do affect the absolute timings significantly, but not the relative differences between compilers, so I did not re-measure everything without them.
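To keep the file output out of the measurement entirely, the clock() calls can be placed around the compute part only. The following is a minimal sketch of that idea, not the code used for the timings here; cpu_seconds and busy_work are hypothetical names, and busy_work merely stands in for the time loop over k.

```c
#include <time.h>

/* Stand-in workload (hypothetical); in the real program this would be
   the time loop over k, without any printf/fprintf calls. */
void busy_work(void)
{
    volatile double sink = 0.0;   /* volatile keeps the loop from being removed */
    for (long i = 0; i < 1000000; i++)
        sink += (double)i * 1e-6;
}

/* Returns the clock() time, in seconds, spent inside fn().
   Note: with Microsoft's C runtime, clock() reports elapsed wall-clock
   time rather than CPU time, so cross-platform numbers are not measuring
   exactly the same thing. */
double cpu_seconds(void (*fn)(void))
{
    clock_t start = clock();
    fn();
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

A call such as printf("compute time = %f\n", cpu_seconds(busy_work)); then reports only the compute part, and the output files can be written afterwards.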

I got these timings on Arch Linux on a first-generation Core i5 (all compiled with plain -O2):

  • GCC 7.1:

     CPU time = 23.459520 
  • Clang 4.0.1:

     CPU time = 22.936315 
  • Intel 17.0.4:

     CPU time = 7.830828 

In a QEMU/libvirt virtual machine running Windows 10 on that same machine, I get these timings:

  • MinGW-w64 GCC 6.3:

     CPU time = 76.122000 
  • VS 2015.3:

     CPU time = 13.497000 
  • VS 2017:

     CPU time = 49.306000 

Under Wine (the code runs natively on Linux with only the Win32 API emulated, so it should still be comparable to native Linux execution):

  • MinGW-w64 GCC 6.3:

     CPU time = 56.074000 
  • VS 2015.3:

     CPU time = 12.048000 
  • VS 2017:

     CPU time = 34.541000 

Long story short: ranked from fastest to slowest generated code for this particular problem, the compilers come out as:

  1. Intel on Linux (probably also on Windows)
  2. VS 2015.3
  3. GCC/Clang on Linux
  4. VS 2017
  5. MinGW-w64 GCC

Looking at the assembly will be the only way to get to the bottom of this, but properly analysing that is beyond me.
