
Fortran vectorize log inside a for loop

Here is a minimal working example.

program test
    implicit none

    double precision:: c1,c2,rate
    integer::ci,cj,cr,cm,i
    integer,parameter::max_iter=10000000 !10^7

    c1=0.0d+0

    CALL system_clock(count_rate=cr)
    CALL system_clock(count_max=cm)
    rate = REAL(cr)

    CALL SYSTEM_CLOCK(ci)
    do i=1,max_iter
        c1=c1+log(DBLE(i))
    end do
    CALL SYSTEM_CLOCK(cj)
    WRITE(*,*) "system_clock : ",(cj - ci)/rate

    print*, c1
end program test

When I compile with gfortran -Ofast -march=core-avx2 -fopt-info-vec-optimized, the loop containing the log function does not get vectorized. I have also tried -O3, but the result does not change.

But if I write the equivalent C++ code,

#include <iostream>
#include <chrono>
#include <cmath>
#include <cstdio>   // for printf
using namespace std;
using namespace std::chrono;

int main()
{
    double c1=0;
    const int max_iter=10000000; // 10^7

    auto start = high_resolution_clock::now();
    for(int i=1;i<=max_iter;i++)
    {
        c1 += log(i);
    }
    auto stop = high_resolution_clock::now();

    auto duration = duration_cast<milliseconds>(stop - start); 
    cout << duration.count() << " ms"<<'\n'; 

    printf("%0.15f\n",c1);

    return 0;
}

and compile it with g++ -Ofast -march=core-avx2 -fopt-info-vec-optimized, the for loop gets vectorized and runs almost 10 times faster.

What should I do to make the Fortran loop vectorize?

The problem with vectorizing loops that call math functions (like log) is that the compiler has to be taught the semantics of the vectorized math functions. If you look at the assembler output, you can see that the Fortran version calls the "normal" scalar function (a line like call log), whereas your C++ version calls the vectorized version (call _ZGVdN4v___log_finite). There has been some work on making GFortran understand the glibc vector math library (libmvec), but I'm not sure what the current status is. See the thread starting at https://gcc.gnu.org/legacy-ml/gcc/2018-04/msg00062.html and continuing in June 2018 starting at https://gcc.gnu.org/legacy-ml/gcc/2018-06/msg00167.html for more details.
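
To make the difference between the two calls concrete, here is a rough sketch in C++ of what a 4-wide vectorized version of the loop does conceptually. The names log4, sum_log_lanes and sum_log_scalar are made up for illustration; log4 only stands in for a vector routine such as the one behind the _ZGVdN4v_ prefix and computes each lane with the scalar std::log, so this shows the shape of the transformation rather than the actual libmvec code or its speed.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

// Hypothetical stand-in for a 4-lane vector log. In the x86-64 vector
// function naming scheme, the prefix _ZGVdN4v_ reads as: d = AVX2,
// N = unmasked, 4 = four lanes, v = one vector argument.
// A real vector routine works on a SIMD register; here every lane is
// computed with scalar std::log so the sketch stays portable.
std::array<double, 4> log4(const std::array<double, 4>& x)
{
    std::array<double, 4> r = {std::log(x[0]), std::log(x[1]),
                               std::log(x[2]), std::log(x[3])};
    return r;
}

// Sum log(1) + ... + log(n) the way a 4-wide vectorized loop would:
// four independent partial sums (one per lane), combined at the end.
double sum_log_lanes(int n)
{
    std::array<double, 4> acc = {0.0, 0.0, 0.0, 0.0};
    int i = 1;
    for (; i + 3 <= n; i += 4) {
        const std::array<double, 4> x = {double(i), double(i + 1),
                                         double(i + 2), double(i + 3)};
        const std::array<double, 4> v = log4(x);
        for (std::size_t k = 0; k < 4; ++k)
            acc[k] += v[k];
    }
    double total = (acc[0] + acc[1]) + (acc[2] + acc[3]);
    for (; i <= n; ++i)          // scalar remainder loop
        total += std::log(double(i));
    return total;
}

// Plain scalar reference, equivalent to the loop in the question.
double sum_log_scalar(int n)
{
    double total = 0.0;
    for (int i = 1; i <= n; ++i)
        total += std::log(double(i));
    return total;
}
```

Calling sum_log_lanes(10000000) should agree with sum_log_scalar(10000000) up to floating-point rounding. The additions happen in a different order, which is why the compiler needs permission to reassociate floating-point sums (as -Ofast gives via -ffast-math) before it can vectorize this reduction at all. The compiler can only emit the vector call if it has been told that a vector variant of log exists, which is the part that was missing for GFortran.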


 