How to get GCC to vectorize this loop

Question

I have this loop, where b2 is a float, x1 is an (Eigen C++) vector of float, and a1 and a0 are int.

for(int i=1;i<9;i++)
    b2+=a0*(float)0.5*(std::log(fabs(x1(a1+a0*(i-1))))+std::log(fabs(x1(a1+a0*i))));

GCC reports:

analyze_innermost: failed: evolution of base is not affine.

Is there a simple way to rewrite the loop so that GCC can vectorize it? (I'm compiling with all the unsafe options enabled; I'm doing this to learn.)

Edit:

x1 is an Eigen construct. I'm using GCC 4.8.1 with the -O3 flag.

Answer 1

Your example cannot easily be vectorized because you are not accessing the entries of x1 sequentially: consecutive iterations read elements that are a0 apart.

With sequential access, it could be vectorized like this:

ArrayXf x1;
b2 = (x1.segment(i,9).abs().log() + x1.segment(j,9).abs().log()).sum() * a0;
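
For comparison, here is a minimal plain C++ sketch of the same unit-stride case, assuming a0 == 1 so that the nine inputs are contiguous. The function name log_pair_sum_unit_stride and the use of std::vector in place of the Eigen type are assumptions made for illustration and are not part of the original answer. Whether a compiler actually vectorizes the log calls depends on the compiler version, the flags, and the math library in use.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the original answer): computes the same sum as
// the question's loop for the special case a0 == 1, reading the nine
// consecutive floats x[start] .. x[start + 8].
float log_pair_sum_unit_stride(const std::vector<float>& x, std::size_t start) {
    const float* p = x.data() + start;  // contiguous block, unit stride
    float acc = 0.0f;
    for (int i = 1; i < 9; ++i) {
        acc += 0.5f * (std::log(std::fabs(p[i - 1])) + std::log(std::fabs(p[i])));
    }
    return acc;
}
```

The caller would add the result to b2. Because the loop is a floating-point reduction, GCC generally needs permission to reassociate additions (for example via -ffast-math) before it will vectorize the accumulation.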

Answer 2

I would break this up into 3 loops:

float t1[9];
float t2[9];

for (int i = 0; i < 9; ++i)            // (1) - gather input terms
    t1[i] = x1(a1+a0*i);

for (int i = 0; i < 9; ++i)            // (2) - do expensive log/fabs operations
    t2[i] = std::log(std::fabs(t1[i])); //       with minimum redundancy

for (int i = 1; i < 9; ++i)            // (3) - wrap it all up
    b2 += a0*0.5f*(t2[i-1] + t2[i]);

I suspect that (1) may not be vectorizable (unless you have AVX2 with gather loads), but (2) and (3) have a reasonable chance.
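
The three-loop idea can be checked against the original loop with a self-contained sketch like the one below. It assumes std::vector as a stand-in for the Eigen vector, and the function names reference_sum and three_pass_sum are hypothetical. The last pass also uses the observation that every interior term appears in two adjacent pairs, so the pair sum reduces to the interior terms plus half of the two end terms; that simplification goes beyond the answer above.

```cpp
#include <cmath>
#include <vector>

// Reference: the loop from the question, with std::vector standing in for
// the Eigen vector (an assumption made for this sketch).
float reference_sum(const std::vector<float>& x1, int a0, int a1) {
    float b2 = 0.0f;
    for (int i = 1; i < 9; ++i)
        b2 += a0 * 0.5f * (std::log(std::fabs(x1[a1 + a0 * (i - 1)])) +
                           std::log(std::fabs(x1[a1 + a0 * i])));
    return b2;
}

// Three-pass variant: gather, transform, reduce.
float three_pass_sum(const std::vector<float>& x1, int a0, int a1) {
    float t1[9];
    float t2[9];
    for (int i = 0; i < 9; ++i)   // (1) strided gather into a contiguous buffer
        t1[i] = x1[a1 + a0 * i];
    for (int i = 0; i < 9; ++i)   // (2) each log/fabs is computed exactly once
        t2[i] = std::log(std::fabs(t1[i]));
    float inner = 0.0f;
    for (int i = 1; i < 8; ++i)   // (3) interior terms each count twice * 0.5
        inner += t2[i];
    return a0 * (inner + 0.5f * (t2[0] + t2[8]));  // end terms count once * 0.5
}
```

Both functions return the amount that the original loop adds to b2, so the caller would write b2 += three_pass_sum(x1, a0, a1). The two results can differ in the last few bits because the additions happen in a different order.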

2 answers

Answer 1
1 2014-03-21 16:00:08

Answer 2
1 Accepted 2014-03-21 23:08:42