GCC can't vectorize this simple loop ('number of iterations cannot be computed') yet managed a similar one in the same code?

So, I have C++ code with this loop:

for(i=0;i<(m-1);i++)    N4[i]=(i+m-1-Rigta[i]-1-N3[i])/N0;

All the quantities involved are ints. From GCC's vectorization report I get:

babar.cpp:233: note: ===== analyze_loop_nest =====
babar.cpp:233: note: === vect_analyze_loop_form ===
babar.cpp:233: note: === get_loop_niters ===
babar.cpp:233: note: not vectorized: number of iterations cannot be computed.
babar.cpp:233: note: bad loop form.

I'm wondering why 'the number of iterations cannot be computed'!? FWIW, m is declared as const int& m. What makes this even more puzzling is that just above, in the same code, I have:

for(i=1;i<(m-1);i++)    a2[i]=(x[i]+x[i+m-1])*0.5f;

and the loop above gets vectorized just fine (here a2 and x are floats). I'm compiling with the

-Ofast -ftree-vectorizer-verbose=10 -mtune=native -march=native

flags on GCC 4.8.1 on an i7.

Thanks in advance,

Edit:

Following @nodakai's idea, I tried this:

const int mm = m;
for(i=0;i<(mm-1);i++)    N4[i]=(i+mm-1-Rigta[i]-1-N3[i])/N0;

This didn't quite get me there:

babar.cpp:234: note: not vectorized: relevant stmt not supported: D.55255_812 = D.55254_811 / N0_34;
babar.cpp:234: note: bad operation or unsupported loop bound.

So of course, I tried:

const int mm=m;
const float G0=1.0f/(float)N0;
for(i=0;i<(mm-1);i++)   N4[i]=(i+mm-1-Rigta[i]-1-N3[i])*G0;

which then produced:

babar.cpp:235: note: LOOP VECTORIZED.

(i.e., success). Oddly enough, the mm seems necessary(?!).

Can you try these two steps and see if there are any differences?

  1. Insert const int mm = m; just before the loop.
  2. Replace all the occurrences of m with mm.

Your loop's trip count is probably not a multiple of the vectorization factor. Note that the loop that vectorizes runs one fewer iteration than the one that does not. As a simple test of whether this is the case, you can change the starting index of your non-vectorized loop to 1 and handle the 0 case before the loop, like:

N4[0] = (m - 1 - Rigta[0] - 1 - N3[0]) / N0;
for(i=1; i<(m-1); i++) {
    N4[i]=(i + m - 1 - Rigta[i] - 1 - N3[i])/N0;
}
