GCC can't vectorize this simple loop ('number of iterations cannot be computed') yet managed a similar one in the same code?
So, I have C++ code with this loop:
for(i=0;i<(m-1);i++) N4[i]=(i+m-1-Rigta[i]-1-N3[i])/N0;
All the quantities involved are ints.
From GCC's vectorization report I get:
babar.cpp:233: note: ===== analyze_loop_nest =====
babar.cpp:233: note: === vect_analyze_loop_form ===
babar.cpp:233: note: === get_loop_niters ===
babar.cpp:233: note: not vectorized: number of iterations cannot be computed.
babar.cpp:233: note: bad loop form.
I'm wondering why 'the number of iterations cannot be computed'!? FWIW, m is declared as const int& m.
What makes this even more puzzling is that just above, in the same code, I have:
for(i=1;i<(m-1);i++) a2[i]=(x[i]+x[i+m-1])*0.5f;
and the loop above gets vectorized just fine (here a2 and x are floats).
I'm compiling with the
-Ofast -ftree-vectorizer-verbose=10 -mtune=native -march=native
flags on GCC 4.8.1 on an i7.
Thanks in advance,
Following @nodakai's idea, I tried this:
const int mm = m;
for(i=0;i<(mm-1);i++) N4[i]=(i+mm-1-Rigta[i]-1-N3[i])/N0;
This didn't quite get me there:
babar.cpp:234: note: not vectorized: relevant stmt not supported: D.55255_812 = D.55254_811 / N0_34;
babar.cpp:234: note: bad operation or unsupported loop bound.
So of course, I tried:
const int mm=m;
const float G0=1.0f/(float)N0;
for(i=0;i<(mm-1);i++) N4[i]=(i+mm-1-Rigta[i]-1-N3[i])*G0;
which then produced:
babar.cpp:235: note: LOOP VECTORIZED.
(i.e. success). Oddly enough, the mm seems necessary(?!).
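A minimal sketch of the two changes from the edit above. The function names (fill_exact, fill_reciprocal) and the raw-pointer parameters are hypothetical and not part of the original code. A likely explanation for why mm matters: m is a const int&, and the loop stores through an int pointer, so the compiler may be unable to rule out that a store to N4[i] changes the int that m refers to. The float loop does not have this problem, because a store to a float cannot legally modify an int. Copying m into a local removes the doubt. Also note that multiplying by a float reciprocal is not guaranteed to match integer division for every input, since 1.0f/N0 is rounded; the two agree when, for instance, N0 is a power of two and the values are small. Whether either loop actually vectorizes still depends on the compiler version and flags.

```cpp
#include <cassert>

// Hypothetical stand-in for the loop in the question; m is deliberately
// taken by const reference, as in the original post.
void fill_exact(const int& m, const int* Rigta, const int* N3,
                int N0, int* N4)
{
    assert(N0 != 0);
    // Local copy: no store to N4[i] can change the loop bound any more.
    const int mm = m;
    for (int i = 0; i < mm - 1; ++i)
        N4[i] = (i + mm - 1 - Rigta[i] - 1 - N3[i]) / N0;  // integer division
}

// Same loop, with the division replaced by a float multiplication.
// Assumption: an approximate result is acceptable, since 1.0f / N0 is rounded.
void fill_reciprocal(const int& m, const int* Rigta, const int* N3,
                     int N0, int* N4)
{
    assert(N0 != 0);
    const int mm = m;
    const float G0 = 1.0f / static_cast<float>(N0);
    for (int i = 0; i < mm - 1; ++i)
        // The float-to-int conversion truncates toward zero, like int division.
        N4[i] = static_cast<int>((i + mm - 1 - Rigta[i] - 1 - N3[i]) * G0);
}
```

If exact integer results matter, it seems safer to compare both versions over the real input range before switching to the reciprocal form.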
Can you try these two steps and see if there are any differences?
1. Put const int mm = m; just before the loop.
2. Replace all occurrences of m with mm.

Your loop bounds probably do not divide evenly by the vectorization factor. Note that the loop that vectorizes runs one iteration fewer than the one that does not.
As a simple test to see if this is the case, you can change the starting point of your non-vectorized loop to 1 and handle the 0 case before the loop, like:
N4[0] = (m - 1 - Rigta[0] - 1 - N3[0]) / N0;
for(i=1; i<(m-1); i++) {
    N4[i] = (i + m - 1 - Rigta[i] - 1 - N3[i]) / N0;
}
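One caveat with the peeled version: it writes N4[0] unconditionally, whereas the original loop does nothing when m - 1 <= 0. A small sketch of a guarded variant, next to the plain loop so the two can be checked against each other; the function names (fill_plain, fill_peeled) and the pointer parameters are assumptions made for illustration only:

```cpp
#include <cassert>

// The loop from the question, unchanged in behaviour.
void fill_plain(int m, const int* Rigta, const int* N3, int N0, int* N4)
{
    assert(N0 != 0);
    for (int i = 0; i < m - 1; ++i)
        N4[i] = (i + m - 1 - Rigta[i] - 1 - N3[i]) / N0;
}

// First iteration peeled off, so the remaining loop starts at i = 1.
void fill_peeled(int m, const int* Rigta, const int* N3, int N0, int* N4)
{
    assert(N0 != 0);
    if (m - 1 <= 0)
        return;  // the original loop would run zero times here
    N4[0] = (m - 1 - Rigta[0] - 1 - N3[0]) / N0;  // the i = 0 case
    for (int i = 1; i < m - 1; ++i)
        N4[i] = (i + m - 1 - Rigta[i] - 1 - N3[i]) / N0;
}
```

Running both on the same input and comparing N4 is a cheap way to confirm the transformation did not change results before looking at the vectorization report again.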