
Why does GCC not auto-vectorize this loop?

I am attempting to optimize a loop that accounts for a lot of my program's computation time.

But when I turn on auto-vectorization with -O3 -ffast-math -ftree-vectorizer-verbose=6, GCC reports that it cannot vectorize the loop.

I am using GCC 4.4.5.

The code:

/// Find the point in the path with the largest v parameter
void prediction::find_knife_edge(
    const float * __restrict__ const elevation_path,
    float * __restrict__ const diff_path,
    const float path_res,
    const unsigned a,
    const unsigned b,
    const float h_a,
    const float h_b,
    const float f,
    const float r_e
) const
{
    float wavelength = (speed_of_light * 1e-6f) / f;

    float d_ab = path_res * static_cast<float>(b - a);

    for (unsigned n = a + 1; n <= b - 1; n++)
    {
        float d_an = path_res * static_cast<float>(n - a);
        float d_nb = path_res * static_cast<float>(b - n);

        float h = elevation_path[n] + (d_an * d_nb) / (2.0f * r_e) - (h_a * d_nb + h_b * d_an) / d_ab;
        float v = h * std::sqrt((2.0f * d_ab) / (wavelength * d_an * d_nb));

        diff_path[n] = v;
    }
}

The messages from GCC:

note: not vectorized: number of iterations cannot be computed.
note: not vectorized: unhandled data-ref 

The page about auto-vectorization ( http://gcc.gnu.org/projects/tree-ssa/vectorization.html ) states that GCC supports unknown loop bounds.

If I replace the for statement with

for (unsigned n = 0; n <= 100; n++)

then GCC vectorizes it.

What am I doing wrong?

The lack of detailed documentation on exactly what these messages mean and on the ins and outs of GCC auto-vectorization is rather annoying.

EDIT:

Thanks to David, I changed the loop to this:

 for (unsigned n = a + 1; n < b; n++)
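
A likely reason the original bound defeats the trip-count analysis (this is an interpretation, not something the GCC message spells out): in unsigned arithmetic, b - 1 wraps to UINT_MAX when b is 0, so n <= b - 1 is then true for every n and the loop never ends. With n < b the iteration count is always finite. The sketch below illustrates the difference; the helper names count_le and count_lt and the artificial cap are made up for this example.

```cpp
// Counts iterations of `for (n = a + 1; n <= b - 1; n++)`.
// `cap` is an artificial safety limit so the demo terminates even
// when the loop condition can never become false.
unsigned count_le(unsigned a, unsigned b, unsigned cap)
{
    unsigned count = 0;
    for (unsigned n = a + 1; n <= b - 1 && count < cap; n++)
        count++;
    return count;
}

// Counts iterations of `for (n = a + 1; n < b; n++)` with the same cap.
unsigned count_lt(unsigned a, unsigned b, unsigned cap)
{
    unsigned count = 0;
    for (unsigned n = a + 1; n < b && count < cap; n++)
        count++;
    return count;
}
```

For ordinary inputs such as a = 2, b = 6 both helpers count the same three iterations. For b = 0 the first one runs until it hits the cap, while the second does not iterate at all.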

Now GCC attempts to vectorize the loop but reports the following:

 note: not vectorized: unhandled data-ref
 note: Alignment of access forced using peeling.
 note: Vectorizing an unaligned access.
 note: vect_model_induction_cost: inside_cost = 1, outside_cost = 2 .
 note: not vectorized: relevant stmt not supported: D.76777_65 = (float) n_34;

What does "D.76777_65 = (float) n_34;" mean?

I may have slightly botched the details, but this is how you need to restructure your loop to get it to vectorize. The trick is to precompute the number of iterations and iterate from 0 to one short of that number. Do not change the for statement. You may need to fix the two lines before it and the two lines at the top of the loop. They're approximately right. ;)

const unsigned it = (b - a) - 1;   // number of interior points a+1 .. b-1 (assumes b > a)
const unsigned diff = b - a;
for (unsigned n = 0; n < it; n++)
{
    // n counts from 0, so the original index is a + 1 + n
    float d_an = path_res * static_cast<float>(n + 1);
    float d_nb = path_res * static_cast<float>(diff - 1 - n);

    float h = elevation_path[a + 1 + n] + (d_an * d_nb) / (2.0f * r_e) - (h_a * d_nb + h_b * d_an) / d_ab;
    float v = h * std::sqrt((2.0f * d_ab) / (wavelength * d_an * d_nb));

    diff_path[a + 1 + n] = v;
}
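
To check a restructuring like the one above against the original loop, a sketch along these lines may help. It assumes free functions rather than the prediction member, takes wavelength as a parameter instead of deriving it, and all three function names are made up for illustration. It also swaps the unsigned counter for a signed int, on the assumption that the statement in the vectorizer note, D.76777_65 = (float) n_34, is the unsigned-to-float conversion of n and that a signed conversion is easier for the vectorizer to handle on SSE targets. Whether GCC 4.4.5 actually vectorizes this version has not been verified here.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Reference: the loop from the question (with n < b) as a free function.
// Hypothetical signature; `wavelength` is passed in directly.
void knife_edge_original(const float* __restrict__ elevation_path,
                         float* __restrict__ diff_path,
                         float path_res, unsigned a, unsigned b,
                         float h_a, float h_b, float wavelength, float r_e)
{
    const float d_ab = path_res * static_cast<float>(b - a);
    for (unsigned n = a + 1; n < b; n++)
    {
        const float d_an = path_res * static_cast<float>(n - a);
        const float d_nb = path_res * static_cast<float>(b - n);
        const float h = elevation_path[n] + (d_an * d_nb) / (2.0f * r_e)
                        - (h_a * d_nb + h_b * d_an) / d_ab;
        diff_path[n] = h * std::sqrt((2.0f * d_ab) / (wavelength * d_an * d_nb));
    }
}

// Restructured: trip count computed up front, zero-based signed counter.
void knife_edge_restructured(const float* __restrict__ elevation_path,
                             float* __restrict__ diff_path,
                             float path_res, unsigned a, unsigned b,
                             float h_a, float h_b, float wavelength, float r_e)
{
    if (b <= a || b - a < 2) return;             // no interior points
    const int diff = static_cast<int>(b - a);    // assumes b - a fits in int
    const int count = diff - 1;                  // interior points a+1 .. b-1
    const float d_ab = path_res * static_cast<float>(diff);
    const float* elev = elevation_path + a + 1;  // first interior point
    float* out = diff_path + a + 1;
    for (int n = 0; n < count; n++)
    {
        const float d_an = path_res * static_cast<float>(n + 1);
        const float d_nb = path_res * static_cast<float>(count - n);
        const float h = elev[n] + (d_an * d_nb) / (2.0f * r_e)
                        - (h_a * d_nb + h_b * d_an) / d_ab;
        out[n] = h * std::sqrt((2.0f * d_ab) / (wavelength * d_an * d_nb));
    }
}

// Runs both versions on a synthetic path (made-up values) and returns
// the largest absolute difference between their outputs.
float max_abs_difference(unsigned a, unsigned b)
{
    std::vector<float> elevation(b + 1);
    for (unsigned i = 0; i <= b; i++)
        elevation[i] = 50.0f + 0.5f * static_cast<float>(i);
    std::vector<float> ref(b + 1, 0.0f), out(b + 1, 0.0f);
    knife_edge_original(elevation.data(), ref.data(), 30.0f, a, b,
                        100.0f, 120.0f, 0.3f, 8.5e6f);
    knife_edge_restructured(elevation.data(), out.data(), 30.0f, a, b,
                            100.0f, 120.0f, 0.3f, 8.5e6f);
    float worst = 0.0f;
    for (unsigned i = 0; i <= b; i++)
        worst = std::max(worst, std::fabs(ref[i] - out[i]));
    return worst;
}
```

Calling max_abs_difference(3, 40), for example, should return a value at or very near zero if the index and distance offsets are right. The same harness can be pointed at any other variant of the loop.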
