Why are kiss_fft's forward and inverse radix-4 calculations different, part 2?
Part 1 - why the code below checks st->inverse in the first place
The kiss_fft code has this branch inside a loop:
do {
    if(st->inverse) {
        Fout[m].r = scratch[5].r - scratch[4].i;
        Fout[m].i = scratch[5].i + scratch[4].r;
        Fout[m3].r = scratch[5].r + scratch[4].i;
        Fout[m3].i = scratch[5].i - scratch[4].r;
    }else{
        Fout[m].r = scratch[5].r + scratch[4].i;
        Fout[m].i = scratch[5].i - scratch[4].r;
        Fout[m3].r = scratch[5].r - scratch[4].i;
        Fout[m3].i = scratch[5].i + scratch[4].r;
    }
    ++Fout;
} while (--k); // Fout[] has k*4 elements.
Slightly reordered:
if(st->inverse) {
    Fout[m].r = scratch[5].r - scratch[4].i;
    Fout[m].i = scratch[5].i + scratch[4].r;
    Fout[m3].r = scratch[5].r + scratch[4].i;
    Fout[m3].i = scratch[5].i - scratch[4].r;
}else{
    Fout[m3].r = scratch[5].r - scratch[4].i;
    Fout[m3].i = scratch[5].i + scratch[4].r;
    Fout[m].r = scratch[5].r + scratch[4].i;
    Fout[m].i = scratch[5].i - scratch[4].r;
}
The two code blocks really differ only in their use of m and m3. But m and m3 are not changed inside the loop. Can I simply eliminate this inner-loop branch by swapping m and m3?
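/* The loop body below is the inverse form, so the indices are
   swapped for the forward case. swap() exchanges two integers. */
if(!st->inverse) { swap(&m, &m3); }
do {
    Fout[m].r = scratch[5].r - scratch[4].i;
    Fout[m].i = scratch[5].i + scratch[4].r;
    Fout[m3].r = scratch[5].r + scratch[4].i;
    Fout[m3].i = scratch[5].i - scratch[4].r;
    ++Fout;
} while (--k);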
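To check the idea, here is a minimal, self-contained sketch that compares the branching loop with a branch-free one. Everything in it is an assumption made for illustration: cpx stands in for kiss_fft_cpx, the s4 and s5 arrays stand in for the per-iteration values of scratch[4] and scratch[5], and store_branchy, store_swapped and variants_agree are hypothetical names. The unconditional body writes s5 + i*s4 to the first index, which is what the inverse branch does for Fout[m], so the indices have to be exchanged in the forward case. The sketch uses two separate output indices rather than swapping m and m3 themselves, which would matter if the surrounding loop also reads inputs through m and m3.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { float r, i; } cpx;  /* assumed stand-in for kiss_fft_cpx */

/* Reference version: the direction is tested on every iteration. */
static void store_branchy(cpx *Fout, size_t m, size_t m3, size_t k,
                          const cpx *s4, const cpx *s5, int inverse)
{
    assert(k > 0);  /* do/while with --k needs at least one iteration */
    do {
        if (inverse) {
            Fout[m].r  = s5->r - s4->i;
            Fout[m].i  = s5->i + s4->r;
            Fout[m3].r = s5->r + s4->i;
            Fout[m3].i = s5->i - s4->r;
        } else {
            Fout[m].r  = s5->r + s4->i;
            Fout[m].i  = s5->i - s4->r;
            Fout[m3].r = s5->r - s4->i;
            Fout[m3].i = s5->i + s4->r;
        }
        ++Fout; ++s4; ++s5;
    } while (--k);
}

/* Branch-free loop: the direction only decides, once, which output
   slot receives s5 + i*s4 and which receives s5 - i*s4. */
static void store_swapped(cpx *Fout, size_t m, size_t m3, size_t k,
                          const cpx *s4, const cpx *s5, int inverse)
{
    size_t plus_i  = inverse ? m  : m3;  /* slot for s5 + i*s4 */
    size_t minus_i = inverse ? m3 : m;   /* slot for s5 - i*s4 */
    assert(k > 0);
    do {
        Fout[plus_i].r  = s5->r - s4->i;
        Fout[plus_i].i  = s5->i + s4->r;
        Fout[minus_i].r = s5->r + s4->i;
        Fout[minus_i].i = s5->i - s4->r;
        ++Fout; ++s4; ++s5;
    } while (--k);
}

/* Returns 1 if both versions write identical outputs for this direction. */
static int variants_agree(int inverse)
{
    enum { K = 4 };
    cpx s4[K], s5[K], a[4 * K] = {{0}}, b[4 * K] = {{0}};
    for (size_t j = 0; j < K; ++j) {
        s4[j].r = (float)j + 1.0f;   s4[j].i = 2.0f * (float)j - 3.0f;
        s5[j].r = 0.5f * (float)j;   s5[j].i = 7.0f - (float)j;
    }
    store_branchy(a, K, 3 * K, K, s4, s5, inverse);
    store_swapped(b, K, 3 * K, K, s4, s5, inverse);
    for (size_t j = 0; j < 4 * K; ++j)
        if (a[j].r != b[j].r || a[j].i != b[j].i) return 0;
    return 1;
}
```

Calling variants_agree(0) and variants_agree(1) exercises the forward and inverse directions respectively.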
I can indeed use that optimization. It is not necessary, however, with current-generation compilers that can target AVX. They will eliminate that branch as well, using vpcmpeqd and vblendvps.
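As a rough scalar picture of what such a compare-and-blend sequence amounts to, the sketch below computes both candidate results and then selects one per output slot, with no branch around the stores. This only illustrates the idea and is not actual compiler output. The names cpx and store_select are hypothetical, and s4 and s5 again stand in for the per-iteration scratch[4] and scratch[5].

```c
#include <assert.h>
#include <stddef.h>

typedef struct { float r, i; } cpx;  /* assumed stand-in for kiss_fft_cpx */

/* Compute both candidates, then pick one per output slot. A vector
   blend performs the same kind of selection on several lanes at once,
   driven by a mask produced by a compare. */
static void store_select(cpx *Fout, size_t m, size_t m3, size_t k,
                         const cpx *s4, const cpx *s5, int inverse)
{
    assert(k > 0);  /* do/while with --k needs at least one iteration */
    do {
        cpx plus  = { s5->r - s4->i, s5->i + s4->r };  /* s5 + i*s4 */
        cpx minus = { s5->r + s4->i, s5->i - s4->r };  /* s5 - i*s4 */
        Fout[m]  = inverse ? plus  : minus;
        Fout[m3] = inverse ? minus : plus;
        ++Fout; ++s4; ++s5;
    } while (--k);
}
```

Whether a given compiler emits a blend, hoists the test out of the loop, or keeps the branch depends on the compiler, its flags and the target, so inspecting the generated assembly is the only way to be sure.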