Efficiency using bitwise operators
The requirement is as follows:
/* length must be >= 18 */
int calcActualLength(int length) {
    int remainder = (length - 18) % 8;
    if (remainder == 0)
        return length;
    return length + 8 - remainder;
}
Using a bitwise operator, I could refactor the first line to:
int remainder = (length - 2) & 7;
Can it be further optimized?
((length+5)&~7)+2
int calcActualLength(int length) {
    int remainder = (length - 18) % 8;
    if (remainder == 0)
        return length;
    return length + 8 - remainder;
}
==>
int HELPER_calcActualLength(int length) {
    int remainder = length % 8;
    if (remainder == 0)
        return length;
    return length + 8 - remainder;
}
int calcActualLength(int length) {
    return 18 + HELPER_calcActualLength(length - 18);
}
HELPER_calcActualLength() is semantically equivalent to ROUNDUP_8() when its argument is >= 0, which holds here because length must be >= 18.
A simpler ROUNDUP_8() can be:
#define ROUNDUP_8(x) (((x)+7)&~7)
int calcActualLength(int length) {
    return 18 + ROUNDUP_8(length - 18);
}
==> 2 + ROUNDUP_8(length - 18 + 16);
==> 2 + ROUNDUP_8(length - 2);
==> 2 + (((length - 2)+7)&~7)
==> ((length+5)&~7)+2
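The derivation above can be sanity-checked by brute force. The following is a minimal sketch, not part of the original answer: the names calcActualLengthFast and countMismatches are made up for illustration, and the check only covers lengths >= 18, as the question requires.

```c
#include <assert.h>

/* Original version from the question; length must be >= 18. */
static int calcActualLength(int length) {
    int remainder = (length - 18) % 8;
    if (remainder == 0)
        return length;
    return length + 8 - remainder;
}

/* Closed form from the derivation (hypothetical name). */
static int calcActualLengthFast(int length) {
    return ((length + 5) & ~7) + 2;
}

/* Counts the lengths in [18, limit] where the two versions disagree. */
static int countMismatches(int limit) {
    int mismatches = 0;
    assert(limit >= 18);
    for (int length = 18; length <= limit; ++length) {
        if (calcActualLength(length) != calcActualLengthFast(length))
            ++mismatches;
    }
    return mismatches;
}
```

Calling countMismatches(100000) from a main function should return 0. Values close to INT_MAX are deliberately left out, since length + 5 would overflow there.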
The original code produces the following 64-bit assembly when compiled with gcc -O3:
movl %edi, %eax
leal -18(%rax), %ecx
movl %ecx, %edx
sarl $31, %edx
shrl $29, %edx
addl %edx, %ecx
andl $7, %ecx
subl %edx, %ecx
je .L2
addl $8, %eax
subl %ecx, %eax
.L2:
rep
As suggested in the comments to your question, changing the argument to unsigned int allows for greater optimisation and results in the following assembly:
leal -18(%rdi), %edx
movl %edi, %eax
andl $7, %edx
je .L3
leal 8(%rdi), %eax
subl %edx, %eax
.L3:
rep
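The modified source is not shown here, so the following is only a guess at what the unsigned variant looks like. The function name and the unsigned return type are assumptions. The reason the assembly gets shorter is that the signed % in C truncates toward zero, so the compiler has to add fix-up code for negative operands (the sarl/shrl/addl/subl sequence in the first listing), whereas an unsigned % 8 is a plain and with 7.

```c
#include <assert.h>

/* Hypothetical unsigned variant; length must still be >= 18. */
static unsigned int calcActualLengthUnsigned(unsigned int length) {
    assert(length >= 18u);
    /* For unsigned operands, % 8 is the same as & 7. */
    unsigned int remainder = (length - 18u) % 8u;
    if (remainder == 0)
        return length;
    return length + 8u - remainder;
}
```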
Rounding up to a multiple of 8 can be performed by adding 7 and masking with ~7. It works like this: if the last three bits are not all zero, then adding 7 carries into the fourth bit; otherwise no carry occurs. The mask then clears the last three bits. So your function could be simplified to:
return (((length - 18) + 7) & ~7) + 18;
or simpler:
return ((length - 11) & ~7) + 18;
GCC compiles the last line to simply:
leal -11(%rdi), %eax
andl $-8, %eax
addl $18, %eax
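To make the carry argument concrete, here is a small sketch with a few worked values. The names roundUp8, simplifiedA, simplifiedB and checkRoundUp8 are made up for illustration, and non-negative inputs without overflow are assumed.

```c
#include <assert.h>

/* Round x up to the next multiple of 8 (assumes x >= 0, no overflow). */
static int roundUp8(int x) {
    return (x + 7) & ~7;
}

/* The two simplified return expressions, for length >= 18. */
static int simplifiedA(int length) { return (((length - 18) + 7) & ~7) + 18; }
static int simplifiedB(int length) { return ((length - 11) & ~7) + 18; }

static void checkRoundUp8(void) {
    assert(roundUp8(0) == 0);    /* low bits 000: 0 + 7 = 7, no carry, mask gives 0 */
    assert(roundUp8(1) == 8);    /* 1 + 7 = 8: carry into the fourth bit */
    assert(roundUp8(8) == 8);    /* 8 + 7 = 15, mask gives 8 */
    assert(roundUp8(13) == 16);  /* 13 + 7 = 20, mask gives 16 */
    /* Both simplified forms agree, since -18 + 7 == -11. */
    for (int length = 18; length < 1000; ++length)
        assert(simplifiedA(length) == simplifiedB(length));
}
```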
Note that the lea (Load Effective Address) instruction is often "abused" for its ability to compute simple linear combinations like reg1 + size*reg2 + offset.
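As an illustration of that shape, consider the hypothetical function below (the name and the constants are made up). An optimizing x86-64 compiler may turn the whole expression into a single lea with a scale factor of 4 and a displacement of 10, though the exact output depends on the compiler and flags, so check the generated assembly yourself (for example with gcc -O2 -S).

```c
/* Has the form reg1 + size*reg2 + offset, with size = 4 and offset = 10. */
static int linearCombination(int base, int index) {
    return base + 4 * index + 10;
}
```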