Why are there three leal instructions for this IA32 assembly code?
I compiled this C function:
int calc(int x, int y, int z) {
return x + 3*y + 19*z;
}
And I got this in calc.s, which I have annotated with what I think is happening:
.file "calc.c"
.text
.globl calc
.type calc, @function
calc:
pushl %ebp //Save the caller's frame pointer
movl %esp, %ebp //Move stack pointer into %ebp
movl 12(%ebp), %eax //Move y into %eax
movl 16(%ebp), %ecx //Move z into %ecx
leal (%eax,%eax,2), %eax //%eax = 3*y
addl 8(%ebp), %eax //%eax = x+3y
leal (%ecx,%ecx,8), %edx // ?
leal (%ecx,%edx,2), %edx // ?
addl %edx, %eax //%eax = (x+3*y)+(19*z)
popl %ebp //Restore the caller's frame pointer
ret
.size calc, .-calc
.ident "GCC: (Ubuntu 4.3.3-5ubuntu4) 4.3.3"
.section .note.GNU-stack,"",@progbits
I understand everything up to the last two leal instructions. Why do you need two leal instructions for 19*z, whereas 3*y is accomplished in one instruction?
leal is a way to perform a multiplication by a small constant on the cheap, if the constant is a power of two plus one. The idea is that leal without an offset is equivalent to "Reg1 = Reg2 + Reg3*Scale". If Reg2 and Reg3 happen to match, that means "Reg1 = Reg2*(Scale+1)".
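The "Reg1 = Reg2 + Reg3*Scale" rule can be sketched in C. This is only a model of the arithmetic, not of how the CPU implements it; the helper names `lea`, `times3`, `times5` and `times9` are made up for illustration, and unsigned 32-bit values stand in for the registers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper modelling `leal (base,index,scale), dst` with no
   displacement: dst = base + index * scale.
   The instruction encoding only allows a scale of 1, 2, 4 or 8. */
static uint32_t lea(uint32_t base, uint32_t index, uint32_t scale)
{
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    return base + index * scale;
}

/* Using the same register as base and index multiplies it by scale + 1. */
static uint32_t times3(uint32_t r) { return lea(r, r, 2); }
static uint32_t times5(uint32_t r) { return lea(r, r, 4); }
static uint32_t times9(uint32_t r) { return lea(r, r, 8); }
```

With this model, `times3(y)` corresponds to `leal (%eax,%eax,2), %eax` in the listing.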
leal only supports scale factors up to 8, so to multiply by 19, you need two.
The effect of
leal (%eax,%eax,2), %eax
is:
eax = eax + eax*2
which is to say, multiplication by three.
The last two leal instructions together perform multiplication by 19:
leal (%ecx,%ecx,8), %edx // edx = ecx+ecx*8
leal (%ecx,%edx,2), %edx // edx = ecx+edx*2 (but edx is already z*9)
leal (%ecx,%ecx,8), %edx # edx = ecx + 8*ecx = 9*ecx = 9 * z
leal (%ecx,%edx,2), %edx
# edx = ecx + 2*edx = ecx + 2 * (ecx + 8*ecx) = z + 2 * 9 * z = 19 * z
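To see how the whole function fits together, here is a register-by-register trace written in C. It is a sketch under a few assumptions: the local variables merely mirror the register names in the listing, the names `calc_traced` and `check_calc` are invented for this example, and unsigned arithmetic is used so that wraparound is well defined.

```c
#include <assert.h>
#include <stdint.h>

/* Model of the compiled calc(); each statement mirrors one instruction. */
static int32_t calc_traced(int32_t x, int32_t y, int32_t z)
{
    uint32_t eax = (uint32_t)y;   /* movl 12(%ebp), %eax              */
    uint32_t ecx = (uint32_t)z;   /* movl 16(%ebp), %ecx              */
    uint32_t edx;

    eax = eax + eax * 2;          /* leal (%eax,%eax,2), %eax  -> 3*y */
    eax += (uint32_t)x;           /* addl 8(%ebp), %eax    -> x + 3*y */
    edx = ecx + ecx * 8;          /* leal (%ecx,%ecx,8), %edx  -> 9*z */
    edx = ecx + edx * 2;          /* leal (%ecx,%edx,2), %edx -> 19*z */
    eax += edx;                   /* addl %edx, %eax                  */

    /* Converting back to a signed value is implementation-defined for
       out-of-range results; on GCC it wraps as two's complement. */
    return (int32_t)eax;
}

/* Hypothetical self-check against the original C expression
   (meant for inputs small enough not to overflow int). */
static void check_calc(int32_t x, int32_t y, int32_t z)
{
    assert(calc_traced(x, y, z) == x + 3 * y + 19 * z);
}
```

For example, `calc_traced(1, 2, 3)` goes through 9*3 = 27 and 3 + 2*27 = 57 on the way to 1 + 6 + 57 = 64.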
The reason for this is that the lea instruction uses adds and bit shifts, and is typically faster than using mul for integer multiplication. lea is limited, though, to multiplication factors of 1, 2, 4 and 8, hence the two instructions.
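The shift-and-add view can be written out explicitly. The following is a minimal sketch, assuming unsigned 32-bit values so that the shifts are well defined; `mul19_shift_add` and `check_mul19` are illustrative names, not anything produced by the compiler.

```c
#include <assert.h>
#include <stdint.h>

/* 19*z using only shifts and adds, one line per leal instruction. */
static uint32_t mul19_shift_add(uint32_t z)
{
    uint32_t nine_z = z + (z << 3);   /* leal (%ecx,%ecx,8), %edx : z + 8*z */
    return z + (nine_z << 1);         /* leal (%ecx,%edx,2), %edx : z + 2*9*z */
}

/* Hypothetical self-check: the result matches an ordinary multiply
   (both sides wrap modulo 2^32). */
static void check_mul19(uint32_t z)
{
    assert(mul19_shift_add(z) == 19u * z);
}
```

A scale of 8 is a left shift by 3 and a scale of 2 is a left shift by 1, which is why only power-of-two scales are available.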
lea serves a double purpose: it calculates addresses, but it can also be used for arithmetic with some constraints, as you observe in your code. Two instructions are needed because the scalar multiplier of lea is limited to 1, 2, 4 or 8, which means that multiplying by 19 requires two lea instructions:
[...]The scalar multiplier is limited to constant values 1, 2, 4, or 8 for byte, word, double word or quad word offsets respectively. This by itself allows for multiplication of a general register by constant values 2, 3, 4, 5, 8 and 9,[...]
so in your case you have:
leal (%ecx,%ecx,8), %edx // edx = ecx + ecx*8 which is z*8 + z = z*9
leal (%ecx,%edx,2), %edx // edx = ecx + edx*2 which gives us (z*9)*2 + z
// for a total of 19z
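The list of single-instruction multipliers in the quoted passage can be checked with a small C sketch. It assumes a lea with no displacement, using either the index register alone (multiplier = scale) or the same register as both base and index (multiplier = scale + 1). The function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* True if multiplying a register by k fits in one displacement-free lea. */
static bool single_lea_multiplier(unsigned k)
{
    static const unsigned scales[] = {1, 2, 4, 8};
    for (unsigned i = 0; i < 4; i++) {
        if (k == scales[i] || k == scales[i] + 1)
            return true;
    }
    return false;
}

/* Hypothetical self-check: 9 needs one lea, 19 does not fit in one. */
static void check_quoted_list(void)
{
    assert(single_lea_multiplier(9));
    assert(!single_lea_multiplier(19));
}
```

Under these assumptions the function accepts 1, 2, 3, 4, 5, 8 and 9, which is the quoted list plus the trivial multiplier 1. Since 19 is not among them, it is built as 1 + 2*9 with two instructions.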