Trying to translate a C function to x86_64 AT&T assembly
I've been trying to translate this function to assembly:
void foo (int a[], int n) {
    int i;
    int s = 0;
    for (i=0; i<n; i++) {
        s += a[i];
        if (a[i] == 0) {
            a[i] = s;
            s = 0;
        }
    }
}
But something is going wrong.
This is what I've done so far:
.section .text
.globl foo
foo:
.L1:
    pushq %rbp
    movq %rsp, %rbp
    subq $16, %rsp
    movl $0, -16(%rbp) /*s*/
    movl $0, -8(%rbp) /*i*/
    jmp .L2
.L2:
    cmpl -8(%rbp), %esi
    jle .L4
    leave
    ret
.L3:
    addl $1, -8(%rbp)
    jmp .L2
.L4:
    movl -8(%rbp), %eax
    imull $4, %eax
    movslq %eax, %rax
    addq %rdi, %rax
    movl (%rax), %eax
    addl %eax, -16(%rbp)
    cmpl $0, %eax
    jne .L3
    /* if */
    leaq (%rax), %rdx
    movl -16(%rbp), %eax
    movl %eax, (%rdx)
    movl $0, -16(%rbp)
    jmp .L3
I am compiling the .s module together with a .c module that calls it with, for example, int nums[5] = {65, 23, 11, 0, 34}, and I'm getting back the same array instead of {65, 23, 11, 99, 34}.
Could someone help me?
Presumably you have a compiler that can generate AT&T syntax. It might be more instructive to look at the assembly output the compiler generates. Here's my re-formulation of your demo:
#include <stdio.h>
void foo (int a[], int n)
{
    for (int s = 0, i = 0; i < n; i++)
    {
        if (a[i] != 0)
            s += a[i];
        else
            a[i] = s, s = 0;
    }
}
int main (void)
{
    int nums[] = {65, 23, 11, 0, 34};
    int size = sizeof(nums) / sizeof(int);
    foo(nums, size);
    for (int i = 0; i < size; i++)
        fprintf(stdout, i < (size - 1) ? "%d, " : "%d\n", nums[i]);
    return (0);
}
Code compiled without optimizations is typically harder to work through than optimized code, since it reloads every variable from memory and spills every result back to it. You won't learn much from it if you're investing time in learning how to write efficient assembly.
Compiling on the Godbolt compiler explorer with -O2 optimizations yields much more efficient code; it's also useful for cutting out unnecessary directives, labels, etc., that would be visual noise in this case.
In my experience, -O2 optimizations are clever enough to make you rethink your use of registers, refactoring, etc. -O3 can sometimes optimize too aggressively (unrolling loops, vectorizing, etc.) to follow easily.
Finally, for the case you have presented, there's a good compromise: -Os, which enables many of the optimizations of -O2, but not at the expense of increased code size. I'll paste the assembly here just for comparative purposes:
foo:
    xorl %eax, %eax
    xorl %ecx, %ecx
.L2:
    cmpl %eax, %esi
    jle .L7
    movl (%rdi,%rax,4), %edx
    testl %edx, %edx
    je .L3
    addl %ecx, %edx
    jmp .L4
.L3:
    movl %ecx, (%rdi,%rax,4)
.L4:
    incq %rax
    movl %edx, %ecx
    jmp .L2
.L7:
    ret
Remember that the calling convention passes the pointer (a) in %rdi and the count (n) in %esi, the low 32 bits of %rsi, since n is an int. This is the x86-64 System V calling convention. Notice that your code never uses an indexed addressing mode such as (%rdi,%rax,4): it computes the address of a[i] by hand in %rax, and then movl (%rax), %eax overwrites that address with the loaded value, so the later leaq (%rax), %rdx no longer points at a[i]. It's definitely worth stepping through the code, even with pen and paper if it helps, to understand the branch conditions and how reading and writing are performed on element a[i].
Curiously, using the inner loop of your code:
    s += a[i];
    if (a[i] == 0)
        a[i] = s, s = 0;
appears to generate more efficient code with -Os than the inner loop I used:
foo:
    xorl %eax, %eax
    xorl %edx, %edx
.L2:
    cmpl %eax, %esi
    jle .L6
    movl (%rdi,%rax,4), %ecx
    addl %ecx, %edx
    testl %ecx, %ecx
    jne .L3
    movl %edx, (%rdi,%rax,4)
    xorl %edx, %edx
.L3:
    incq %rax
    jmp .L2
.L6:
    ret
A reminder for me to keep things simple!