
Hello World using x86 assembler on Mac OS X

I am trying to dive into some x86 assembly programming on my Mac, but am having trouble producing an executable. The problem seems to be at the linking stage.

helloWorld.s:

.data

    HelloWorldString:
    .ascii "Hello World\n"

.text

.globl _start

_start:
    # load all the arguments for write()
    movl $4, %eax
    movl $1, %ebx
    movl $HelloWorldString, %ecx
    movl $12, %edx
    # raises software interrupt to call write()
    int $0x80

    # call exit()
    movl $1, %eax
    movl $0, %ebx
    int $0x80

Assemble the program:

as -o helloWorld.o helloWorld.s

Link the object file:

ld -o helloWorld helloWorld.o

The error I get at this point is:

ld: could not find entry point "start" (perhaps missing crt1.o) for inferred architecture x86_64

Any advice on what I'm doing wrong or missing would be very helpful. Thanks.

You'll probably find it easier to build with gcc rather than trying to micro-manage the assembler and linker, e.g.

$ gcc helloWorld.s -o helloWorld

(You'll probably want to change _start to _main if you go this route.)

Incidentally, it can be instructive to start with a working C program and study the asm generated from it. E.g.

#include <stdio.h>

int main(void)
{
    puts("Hello world!\n");

    return 0;
}

when compiled with gcc -Wall -O3 -m32 -fno-PIC hello.c -S -o hello.S generates:

    .cstring
LC0:
    .ascii "Hello world!\12\0"
    .text
    .align 4,0x90
.globl _main
_main:
    pushl   %ebp
    movl    %esp, %ebp
    subl    $24, %esp
    movl    $LC0, (%esp)
    call    _puts
    xorl    %eax, %eax
    leave
    ret
    .subsections_via_symbols

You might want to consider using this as a template for your own "Hello world" or other experimental asm programs, especially given that it already builds and runs:

$ gcc -m32 hello.S -o hello
$ ./hello 
Hello world!

One final comment: beware of taking examples from Linux-oriented asm books or tutorials and trying to apply them under OS X. There are important differences!

Try:

ld -e _start -arch x86_64 -o HelloWorld helloWorld.o

then:

./HelloWorld

Info:

-e <entry point>
-arch <architecture> (you can check your architecture with uname -a)
-o <output file>

hello.asm

.data

    HelloWorldString:
    .ascii "Hello World!\n"

.text

.globl start

start:
    # load the arguments for write(1, HelloWorldString, 13)
    movl $0x2000004, %eax
    movl $1, %edi
    movq HelloWorldString@GOTPCREL(%rip), %rsi
    movq $13, %rdx
    # enter the kernel to call write()
    syscall

    # call exit(0)
    movl $0x2000001, %eax
    movl $0, %edi
    syscall

Then run:

$ as -arch x86_64  -o hello.o hello.asm
$ ld -o hello hello.o
$ ./hello

This is a working solution for GNU-based assemblers targeting Mach-O on Mac OS X.

The code in the question looks like it's for 32-bit Linux using the int $0x80 ABI with args in EBX, ECX, EDX.

x86-64 code on macOS uses the syscall instruction, with argument passing and return values similar to what the x86-64 System V ABI documents for Linux. This is completely different from int $0x80; the only similarity is that the call number is passed in EAX/RAX. The call numbers themselves also differ: macOS uses the BSD numbers listed at https://sigsegv.pl/osx-bsd-syscalls/, ORed with 0x2000000.

Args go in the same registers as for function calls, except that R10 is used instead of RCX.

See also "basic assembly not working on Mac (x86_64+Lion)?" and "How to get this simple assembly to run?"


I think this is a much neater and more intuitive version of what was suggested in another answer.

OS X uses start , not _start , for the process entry point.

.data
str:
  .ascii "Hello world!\n"
  len = . - str                  # length = current position (.) - start of str

.text
.globl start
start:
    movl   $0x2000004, %eax
    movl   $1, %edi
    leaq   str(%rip), %rsi  
    movq   $len, %rdx          
    syscall                       # write(1, str, len)

    movl   $0x2000001, %eax 
    movl   $0, %edi
    syscall                       # _exit(0)

Normally you'd omit the operand-size suffix when a register implies it, and use xor %edi,%edi to zero RDI.

Likewise, use mov $len, %edx: you know the size is smaller than 4 GB, so the more efficient 32-bit zero-extending mov-immediate works, just as it does when setting RAX to the call number.

Notice the use of a RIP-relative LEA to get the address of static data into a register. x86-64 code on MacOS can't use 32-bit absolute addressing because the base address where your executable will be mapped is above 2^32.

There are no relocation types for 32-bit absolute addresses so you can't use them. (And you want RIP-relative, not 64-bit absolute, even though that's also supported.)

To assemble and link the above code on macOS 10.15 the following changes need to be made.

Change .globl start to .globl _main and the start: label to _main:, because the linker's default entry point is _main.

To assemble and link the code use:

as -arch x86_64 -o hello.o hello.asm
ld -arch x86_64 -o hello hello.o -lSystem

This assumes that the "as" from "Apple clang version 12.0.0" and its corresponding "ld" are being used.

