
How does including assembly inline with C code work?

I've seen code for Arduino and other hardware that have assembly inline with C, something along the lines of:

asm("movl %ecx, %eax"); /* moves the contents of ecx to eax */
__asm__("movb %bh, (%eax)"); /* moves the byte from bh to the memory pointed to by eax */

How does this actually work? I realize every compiler is different, but what are the common reasons this is done, and how could someone take advantage of this?

The inline assembler code goes right into the complete assembled code untouched and in one piece. You do this when you really need absolutely full control over your instruction sequence, or maybe when you can't afford to let an optimizer have its way with your code. Maybe you need every clock tick. Maybe you need every single branch of your code to take the exact same number of clock ticks, and you pad with NOPs to make this happen.

In any case, there are lots of reasons why someone may want to do this, but you really need to know what you're doing. These chunks of code will be pretty opaque to your compiler, and it's likely you won't get any warnings if you're doing something bad.

Usually the compiler will just insert the assembler instructions right into its generated assembler output. And it will do this with no regard for the consequences.

For example, in this code the optimiser is performing copy propagation, whereby it sees that y=x, then z=y. So it replaces z=y with z=x, hoping that this will allow it to perform further optimisations. However, it doesn't spot that I've messed with the value of x in the meantime.

char x=6;
char y,z;

y=x;                 // y becomes 6

_asm                    
    rrncf x, 1       // x becomes 3. Optimiser doesn't see this happen!
_endasm  

z=y;                 // z should become 6, but actually gets
                     // the value of x, which is 3

To get around this, you can essentially tell the optimiser not to perform this optimisation for this variable.

volatile char x=6;   // Tell the compiler that this variable could change
                     // all by itself, at any time, and therefore don't
                     // optimise with it.
char y,z;

y=x;                 // y becomes 6

_asm                    
    rrncf x, 1       // x becomes 3. Optimiser doesn't see this happen!
_endasm  

z=y;                 // z correctly gets the value of y, which is 6
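
With GCC or Clang, extended asm offers another way around the same problem: rather than marking x volatile, the operand list tells the compiler that the asm reads and writes x. The following is only a sketch under assumptions that are not in the original: it targets x86 instead of the PIC above, the function name copy_then_rotate is made up for illustration, and other targets fall back to plain C.

```c
/* Hypothetical demo: same sequence as above, but the asm declares
   that it modifies x, so no volatile is needed. */
static void copy_then_rotate(unsigned char *z_out, unsigned char *x_out)
{
    unsigned char x = 6;
    unsigned char y, z;

    y = x;                          /* y becomes 6 */

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    /* "+q": x is both read and written and must live in a
       byte-addressable register; "cc": the flags are modified. */
    __asm__ ("rorb $1, %0" : "+q"(x) : : "cc");
#else
    /* Plain C fallback for other targets (assumption). */
    x = (unsigned char)((x >> 1) | (x << 7));
#endif

    z = y;                          /* z gets 6; the compiler knows x changed */

    *z_out = z;
    *x_out = x;
}
```

Calling copy_then_rotate(&z, &x) should leave z at 6 and x at 3, because the "+q" operand makes the change to x visible to the optimiser.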

Historically, C compilers generated assembly code, which would then be translated to machine code by an assembler. Inline assembly arises as a simple feature: at that point in the intermediate assembly code, inject some user-picked code. Some compilers directly generate machine code, in which case they contain an assembler or call an external assembler to generate the machine code for the inline assembly snippets.

The most common use for assembly code is to use specialized processor instructions that the compiler isn't able to generate. For example, disabling interrupts for a critical section, controlling processor features (cache, MMU, MPU, power management, querying CPU capabilities, …), accessing coprocessors and hardware peripherals (e.g. inb / outb instructions on x86), etc. You'll rarely find asm("movl %ecx, %eax"), because that affects general-purpose registers that the C code around it is also using, but something like asm("mcr p15, 0, r0, c7, c10, 5") has its use (data memory barrier on ARM, with r0 holding zero). The OSDev wiki has several examples with code snippets.
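
As an illustration of "querying CPU capabilities", here is a minimal sketch that reads the CPU vendor string with the x86 CPUID instruction through GCC/Clang extended asm. The helper name cpu_vendor is hypothetical, and the empty-string result on non-x86 targets is an assumption made so the snippet compiles everywhere.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: writes the 12-character vendor string
   (plus terminator) into vendor, or "" where CPUID is unavailable. */
static void cpu_vendor(char vendor[13])
{
    vendor[0] = '\0';
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    uint32_t eax, ebx, ecx, edx;

    /* CPUID leaf 0 returns the vendor string in ebx, edx, ecx.
       The constraints pin each C variable to the matching register. */
    __asm__ volatile ("cpuid"
                      : "=a"(eax), "=b"(ebx), "=c"(ecx), "=d"(edx)
                      : "a"(0), "c"(0));
    (void)eax;                      /* highest supported leaf, unused here */

    memcpy(vendor + 0, &ebx, 4);
    memcpy(vendor + 4, &edx, 4);
    memcpy(vendor + 8, &ecx, 4);
    vendor[12] = '\0';
#endif
}
```

There is no C construct for this instruction, which is why it ends up in an asm statement (or in a compiler-provided wrapper).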

Assembly code is also useful to implement features that break C's flow control model. A common example is context switching between threads (whether cooperative or preemptive, whether in the same address space or not) requiring assembly code to save and restore register values.

Assembly code is also useful to hand-optimize small bits of code for memory or speed. As compilers are getting smarter, this is rarely relevant at the application level nowadays, but it's still relevant in much of the embedded world.

There are two ways to combine assembly with C: with inline assembly, or by linking assembly modules with C modules. Linking is arguably cleaner but not always applicable: sometimes you need that one instruction in the middle of a function (e.g. for register saving on a context switch, where a function call would clobber the registers), or you don't want to pay the cost of a function call.

Most C compilers support inline assembly, but the syntax varies. It is typically introduced by the keyword asm, _asm, __asm or __asm__. In addition to the assembly code itself, the inline assembly construct may contain additional code that allows you to pass values between assembly and C (for example, requesting that the value of a local variable is copied to a register on entry), or to declare that the assembly code clobbers or preserves certain registers.
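
One small instance of declaring a clobber, assuming GCC or Clang: an asm statement with an empty instruction string and a "memory" clobber emits no instructions, but tells the compiler that memory may have been read or written. The names compiler_barrier, shared_flag and publish_and_read are made up for this sketch.

```c
static int shared_flag;

/* Hypothetical helper: a compiler-level barrier. It emits no machine
   instruction; the "memory" clobber only constrains the optimiser. */
static inline void compiler_barrier(void)
{
#if defined(__GNUC__)
    __asm__ volatile ("" : : : "memory");
#endif
}

static int publish_and_read(int value)
{
    shared_flag = value;
    compiler_barrier();     /* compiler must assume shared_flag may have changed */
    return shared_flag;     /* read again from memory, not reused from a register */
}
```

This only restricts what the compiler may reorder or cache; it is not a hardware memory barrier.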

Both asm("...") and __asm__("...") are valid usage. Basically, you can use __asm__ if the keyword asm conflicts with something in your program. If you have more than one instruction, you can write one per line in double quotes, and suffix each instruction with '\n' and '\t'. This is because gcc sends each instruction as a string to as (GAS), and by using the newline/tab you can send correctly formatted lines to the assembler. The code snippet in your question is basic inline assembly.

In basic inline assembly, there are only instructions. In extended assembly, you can also specify the operands: the input registers, the output registers and a list of clobbered registers. It is not mandatory to specify the registers to use; you can leave that to GCC, and that probably fits into GCC's optimization scheme better. An example with several instructions, written in the basic form (it has no operand lists), is:

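__asm__ ("movl %eax, %ebx\n\t"                 /* ebx = eax */
         "movl $56, %esi\n\t"                  /* esi = 56 */
         "movl %ecx, label(%edx,%ebx,4)\n\t"   /* store ecx at label + edx + ebx*4 */
         "movb %ah, (%ebx)");                  /* store the byte in ah at the address in ebx */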

Notice the '\n\t' at the end of each line except the last, and that each line is enclosed in quotes. This is because gcc sends each instruction to as as a string, as I mentioned before. The newline/tab combination is required so that the lines are fed to as in the correct format.
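
For comparison, here is a minimal sketch of the extended form, where GCC picks the registers and the operand lists describe inputs, outputs and clobbers. It assumes GCC or Clang on x86; the function name add_ints is hypothetical, and other targets fall back to plain C.

```c
/* Hypothetical helper: adds two ints using two x86 instructions. */
static int add_ints(int a, int b)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    int sum;
    __asm__ ("movl %1, %0\n\t"      /* sum = a */
             "addl %2, %0"          /* sum += b */
             : "=&r"(sum)           /* output: any register, written before
                                       all inputs are read (early clobber) */
             : "r"(a), "r"(b)       /* inputs: any registers */
             : "cc");               /* clobber: the flags are modified */
    return sum;
#else
    return a + b;                   /* plain C fallback (assumption) */
#endif
}
```

The %0, %1 and %2 placeholders are replaced by whatever registers the compiler chooses, so the surrounding C code and the asm do not step on each other.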

