
Swapping values using registers

I've been reading about swapping the contents of variables without using a temporary variable, and besides the famous XOR algorithm I found out about the x86 XCHG instruction. So I wrote this code:

void swap(int *left, int *right){
    __asm__ __volatile__(
        "movl %0, %%eax;"
        "movl %1, %%ebx;"
        :
        : "r" (*left), "r" (*right)
    );
    __asm__ __volatile__(
        "xchg %eax, %ebx;"
    );
    __asm__ __volatile__(
        "movl %%eax, %0;"
        "movl %%ebx, %1;"
        : "=r" (*left), "=r" (*right)
    );
}

It does work but then I realized the XCHG instruction is not necessary at all.

void swap(int *left, int *right){
    __asm__ __volatile__(
        "movl %0, %%eax;"
        "movl %1, %%ebx;"
        :
        : "r" (*left), "r" (*right)
    );
    __asm__ __volatile__(
        "movl %%ebx, %0;"
        "movl %%eax, %1;"
        : "=r" (*left), "=r" (*right)
    );
}

The second function works too, but nobody seems to mention swapping variables using registers. Is this code considered wrong, and is it in reality not working properly? Am I missing something?

I realize this will work only on x86, but since most people have an Intel x86 processor, could this code be used in any real-world programming? I realize that this probably won't be any faster than a regular swap with a temporary variable, but I'm asking from a theoretical point of view. If during a test or an interview someone asks me to write a function in C to swap values on an x86 machine without using a temporary variable, would this code be valid, or is it complete crap? Thank you.

Valid, yes. By my criteria, you are a no-hire.

Why? Cost.

std::swap will do the job fine, and is probably fast enough. Your code will have a higher maintenance cost.

There certainly are times for dropping down into assembler for performance reasons.
This is not one of them.

First, your inline assembly is broken in many ways:

  • abuse of volatile; it does not mean what you wanted.
  • you don't tell the compiler which registers you clobbered. (This can be fixed.)
  • the compiler is free to insert code in between your inline assembly blocks.
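
The last two points can be avoided by doing the exchange in one asm statement and letting the compiler choose the registers. A minimal sketch, assuming GCC or Clang targeting x86; the name swap_xchg is made up for illustration, and other targets fall back to plain C:

```c
/* Hypothetical helper (the name is illustrative, not from the question).
 * The exchange is a single asm statement with its operands declared, so
 * nothing relies on registers surviving between separate asm statements. */
void swap_xchg(int *left, int *right)
{
    int a = *left;
    int b = *right;
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    /* "+r" marks each operand as both read and written, in a register
     * of the compiler's choosing, so no clobber list is needed. */
    __asm__("xchg %0, %1" : "+r"(a), "+r"(b));
#else
    /* Fallback for other targets: ordinary swap through a temporary. */
    int tmp = a;
    a = b;
    b = tmp;
#endif
    *left = a;
    *right = b;
}
```

This is only an exercise in writing constraints; it is still not expected to beat what the compiler generates for an ordinary swap.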

Inline assembly is very difficult to get right, for both programmer and compiler.

Inline assembly can also be tuned with very careful hacks, but it constrains the compiler in ways that impair the optimizer (register allocation, reordering, etc.), which usually results in an overall performance drop. I'm not against inline assembly (or compiler intrinsics), but it requires such careful handling that it is not justified in most circumstances.

std::swap will compile to more efficient asm than this.

That code is slower than what a compiler would emit, as well as broken.

It clobbers EAX and EBX without telling the compiler, so it will easily fail, especially when compiled with optimization enabled, and it is unsafe even without optimization.

See "How to write a short block of inline gnu extended assembly to swap the values of two integer variables?" for an example of correct asm around xchg, and a better version that just uses constraints to swap C variables with zero asm instructions, leaving it up to the compiler to decide which register holds which C variable:

asm("" : "=r" (a), "=r" (b) : "1" (a), "0" (b));
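
To make the constraint-only version concrete, here is a rough sketch of it wrapped in a function. The wrapper name swap_constraints is hypothetical, and the code assumes a compiler that supports GNU C extended asm (GCC or Clang); it is not ISO C:

```c
/* Hypothetical wrapper around the constraint-only swap quoted above.
 * Requires GNU C extended asm; no instruction is written by hand. */
void swap_constraints(int *left, int *right)
{
    int a = *left;
    int b = *right;
    /* Empty template: the asm itself emits nothing. The matching
     * constraints say that output 0 (a) is found where input b was
     * placed, and output 1 (b) is found where input a was placed. */
    __asm__("" : "=r"(a), "=r"(b) : "1"(a), "0"(b));
    *left = a;
    *right = b;
}
```

The swap happens purely in the compiler's bookkeeping of which register holds which variable, which is why it works on any target with GNU C inline asm.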


Even if you did it with inline asm with zero asm instructions and just opposite input/output constraints, you would still destroy constant-propagation and range analysis, defeating optimization (see https://gcc.gnu.org/wiki/DontUseInlineAsm). My edit to the answer I linked earlier includes a pure C swap, showing that constant-propagation works there but not with either inline-asm swap.
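
For comparison, a pure C swap might look like the sketch below. The function names are hypothetical and this is not the code from the linked answer; it only illustrates the kind of caller where constant-propagation can apply:

```c
/* Plain C swap through a temporary; the optimizer sees every step. */
static inline void swap_plain(int *left, int *right)
{
    int tmp = *left;
    *left = *right;
    *right = tmp;
}

/* Hypothetical caller: both values are compile-time constants, so an
 * optimizing compiler can typically inline the swap and reduce this
 * function to returning a constant. An asm statement in the middle
 * would hide the values from that analysis. */
int swapped_difference(void)
{
    int a = 3;
    int b = 10;
    swap_plain(&a, &b);
    return a - b;   /* a is now 10 and b is now 3 */
}
```

Comparing the compiler's asm output for this caller against one built on an inline-asm swap is a simple way to check the effect on a given compiler.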

So even a correctly-written version of this is still useless for any real-world use, other than as an exercise / example in using input/output constraints for GNU C inline asm to avoid mov at the start/end of real asm blocks, and leave as much as possible up to the compiler.


Asm can only be a performance win if you do something the compiler can't already do better than you personally can. (See "C++ code for testing the Collatz conjecture faster than hand-written assembly - why?")

Look at the compiler's asm output for a normal function (see "How to remove "noise" from GCC/clang assembly output?") to understand more about what makes for efficient asm.
