
What ensures reads/writes of operands occur at the desired time with extended ASM?

According to GCC's Extended ASM and Assembler Template documentation, instructions must be in the same ASM block to stay consecutive. I'm having trouble understanding what determines when the operands are read and written in a block with multiple statements.

As an example, EBX or RBX needs to be preserved when using CPUID because, according to the ABI, the caller owns it. There are some open questions with respect to the use of EBX and RBX, so we want to preserve it unconditionally (it's a requirement). So three instructions need to be encoded into a single ASM block to ensure that they stay consecutive (re: the assembler template discussed in the first paragraph):

unsigned int __FUNC = 1, __SUBFUNC = 0;
unsigned int __EAX, __EBX, __ECX, __EDX;

__asm__ __volatile__ (
  "push %%ebx;"
  "cpuid;"
  "pop %%ebx"
  : "=a"(__EAX), "=b"(__EBX), "=c"(__ECX), "=d"(__EDX)
  : "a"(__FUNC), "c"(__SUBFUNC)
);

If the expression representing the operands is interpreted at the wrong point in time, then __EBX will be the saved EBX (and not CPUID's EBX), which will likely be a pointer to the Global Offset Table (GOT) if PIC is enabled.

Where, exactly, does the expression specify that the store of CPUID's %EBX into __EBX should happen (1) after the PUSH %EBX; (2) after the CPUID; but (3) before the POP %EBX?

Your question presents code that pushes and pops ebx. The idea of saving ebx when you compile with gcc using -fPIC (position-independent code) is correct: in that situation it is up to our code not to leave ebx clobbered. Unfortunately, the constraints you have defined use ebx explicitly. If you are compiling PIC code and specify =b as an output constraint, the compiler will generally reject it ( error: inconsistent operand constraints in an 'asm' ). It is unusual that it doesn't produce this diagnostic for you.

To get around this problem, you can let the compiler choose a register for you. Instead of pushing and popping, we simply exchange %ebx with an unused register chosen by the compiler, and restore it by exchanging it back afterwards. Since we don't want the compiler to assign that scratch register to one of our input registers, which the first exchange would then overwrite, we specify the early-clobber modifier and end up with a constraint of =&r (instead of =b in the OP's code). More on modifiers can be found in GCC's documentation on constraint modifiers. Your code (for 32-bit) would look something like:

unsigned int __FUNC = 1, __SUBFUNC = 0;
unsigned int __EAX, __EBX, __ECX, __EDX;

__asm__ __volatile__ (
       "xchgl\t%%ebx, %k1\n\t"
       "cpuid\n\t"
       "xchgl\t%%ebx, %k1\n\t"
  : "=a"(__EAX), "=&r"(__EBX), "=c"(__ECX), "=d"(__EDX)
  : "a"(__FUNC), "c"(__SUBFUNC));

If you intend to compile for x86-64 (64-bit), you'll need to save the entire contents of %rbx. The code above will not quite work, because a 32-bit xchgl clears the upper 32 bits of both registers it touches. You'd have to use something like:

uint32_t  __FUNC = 1, __SUBFUNC = 0;
uint32_t __EAX, __ECX, __EDX;
uint64_t __BX; /* Big enough to hold a 64 bit value */

__asm__ __volatile__ (
       "xchgq\t%%rbx, %q1\n\t"
       "cpuid\n\t"
       "xchgq\t%%rbx, %q1\n\t"
  : "=a"(__EAX), "=&r"(__BX), "=c"(__ECX), "=d"(__EDX)
  : "a"(__FUNC), "c"(__SUBFUNC));

You could code this up using conditional compilation to deal with both X86_64 and i386:

/* Requires <stdint.h> for uint32_t and uint64_t. */
uint32_t  __FUNC = 1, __SUBFUNC = 0;
uint32_t __EAX, __ECX, __EDX;

#if defined(__i386__)
    /* A single 32-bit register holds %ebx. A 64-bit variable here would
       leave its upper half unwritten by the asm. */
    uint32_t __BX;

    __asm__ __volatile__ (
           "xchgl\t%%ebx, %k1\n\t"
           "cpuid\n\t"
           "xchgl\t%%ebx, %k1\n\t"
      : "=a"(__EAX), "=&r"(__BX), "=c"(__ECX), "=d"(__EDX)
      : "a"(__FUNC), "c"(__SUBFUNC));

#elif defined(__x86_64__)
    uint64_t __BX; /* Big enough to hold a 64 bit value */

    __asm__ __volatile__ (
           "xchgq\t%%rbx, %q1\n\t"
           "cpuid\n\t"
           "xchgq\t%%rbx, %q1\n\t"
      : "=a"(__EAX), "=&r"(__BX), "=c"(__ECX), "=d"(__EDX)
      : "a"(__FUNC), "c"(__SUBFUNC));
#else
#error "Unknown architecture."
#endif
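
As a rough sketch, the two variants can be wrapped in a single helper so that callers never see the scratch register. The names cpuid_regs and cpuid_query are hypothetical and not part of any library; the asm itself is the exchange sequence shown above, and the sketch assumes GCC or a compatible compiler targeting x86 or x86-64.

```c
#include <stdint.h>

/* Hypothetical container for the four CPUID result registers. */
struct cpuid_regs {
    uint32_t eax, ebx, ecx, edx;
};

/* Hypothetical helper: run CPUID for a leaf/subleaf while making sure
   EBX/RBX holds its original value again when the asm block ends. */
static inline struct cpuid_regs cpuid_query(uint32_t leaf, uint32_t subleaf)
{
    struct cpuid_regs r;
#if defined(__i386__)
    uint32_t bx;   /* scratch register chosen by the compiler */
    __asm__ __volatile__ (
        "xchgl\t%%ebx, %k1\n\t"   /* park the caller's ebx in the scratch */
        "cpuid\n\t"
        "xchgl\t%%ebx, %k1"       /* scratch = cpuid's ebx, ebx restored */
        : "=a"(r.eax), "=&r"(bx), "=c"(r.ecx), "=d"(r.edx)
        : "a"(leaf), "c"(subleaf));
#elif defined(__x86_64__)
    uint64_t bx;   /* must hold all 64 bits of rbx */
    __asm__ __volatile__ (
        "xchgq\t%%rbx, %q1\n\t"
        "cpuid\n\t"
        "xchgq\t%%rbx, %q1"
        : "=a"(r.eax), "=&r"(bx), "=c"(r.ecx), "=d"(r.edx)
        : "a"(leaf), "c"(subleaf));
#else
#error "Unknown architecture."
#endif
    r.ebx = (uint32_t)bx;   /* cpuid only produces a 32-bit EBX value */
    return r;
}
```

Calling cpuid_query(0, 0) should return the highest supported basic leaf in eax and the first four bytes of the vendor string in ebx.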

GCC has a __cpuid macro defined in cpuid.h. It defines the macro so that it saves the ebx or rbx register only when required. The GCC 4.8.1 macro definition in cpuid.h gives an idea of how they handle cpuid.
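
For comparison, here is a minimal sketch that relies on cpuid.h instead of hand-written asm. It uses __get_cpuid, which that header provides alongside __cpuid; the cpu_vendor function is a hypothetical name chosen for this example, and the header is only available when targeting x86 or x86-64 with GCC or Clang.

```c
#include <cpuid.h>   /* GCC/Clang header, x86 targets only */
#include <string.h>

/* Hypothetical helper: copy the 12-character CPU vendor string plus a
   terminating NUL into out. Returns 1 on success, 0 if leaf 0 is not
   available. */
static int cpu_vendor(char out[13])
{
    unsigned int eax, ebx, ecx, edx;

    /* __get_cpuid returns 0 when the requested leaf is unsupported. */
    if (!__get_cpuid(0, &eax, &ebx, &ecx, &edx))
        return 0;

    /* Leaf 0 returns the vendor string in EBX, EDX, ECX, in that order. */
    memcpy(out + 0, &ebx, 4);
    memcpy(out + 4, &edx, 4);
    memcpy(out + 8, &ecx, 4);
    out[12] = '\0';
    return 1;
}
```

Because the header takes care of the ebx handling, the calling code contains no register constraints at all.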

The astute reader may ask: what stops the compiler from choosing ebx or rbx as the scratch register for the exchange? The compiler knows about ebx and rbx in the context of PIC and will not allow them to be used as a scratch register. This is based on my personal observations over the years and on reviewing the assembler (.s) files generated from C code. I can't say for certain how older versions of gcc handled it, so it could be a problem there.

I think you understand, but to be clear, the "consecutive" rule means that this:

asm ("a");
asm ("b");
asm ("c");

... might get other instructions interposed, so if that's not desirable then it must be rewritten like this:

asm ("a\n"
     "b\n"
     "c");

... and now it will be inserted as a whole.
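
As a concrete sketch of the single-block form (x86/x86-64, AT&T syntax; add_in_one_block is a hypothetical name for this example), the function below uses two instructions that must stay adjacent. The compiler places the inputs in their registers before the first instruction and reads the outputs after the last one. What happens in between is entirely up to the template, which is why an output written before all inputs have been consumed needs the early-clobber modifier.

```c
/* Hypothetical example: add two values with two adjacent instructions. */
static unsigned int add_in_one_block(unsigned int a, unsigned int b)
{
    unsigned int sum;

    __asm__ (
        "movl %1, %0\n\t"   /* sum = a: writes the output early */
        "addl %2, %0"       /* sum += b: b must still be intact here */
        : "=&r"(sum)        /* '&': never share a register with an input */
        : "r"(a), "r"(b)
        : "cc");            /* addl modifies the flags */
    return sum;
}
```

Without the & the compiler would be free to give sum the same register as b, and the movl would then destroy b before the addl reads it.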


As for the cpuid snippet, we have two problems:

  1. The cpuid instruction will overwrite ebx , and hence clobber the data that PIC code must keep there.

  2. We want to extract the value that cpuid places in ebx while never returning to compiled code with the "wrong" ebx value.

One possible solution would be this:

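unsigned int __FUNC = 1, __SUBFUNC = 0;
unsigned int __EAX, __EBX, __ECX, __EDX;

/* First call: copy cpuid's ebx result into ecx before restoring ebx. */
__asm__ __volatile__ (
  "push %%ebx;"
  "cpuid;"
  "mov %%ebx, %%ecx;"
  "pop %%ebx"
  /* eax is an input register, so it cannot also appear in the clobber
     list; it is listed as an output instead. */
  : "=c"(__EBX), "=a"(__EAX)
  : "a"(__FUNC), "c"(__SUBFUNC)
  : "edx"
);
/* Second call: fetch eax, ecx and edx; ebx is restored before the block ends. */
__asm__ __volatile__ (
  "push %%ebx;"
  "cpuid;"
  "pop %%ebx"
  : "=a"(__EAX), "=c"(__ECX), "=d"(__EDX)
  : "a"(__FUNC), "c"(__SUBFUNC)
);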

There's no need to mark ebx as clobbered as you're putting it back how you found it.

(I don't do much Intel programming, so I may have some of the assembler-specific details off there, but this is how asm works.)
