
A reference for converting assembly 'shl', 'OR', 'AND', 'SHR' operations into C?

I'm to convert the following AT&T x86 assembly into C:

  movl 8(%ebp), %edx
  movl $0, %eax
  movl $0, %ecx
  jmp .L2
.L1:
  shll $1, %eax
  movl %edx, %ebx
  andl $1, %ebx
  orl %ebx, %eax
  shrl $1, %edx
  addl $1, %ecx
.L2:
  cmpl $32, %ecx
  jl   .L1
  leave

But must adhere to the following skeleton code:

int f(unsigned int x) {
    int val = 0, i = 0;
    while(________) {
        val = ________________;
        x = ________________;
        i++;
    }
    return val;
}

I can tell that the snippet

.L2:
  cmpl $32, %ecx
  jl   .L1

can be interpreted as while(i<32). I also know that x is stored in %edx, val in %eax, and i in %ecx. However, I'm having a hard time converting the assembly within the while/.L1 loop into condensed high-level language that fits into the provided skeleton code. For example, can shll, shrl, orl, and andl simply be written using their direct C equivalents (<<, >>, |, &), or is there some more nuance to it?

Is there a standardized guide/"cheat sheet" for Assembly-to-C conversions?

I understand assembly to high-level conversion is not always clear-cut, but there are certainly patterns in assembly code that can be consistently interpreted as certain C operations.

For example, can shll, shrl, orl, and andl simply be written using their direct C equivalents (<<,>>,|,&), or is there some more nuance to it?

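They can. The one nuance is that shll and shrl are logical shifts, so the match with << and >> is exact for unsigned operands; a >> on a negative signed int would typically be compiled to the arithmetic shift sarl instead. Let's examine the loop body step by step: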

  shll $1, %eax    // shift left eax by 1, same as "eax<<1" or even "eax*=2"
  movl %edx, %ebx
  andl $1, %ebx    // ebx &= 1
  orl %ebx, %eax   // eax |= ebx
  shrl $1, %edx    // shift right edx by 1, same as "edx>>1" = "edx/=2"

gets us to

  %eax *=2
  %ebx = %edx        
  %ebx = %ebx & 1       
  %eax |= %ebx     
  %edx /= 2
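As a cross-check, the register-level steps above can be transliterated almost mechanically into C, one variable per register and one statement per instruction. This is only a sketch: the function name f_registers and the register-named variables are made up for illustration, and uint32_t is assumed to be an adequate model of a 32-bit register.

```c
#include <stdint.h>

/* Hypothetical helper, not part of the assignment:
 * one C variable per register, one statement per instruction. */
uint32_t f_registers(uint32_t x)
{
    uint32_t edx = x;   /* movl 8(%ebp), %edx */
    uint32_t eax = 0;   /* movl $0, %eax      */
    uint32_t ecx = 0;   /* movl $0, %ecx      */
    uint32_t ebx;

    /* cmpl $32, %ecx ; jl .L1
     * jl is a signed compare, but ecx only ranges over 0..32 here,
     * so an unsigned compare behaves the same. */
    while (ecx < 32) {
        eax <<= 1;      /* shll $1, %eax   */
        ebx = edx;      /* movl %edx, %ebx */
        ebx &= 1;       /* andl $1, %ebx   */
        eax |= ebx;     /* orl %ebx, %eax  */
        edx >>= 1;      /* shrl $1, %edx   */
        ecx += 1;       /* addl $1, %ecx   */
    }
    return eax;         /* the result is left in %eax */
}
```

Calling f_registers(1) should give 0x80000000, since the lowest bit of x ends up as the highest bit of the result.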

The calling convention tells us that 8(%ebp) is the first argument, so %edx (loaded by movl 8(%ebp), %edx) holds x, and %eax (the return-value register) holds val:

  val *=2
  %ebx = x           // a
  %ebx = %ebx & 1    // b
  val |= %ebx        // c
  x /= 2

combine a,b,c: #1 insert a into b:

  val *=2
  %ebx = (x & 1)  // b
  val |= %ebx     // c
  x /= 2

combine a,b,c: #2 insert b into c:

  val *=2
  val |= (x & 1)
  x /= 2

final step: combine both 'val =' into one

  val = 2*val | (x & 1)
  x /= 2

  while (i < 32) { val = (val << 1) | (x & 1); x = x >> 1; i++; }

Note that val and the return type should really be unsigned, which they aren't in your template. The function returns the bits of x in reverse order.
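For reference, here is the filled-in skeleton as a complete function. It is a sketch under two assumptions that are not in the assignment: the name reverse_bits32 is invented, and val and the return type are changed to uint32_t. The second change matters because left-shifting a signed int until a 1 reaches the sign bit is undefined behaviour in ISO C (at least through C17).

```c
#include <stdint.h>

/* Skeleton from the question with the blanks filled in.
 * Assumption: unsigned types replace the template's int,
 * and the name reverse_bits32 is hypothetical. */
uint32_t reverse_bits32(uint32_t x)
{
    uint32_t val = 0;
    int i = 0;
    while (i < 32) {
        val = (val << 1) | (x & 1);  /* shll, andl, orl: append the low bit of x */
        x = x >> 1;                  /* shrl: drop the bit just consumed */
        i++;
    }
    return val;
}
```

Reversing twice should give back the original value, which makes a handy sanity check: reverse_bits32(reverse_bits32(x)) == x for any x.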

The actual answer to your question is more complicated, and it is pretty much: no, there is no such guide, and it can't exist, because compilation loses information and you can't recreate that lost information from the assembly. But you can often make a good educated guess.
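A small illustration of that information loss, using this very loop: the bodies below are spelled differently in C but compute the same values for unsigned 32-bit operands, so an optimizing compiler may well emit the same shll/andl/orl/shrl sequence for either spelling. The function names are made up for this sketch, and which instructions a given compiler actually picks is an assumption, not something guaranteed.

```c
#include <stdint.h>

/* Hypothetical helpers: the same loop step written two ways. */

/* New val, written with shifts and masks. */
uint32_t step_shift(uint32_t val, uint32_t x)
{
    return (val << 1) | (x & 1);
}

/* New val, written with arithmetic. 2u * val always has a zero low bit,
 * so adding x % 2u sets exactly the bit that | (x & 1) would set. */
uint32_t step_arith(uint32_t val, uint32_t x)
{
    return 2u * val + x % 2u;
}

/* New x, written as a shift and as a division. */
uint32_t next_x_shift(uint32_t x) { return x >> 1; }
uint32_t next_x_div(uint32_t x)   { return x / 2u; }
```

Given only the assembly, there is no way to tell which of these the original author wrote; the skeleton in the question is what narrows the choice.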
