How to detect overflow conditions in x86 assembly language

Question

I have an assignment in which we have to write two functions. They must also detect overflow conditions using the processor's condition codes and return 0 to indicate that an error has been encountered. I was able to write the functions.

 .file  "formula.c"  
    .text
.globl _nCr  
    .def    _nCr;   .scl    2;  .type   32; .endef  
_nCr:  
    pushl   %ebp  
    movl    %esp, %ebp  
    subl    $56, %esp  
    movl    8(%ebp), %eax  
    movl    %eax, (%esp)  
    testl %eax, %eax  
    call    _factorial  
    movl    %eax, -12(%ebp)  
    movl    12(%ebp), %eax  
    addl    $1, %eax  
    movl    %eax, (%esp)  
    call    _factorial  
    movl    %eax, -16(%ebp)  
    movl    12(%ebp), %eax  
    notl    %eax  
    addl    8(%ebp), %eax  
    movl    %eax, (%esp)  
    call    _factorial  
    movl    %eax, -20(%ebp)  
    movl    -16(%ebp), %eax  
    movl    %eax, %edx  
    imull   -20(%ebp), %edx  
    movl    %edx, -28(%ebp)  
    movl    -12(%ebp), %eax  
    movl    %eax, %edx  
    sarl    $31, %edx  
    idivl   -28(%ebp)  
    leave  
    ret  
.globl _factorial   
    .def    _factorial;  .scl    2;     .type   32;     .endef   
_factorial:  
    pushl   %ebp  
    movl    %esp, %ebp  
    subl    $16, %esp  
    movl    $1, -8(%ebp)  
    movl    $1, -4(%ebp)  
    jmp L3  
L4:   
    movl    -8(%ebp), %eax   
    imull   -4(%ebp), %eax  
    movl    %eax, -8(%ebp)  
    addl    $1, -4(%ebp)   
L3:
    movl    -4(%ebp), %eax  
    cmpl    8(%ebp), %eax  
    jle L4  
    movl    -8(%ebp), %eax  
    leave  
    ret  
    .def    ___main;    .scl    2;  .type   32; .endef  
    .section .rdata,"dr"  
    .align 4

This function basically computes nCr = n! / (r! (n-r)!). The overflow occurs in factorial when the numbers get larger.

I just do not understand how I would detect the overflow conditions.

Answer 1

1) Your arithmetic instructions are the operations that can set the overflow flag.

2) "JO" (jump on overflow) and "JNO" (jump on not overflow) allow you to branch, depending on whether an overflow occurred or not.

3) You'd probably just set "%eax" to 0 after "JO".

4) Excellent, excellent resource if you're not already familiar with it:

Programming from the Ground Up, Jonathan Bartlett
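As a rough illustration of points 1 to 3, here is a C model of what a checked factorial loop would do. The name factorial_checked is made up for this sketch, and the 64-bit range test stands in for the imull / jo pair. In the assembly itself you would put a jo right after the imull in the loop and have its target zero %eax before returning.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical C model of _factorial with an overflow check.
   Returns n! if it fits in a signed 32-bit int, or 0 on overflow.
   0 is never a valid factorial, so it is safe as an error value. */
int32_t factorial_checked(int32_t n)
{
    assert(n >= 0);   /* assumption: a negative argument is a caller bug */

    int32_t result = 1;
    for (int32_t i = 1; i <= n; i++) {
        /* Compute the full 64-bit product, as imul does internally. */
        int64_t wide = (int64_t)result * i;

        /* imul sets OF when the full product does not fit in 32 signed
           bits; this range test plays the role of "jo". */
        if (wide < INT32_MIN || wide > INT32_MAX)
            return 0;             /* the "set %eax to 0" step after JO */

        result = (int32_t)wide;
    }
    return result;
}
```

With 32-bit signed arithmetic, 12! is the largest factorial that fits, so factorial_checked(13) already reports the error.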

Answer 2

On the x86 architecture, when an arithmetic instruction such as addl 8(%ebp), %eax executes, the condition codes are set in the CPU status word. There are instructions whose behavior depends on the condition codes.

You can have the code take an alternate path (execute a branch) on a given condition. The x86 has a family of conditional branching instructions under the Jxx mnemonics: JA, JAE, JB, JBE, JC, JCXZ, ..., JZ. For instance, JZ means jump if zero: take the branch if the previous instruction produced a zero result, setting the zero flag. JO is jump on overflow.

A condition can also be converted to a byte datum and stored into a register or memory. This is useful for compiling C expressions like:

 x = (y != 3); /* if (y != 3) x = 1; else x = 0 */

This is done by the SETx group of instructions, which are as numerous as the conditional branches: SETA, SETAE, SETB, ..., SETZ. For instance, SETZ will set a given byte to 1 if the zero condition is true, and to 0 otherwise. E.g.:

 seto %bl  /* set bottom byte of B register to 1 if overflow flag set */
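To make the flag semantics concrete, here is a small C model of the three flags most relevant to an addl, each stored as one byte the way SETZ, SETC and SETO would store them. The struct and function names are invented for this sketch; real code would simply read the flags with a Jxx or SETx instruction right after the add.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical struct: one byte per flag, as SETZ / SETC / SETO would write. */
struct add_flags {
    uint8_t zf;   /* zero flag: the 32-bit result was zero */
    uint8_t cf;   /* carry flag: unsigned overflow (carry out of bit 31) */
    uint8_t of;   /* overflow flag: signed overflow */
};

/* Models "addl b, a": stores the wrapped 32-bit sum and returns the flags. */
struct add_flags add32_flags(uint32_t a, uint32_t b, uint32_t *sum)
{
    assert(sum != NULL);

    uint32_t s = a + b;           /* wraps modulo 2^32, like the hardware */
    struct add_flags f;
    f.zf = (s == 0);
    f.cf = (s < a);               /* the sum wrapped around, so there was a carry */
    /* Signed overflow: both inputs share a sign that the result lacks. */
    f.of = (uint8_t)(((a ^ s) & (b ^ s)) >> 31);
    *sum = s;
    return f;
}
```

For example, 0x7fffffff + 1 sets OF but not CF, while 0xffffffff + 1 sets CF and ZF but not OF, which is why signed code tests OF (JO) and unsigned code tests CF (JC).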

Answer 3

Most instructions set OF on signed overflow, or CF on unsigned overflow. http://teaching.idallen.com/dat2343/10f/notes/040_overflow.txt explains this for add/sub. (Bitwise boolean instructions like and/or/xor can't overflow, so they always clear CF and OF.)

imul sets both OF and CF when the full result isn't the sign extension of the low-half result (same width as the inputs). This applies even to the efficient 2-operand form that doesn't write the high half anywhere; it still sets the flags according to what the high half would have been. If you want unsigned overflow detection for multiply, you need to use the clumsy one-operand mul.
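The difference between the two multiply rules can be sketched in C as follows. Both function names are made up for illustration; each one recomputes the full 64-bit product and applies the rule the corresponding instruction uses to set OF and CF.

```c
#include <assert.h>
#include <stdint.h>

/* Models OF/CF after a 32-bit imul: set when the full signed product does
   not fit in 32 bits, i.e. is not the sign extension of its low half. */
int imul32_overflows(int32_t a, int32_t b)
{
    int64_t full = (int64_t)a * b;
    return full < INT32_MIN || full > INT32_MAX;
}

/* Models OF/CF after a 32-bit one-operand mul: set when the high half of
   the unsigned product (what lands in EDX) is nonzero. */
int mul32_overflows(uint32_t a, uint32_t b)
{
    uint64_t full = (uint64_t)a * b;
    return (full >> 32) != 0;
}

/* Same bit patterns, different verdicts depending on signedness. */
void imul_vs_mul_example(void)
{
    /* 0x10000 * 0x8000 = 2^31: too big for signed, fine for unsigned. */
    assert(imul32_overflows(0x10000, 0x8000) == 1);
    assert(mul32_overflows(0x10000u, 0x8000u) == 0);
}
```

The 2^31 case is the kind of result that imul followed by jo rejects even though it is a perfectly good unsigned 32-bit value.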

Division raises a #DE exception when the quotient doesn't fit into AL/AX/EAX/RAX (depending on operand-size). Unfortunately there's no way to suppress / mask this, so you can't try 2N / N => N-bit division with a large dividend and detect overflow after the fact, unless you have a signal handler to catch SIGFPE (on POSIX OSes). Or on bare metal, an interrupt handler for #DE.
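One option the paragraph above does not spell out is to test before dividing instead of after. For an unsigned 64-bit / 32-bit div, the quotient fits in 32 bits exactly when the high half of the dividend (EDX) is smaller than the divisor, so in assembly a cmp of EDX against the divisor followed by an unsigned jump such as JAE would guard the div. The C sketch below models that check; the function name and the return convention are assumptions of this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Models an unsigned 64-bit / 32-bit "div" with a pre-check.
   Returns 1 and stores the quotient if it fits in 32 bits;
   returns 0 in the cases where the CPU would raise #DE. */
int div64_by_32_checked(uint64_t dividend, uint32_t divisor, uint32_t *quotient)
{
    assert(quotient != NULL);

    uint32_t high = (uint32_t)(dividend >> 32);   /* what would be in EDX */

    /* quotient < 2^32  <=>  dividend < divisor * 2^32  <=>  high < divisor.
       A zero divisor fails this test as well, because high >= 0 always. */
    if (high >= divisor)
        return 0;

    *quotient = (uint32_t)(dividend / divisor);
    return 1;
}
```

This costs one compare and branch per division but needs no signal or interrupt handler.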

For combinatorials specifically:

Instead of naively calculating n!, you can cancel earlier and just compute prod(r+1 .. n). Actually, use the larger of r or n-r as the lower bound of the product, and divide by the factorial of the other one.

You still end with dividing by a potentially large number, so this doesn't eliminate the chance of overflow for all possible results that fit in a 32-bit integer. But it extends the range you can handle simply, and of course it is faster because you're doing fewer multiplies. E.g. C(1000, 999) only does 1000 / (1000-999)!, so no multiplies and only one div.

If you do the last multiply of the product with a mul instruction to produce a 64-bit result in EDX:EAX, you can use that as the 64-bit dividend for a 32-bit division. (If you want to risk an exception.)

mul is an NxN => 2N multiplication, so if you just use it in a loop it ignores the high half of the previous output. If you multiply in low-to-high order, so the last multiply is by the high end of the range, that gives you the biggest possible range for this to work.

So for example you might do:

// mix of C pseudocode and AT&T 32-bit.
//  Some real registers, some var names: pick registers for those.

   if (n == r) return 1;
   divisor = factorial(min(r, n-r));   // and save in another register until later
   eax = max(r,n-r) + 1;               // prod

   xor  %edx, %edx        # in case we skip the mul
   cmp  n, %eax
   jae  endprod           # loop might need to run 0 times, but n==r case already handled

   lea  1(%eax), %ecx     # i = low+1.  Not overflow-checked
   jmp  loop_entry

 prod_loop:                  # do{
    imul  %ecx, %eax         # prod *= i for i=[max+2 .. n-1]
     jo  overflow_in_prod    # maybe need mul to avoid spurious signed but not unsigned overflow cases
    inc   %ecx
  loop_entry:
    cmp   n, %ecx
    jb    prod_loop          # }while(i<n)
                           # leave the loop with ECX = n, with one multiply left to do
    mul   %ecx             # EDX:EAX = n * EAX
    # We're keeping the full 64-bit result, therefore this can't overflow
 endprod:

    div   divisor          # prod /= the smaller factorial
    # EDX:EAX / divisor, quotient in EAX.  Or will raise #DE if it didn't fit.
    ret

overflow_in_prod:
   xor  %eax, %eax        # e.g. return 0 to signal the error, as the assignment requires
   ret

Untested and not carefully thought through; there might be off-by-one errors or corner-case bugs in the loop setup / bounds.

This is the kind of thing I was describing: we can check for overflow while accumulating the product, except for the last step, which we allow to produce a 64-bit result.

There are probably cases where the prod loop's last imul will produce a 32-bit unsigned result with the high bit set, but no unsigned overflow. Using imul/jo would spuriously detect that as overflow, because it is signed overflow, even though the final div wouldn't overflow. So if you care about that more than speed, use a (slightly slower) mul there, too.

Anyway, this lets us handle C(18, 9), where prod(10 .. 18) = 0x41b9e4200. The last imul will produce EAX = 0x3a6c5900, which fits in 32 bits, and the final mul will multiply that by 18 to produce EDX:EAX = 0x41b9e4200 (35 significant bits). Then we divide that by 9! = 0x58980 and get EAX = 0xbdec.

The number of significant bits in EDX:EAX can be even larger when n and r are large (but close together, so we still avoid overflow). They have to be far enough apart for the (n-r)! divisor to be large enough to bring the final result back down to fit in 32 bits, though.

Otherwise you'd need extended-precision division, which is possible...
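For checking the numbers, the whole scheme can be modelled in C roughly like this. It is only a sketch under some assumptions: ncr_checked is a made-up name, everything is unsigned (so it corresponds to using mul rather than imul/jo in the loop), r > n is treated as an error, and 0 is the error value as in the question.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical C model of the scheme above.
   Returns C(n, r), or 0 on overflow or when r > n. */
uint32_t ncr_checked(uint32_t n, uint32_t r)
{
    if (r > n)
        return 0;                               /* assumption: an error */

    uint32_t big   = (r > n - r) ? r : n - r;   /* max(r, n-r) */
    uint32_t small = n - big;                   /* min(r, n-r) */
    if (small == 0)
        return 1;                               /* C(n, 0) = C(n, n) = 1 */

    /* divisor = small!, which must itself fit in 32 bits. */
    uint32_t divisor = 1;
    for (uint32_t i = 2; i <= small; i++) {
        uint64_t wide = (uint64_t)divisor * i;
        if (wide >> 32)
            return 0;
        divisor = (uint32_t)wide;
    }
    assert(divisor != 0);                       /* a factorial is at least 1 */

    /* prod = (big+1) * ... * n. Every partial product must fit in 32 bits
       before the next multiply; only the last multiply may produce a
       64-bit value, like the final mul into EDX:EAX. */
    uint64_t prod = 1;
    for (uint32_t i = big; i < n; ) {
        i++;                                    /* i runs over big+1 .. n */
        if (prod >> 32)
            return 0;                           /* overflow before the last step */
        prod *= i;                              /* 32 x 32 -> 64, cannot wrap */
    }

    /* Guard the 64/32 division: div would raise #DE if EDX >= divisor. */
    if ((uint32_t)(prod >> 32) >= divisor)
        return 0;
    return (uint32_t)(prod / divisor);
}
```

ncr_checked(18, 9) reproduces the 0xbdec (48620) result worked out above. Note that it still returns 0 for some inputs whose true result would fit in 32 bits, for example whenever min(r, n-r)! itself overflows, which is the limitation the answer already points out.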
