For the sake of curiosity I'm trying to read the flags register and print it out in a nice way.
I've tried reading it using gcc's asm keyword, but I can't get it to work. Any hints on how to do it? I'm running an Intel Core 2 Duo and Mac OS X. The following code is what I have. I hoped it would tell me if an overflow happened:
#include <stdio.h>
int main (void){
int a=10, b=0, bold=0;
printf("%d\n",b);
while(1){
a++;
__asm__ ("pushf\n\t"
"movl 4(%%esp), %%eax\n\t"
"movl %%eax , %0\n\t"
:"=r"(b)
:
:"%eax"
);
if(b!=bold){
printf("register changed \n %d\t to\t %d",bold , b);
}
bold = b;
}
}
This gives a segmentation fault. When I run gdb on it I get this:
Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_INVALID_ADDRESS at address: 0x000000005fbfee5c
0x0000000100000eaf in main () at asm.c:9
9 asm ("pushf \n\t"
You can use the PUSHF/PUSHFD/PUSHFQ instruction (see http://siyobik.info/main/reference/instruction/PUSHF%2FPUSHFD for details) to push the flag register onto the stack. From there on you can interpret it in C. Otherwise you can test directly (against the carry flag for unsigned arithmetic or the overflow flag for signed arithmetic) and branch.
(to be specific, to test for the overflow bit you can use JO (jump if set) and JNO (jump if not set) to branch -- it's bit #11 (0-based) in the register)
About the EFLAGS bit layout: http://en.wikibooks.org/wiki/X86_Assembly/X86_Architecture#EFLAGS_Register
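Once the flags value is sitting in a C variable, interpreting it is plain bit masking. A minimal sketch, using bit positions from the EFLAGS layout linked above; the names EFLAGS_CF, EFLAGS_ZF, EFLAGS_SF, EFLAGS_OF and flag_is_set are made up for this illustration:

```c
#include <stdint.h>

/* Masks for a few EFLAGS bits (bit numbers are 0-based). */
enum {
    EFLAGS_CF = 1u << 0,   /* carry: unsigned overflow  */
    EFLAGS_ZF = 1u << 6,   /* zero                      */
    EFLAGS_SF = 1u << 7,   /* sign                      */
    EFLAGS_OF = 1u << 11   /* overflow: signed overflow */
};

/* Returns 1 if every bit in mask is set in flags, 0 otherwise
   (used here with single-bit masks). */
int flag_is_set(uint32_t flags, uint32_t mask)
{
    return (flags & mask) == mask;
}
```

For example, flag_is_set(0x200a96, EFLAGS_OF) is 1 because 0xa96 contains 0x800, while flag_is_set(0x200206, EFLAGS_OF) is 0.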
A very crude Visual C syntax test (just wham-bam / some jumps to debug flow), since I don't know about the GCC syntax:
int test2 = 2147483647; // max 32-bit signed int (0x7fffffff)
unsigned int flags_w_overflow, flags_wo_overflow;
__asm
{
mov ebx, test2 // ebx = test value
// test for no overflow
xor eax, eax // eax = 0
add eax, ebx // add ebx
jno no_overflow // jump if no overflow
testoverflow:
// test for overflow
xor ecx, ecx // ecx = 0
inc ecx // ecx = 1
add ecx, ebx // overflow!
pushfd // store flags (32 bits)
jo overflow // jump if overflow
jmp done // jump if not overflown :(
no_overflow:
pushfd // store flags (32 bits)
pop edx // edx = flags w/o overflow
jmp testoverflow // back to next test
overflow:
jmp done // yeah we're done here :)
done:
pop eax // eax = flags w/overflow
mov flags_w_overflow, eax // store
mov flags_wo_overflow, edx // store
}
if (flags_w_overflow & (1 << 11)) __asm int 0x3 // overflow bit set correctly
if (flags_wo_overflow & (1 << 11)) __asm int 0x3 // overflow bit set incorrectly
return 0;
The compiler can reorder instructions, so you cannot rely on your lahf being next to the increment. In fact, there may not be an increment at all. In your code, you don't use the value of a, so the compiler can completely optimize it out.
So, either write the increment + check in assembler, or write it in C.
Also, lahf loads only ah (8 bits) from eflags, and the Overflow flag is outside of that. Better use pushf; pop %eax.
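For the "write it in C" option, one possible shape is to test against INT_MAX before incrementing, so the overflowing a++ never executes. This is only a sketch; checked_increment is a hypothetical helper name, not something from the question:

```c
#include <limits.h>
#include <stdbool.h>

/* Increments *a unless that would overflow.
   Returns true on success, false (leaving *a unchanged) if *a is INT_MAX. */
bool checked_increment(int *a)
{
    if (*a == INT_MAX)
        return false;   /* a++ would be signed overflow, so refuse */
    ++*a;
    return true;
}
```

Because the comparison happens in C, the compiler is free to implement it however it likes, and no assumption about instruction order is needed.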
Some tests:
#include <stdio.h>
int main (void){
int a=2147483640, b=0, bold=0;
printf("%d\n",b);
while(1){
a++;
__asm__ __volatile__ ("pushf \n\t"
"pop %%eax\n\t"
"movl %%eax, %0\n\t"
:"=r"(b)
:
:"%eax"
);
if((b & 0x800) != (bold & 0x800)){
printf("register changed \n %x\t to\t %x\n",bold , b);
}
bold = b;
}
}
$ gcc -Wall -o ex2 ex2.c
$ ./ex2 # Works by sheer luck
0
register changed
200206 to 200a96
register changed
200a96 to 200282
$ gcc -Wall -O -o ex2 ex2.c
$ ./ex2 # Doesn't work, the compiler hasn't even optimized yet!
0
This may be a case of the XY problem. To check for overflow you do not need to read the hardware overflow flag, because the flag can be calculated easily from the sign bits:
An illustrative example is what happens if we add 127 and 127 using 8-bit registers. 127+127 is 254, but using 8-bit arithmetic the result would be 1111 1110 binary, which is -2 in two's complement, and thus negative. A negative result out of positive operands (or vice versa) is an overflow. The overflow flag would then be set so the program can be aware of the problem and mitigate this or signal an error. The overflow flag is thus set when the most significant bit (here considered the sign bit) is changed by adding two numbers with the same sign (or subtracting two numbers with opposite signs). Overflow never occurs when the sign of two addition operands are different (or the sign of two subtraction operands are the same).
Internally, the overflow flag is usually generated by an exclusive or of the internal carry into and out of the sign bit. As the sign bit is the same as the most significant bit of a number considered unsigned, the overflow flag is "meaningless" and normally ignored when unsigned numbers are added or subtracted.
So the C implementation is
int add(int a, int b, int* overflowed)
{
// do an unsigned addition to prevent UB due to signed overflow
unsigned int r = (unsigned int)a + (unsigned int)b;
// if a and b have the same sign and the result's sign is different from a and b
// then the addition was overflowed
*overflowed = !!((~(a ^ b) & (a ^ r)) & 0x80000000);
return (int)r;
}
This way it works portably on any architecture, unlike your solution, which only works on x86. Smart compilers may recognize the pattern and use the overflow flag where possible. On most RISC architectures, such as MIPS or RISC-V, there is no flags register, and all signed/unsigned overflow must be checked in software by analyzing the sign bits like that.
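The 8-bit case from the quoted explanation (127 + 127) can be reproduced with the same sign-bit test narrowed to 8 bits. This is a sketch under the assumption that converting an out-of-range value to int8_t wraps, which is implementation-defined in C but is what GCC and Clang do; add8 is a hypothetical name:

```c
#include <stdint.h>

/* 8-bit signed add with software overflow detection. */
int8_t add8(int8_t a, int8_t b, int *overflowed)
{
    uint8_t ua = (uint8_t)a, ub = (uint8_t)b;
    uint8_t r  = (uint8_t)(ua + ub);   /* wraps modulo 256 */

    /* Overflow: operands share a sign bit, and the result's sign bit differs. */
    *overflowed = ((~(ua ^ ub) & (ua ^ r)) & 0x80) != 0;

    /* Assumption: out-of-range conversion wraps (true for GCC and Clang). */
    return (int8_t)r;
}
```

With a = 127 and b = 127 the raw result is 1111 1110, which reads back as -2, and *overflowed is set to 1; adding operands of opposite sign never sets it.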
Some compilers have intrinsics for checking overflow, like __builtin_add_overflow in Clang and GCC. With that intrinsic you can also easily see how the overflow is calculated when the flag is not used. For example, on ARM (AArch64) it's done like this:
add w3, w0, w1 # r = a + b
eon w0, w0, w1 # a = a ^ ~b
eor w1, w3, w1 # b = b ^ r
str w3, [x2] # store sum ([x2] = r)
and w0, w1, w0 # a = a & b = (a ^ ~b) & (b ^ r)
lsr w0, w0, 31 # overflowed = a >> 31
ret
which is just a variation of what I've written above
For unsigned int it's much easier:
unsigned int result = a + b; // a and b are unsigned int; the sum wraps instead of being UB
int overflowed = (result < a);
You can't assume anything about how GCC implemented the a++ operation, or whether it even did the computation before your inline asm, or before a function call.
You could make a an (unused) input to your inline asm, but gcc could still have chosen to use lea to copy-and-add instead of inc or add, or constant-propagation after inlining could have turned it into a mov-immediate.
And of course gcc could have done some other computation that writes FLAGS right before your inline asm.
There is no way to make a++; asm(...) safe for this. Stop now, you're on the wrong track. If you insist on using asm, you need to do the add or inc inside the asm so you can read the flags output. If you only care about the overflow flag, use SETCC, specifically seto %0, to create an 8-bit output value. Or better, use GCC6 flag-output syntax to tell the compiler that a boolean output result is in the OF condition in FLAGS at the end of your inline asm.
Also, signed overflow in C is undefined behaviour, so actually causing overflow in a++ is already a bug. It usually won't manifest itself if you somehow detect it after the fact, but if you use a as an array index or something, gcc may have widened it to 64-bit to avoid redoing sign-extension.
There are builtins for signed/unsigned add, sub, and mul (see the GCC manual) that avoid signed-overflow UB and tell you if there was overflow.
bool __builtin_add_overflow (type1 a, type2 b, type3 *res) is the generic version.
bool __builtin_sadd_overflow (int a, int b, int *res) is the signed int version.
bool __builtin_saddll_overflow (long long int a, long long int b, long long int *res) is the signed 64-bit long long version.
The compiler will attempt to use hardware instructions to implement these built-in functions where possible, like conditional jump on overflow after addition, conditional jump on carry etc.
There's a saddl version in case you want the operation for whatever size long is on the target platform. (For x86-64 gcc, int is always 32-bit, long long is always 64-bit, but long depends on Windows vs. non-Windows. For platforms like AVR, int would be 16-bit, and only long would be 32-bit.)
#include <stdbool.h>
int checked_add_int(int a, int b, bool *of) {
int result;
*of = __builtin_sadd_overflow(a, b, &result);
return result;
}
compiles with gcc -O3 for x86-64 System V to this asm, on Godbolt:
checked_add_int:
mov eax, edi
add eax, esi # can't use the normal lea eax, [rdi+rsi]
seto BYTE PTR [rdx]
and BYTE PTR [rdx], 1 # silly compiler, it's already 0/1
ret
ICC19 uses setcc into an integer register and then stores that, same difference as far as uops, but worse code-size.
After inlining to a caller that did if(of) {} it should just jo or jno instead of actually using setcc to create an integer 0/1; in general this should inline efficiently.
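As an illustration of a caller that branches on the result, here is a sketch of a saturating add built on __builtin_sadd_overflow, assuming GCC 5+ or a Clang that provides the builtin. The name saturating_add is hypothetical:

```c
#include <limits.h>

/* Signed add that clamps to INT_MAX / INT_MIN instead of overflowing. */
int saturating_add(int a, int b)
{
    int result;
    if (__builtin_sadd_overflow(a, b, &result)) {
        /* Overflow needs both operands to share a sign, so b's sign
           tells us which way the true sum went out of range. */
        return (b > 0) ? INT_MAX : INT_MIN;
    }
    return result;
}
```

Since the overflow result feeds an if directly, a compiler targeting x86 has the option of branching on OF right after the add rather than materializing a 0/1 value first.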
Also, since gcc7, there's a builtin to ask if an addition (after promotion to a given type) would overflow, without returning the value.
#include <stdbool.h>
int overflows(int a, int b) {
bool of = __builtin_add_overflow_p(a, b, (int)0);
return of;
}
compiles with gcc -O3 for x86-64 System V to this asm, also on Godbolt:
overflows:
xor eax, eax
add edi, esi
seto al
ret
See also Detecting signed overflow in C/C++
Others have offered good alternate code and reasons why what you're trying to do probably doesn't give the result you want, but the actual bug in your code is that you corrupted the stack state by pushing without popping. I would rewrite the asm as:
pushf
pop %0
Or you could just add $4,%%esp at the end of your asm to fix the stack pointer if you prefer the inefficient way. (That is for 32-bit code; a 64-bit build pushes 8 bytes and uses %rsp.)
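Put together as a complete function, a stack-balanced pushf/pop might look like the sketch below. It assumes GNU-style inline asm; read_flags and its return convention are hypothetical. On x86-64 it also steps over the System V red zone first, because a push from inline asm could otherwise overwrite locals the compiler keeps below the stack pointer. The value read says nothing reliable about earlier C statements, for the reasons given in the other answers.

```c
/* Stores the flags register in *out and returns 1 on x86 / x86-64;
   returns 0 and leaves *out untouched on other targets. */
int read_flags(unsigned long long *out)
{
#if defined(__GNUC__) && defined(__x86_64__)
    unsigned long long flags;
    __asm__ __volatile__(
        "lea -128(%%rsp), %%rsp\n\t"   /* skip the red zone; lea leaves FLAGS alone */
        "pushfq\n\t"                   /* push RFLAGS                               */
        "pop %0\n\t"                   /* pop it straight into the output register  */
        "lea 128(%%rsp), %%rsp"        /* restore the stack pointer                 */
        : "=r"(flags)
        :
        : "memory");
    *out = flags;
    return 1;
#elif defined(__GNUC__) && defined(__i386__)
    unsigned int flags;
    __asm__ __volatile__("pushfl\n\tpopl %0" : "=r"(flags) : : "memory");
    *out = flags;
    return 1;
#else
    (void)out;
    return 0;
#endif
}
```

Bit 1 of the flags register is reserved and always reads as 1, which makes a convenient sanity check on the returned value.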
The following C program will read the FLAGS register when compiled with GCC on any x86 or x86_64 machine following a calling convention in which integers are returned in %eax. You may need to pass the -zexecstack argument to the compiler.
#include<stdio.h>
#include<stdlib.h>
int(*f)()=(void*)L"\xc3589c";
int main( int argc, char **argv ) {
if( argc < 3 ) {
printf( "Usage: %s <augend> <addend>\n", *argv );
return 0;
}
int a=atoi(argv[1])+atoi(argv[2]);
int b=f();
printf("%d CF %d PF %d AF %d ZF %d SF %d TF %d IF %d DF %d OF %d IOPL %d NT %d RF %d VM %d AC %d VIF %d VIP %d ID %d\n", a, b&1, b/4&1, b>>4&1, b>>6&1, b>>7&1, b>>8&1, b>>9&1, b>>10&1, b>>11&1, b>>12&3, b>>14&1, b>>16&1, b>>17&1, b>>18&1, b>>19&1, b>>20&1, b>>21&1 );
}
The funny looking string literal disassembles to
0x0000000000000000: 9C pushfq
0x0000000000000001: 58 pop rax
0x0000000000000002: C3 ret