Reading CF, PF, ZF, SF, OF

Question

I am writing a virtual machine for my own assembly language. When I perform operations such as addition, I want to set the carry, parity, zero, sign and overflow flags the same way the x86-64 architecture sets them.

Notes:

I am using Microsoft Visual C++ 2015 & Intel C++ Compiler 16.0
I am compiling as a Win64 application.
My virtual machine (currently) only does arithmetic on 8-bit integers
I'm not (currently) interested in any other flags (eg AF)

My current solution is using the following function:

void update_flags(uint16_t input)
{
    Registers::flags.carry = (input > UINT8_MAX);
    Registers::flags.zero = (input == 0);
    Registers::flags.sign = (input < 0);
    Registers::flags.overflow = (int16_t(input) > INT8_MAX || int16_t(input) < INT8_MIN);

    // I am assuming that overflow is handled by truncation
    uint8_t input8 = uint8_t(input);
    // The parity flag
    int ones = 0;
    for (int i = 0; i < 8; ++i)
        if (input8 & (1 << i) != 0) ++ones;

    Registers::flags.parity = (ones % 2 == 0);
}

Which for addition, I would use as follows:

uint8_t a, b;
update_flags(uint16_t(a) + uint16_t(b));
uint8_t c = a + b;

EDIT: To clarify, I want to know if there is a more efficient or neater way of doing this (such as by accessing RFLAGS directly). Also, my code may not work for other operations (e.g. multiplication).

EDIT 2: I have now updated my code to this:

void update_flags(uint32_t result)
{
    Registers::flags.carry = (result > UINT8_MAX);
    Registers::flags.zero = (result == 0);
    Registers::flags.sign = (int32_t(result) < 0);
    Registers::flags.overflow = (int32_t(result) > INT8_MAX || int32_t(result) < INT8_MIN);
    Registers::flags.parity = (_mm_popcnt_u32(uint8_t(result)) % 2 == 0);
}

One more question: will my code for the carry flag work properly? I also want it to be set correctly for "borrows" that occur during subtraction.

Note: The assembly language I am virtualising is of my own design, meant to be simple and based on Intel's implementation of x86-64 (i.e. Intel 64), so I would like these flags to behave in mostly the same way.

Answer 1

TL;DR: use lazy flag evaluation, see below.

input is a weird name. Most ISAs update flags based on the result of an operation, not the inputs. You're looking at the 16-bit result of an 8-bit operation, which is an interesting approach. In the C, you should just use unsigned int, which is guaranteed to be at least 16 bits wide. It will compile to better code on x86, where unsigned is 32-bit. 16-bit ops take an extra prefix and can lead to partial-register slowdowns.

That might help with the 8-bit x 8-bit -> 16-bit mul problem you noted, depending on how you want to define the flag-updating for the mul instruction in the architecture you're emulating.

I don't think your overflow detection is correct. See this tutorial linked from the x86 tag wiki for how it's done.
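For reference, one common way to derive OF for an addition is to check whether both operands have the same sign bit and the truncated result has a different one. The sketch below assumes 8-bit operands; the `Flags` struct and `add8_flags` name are hypothetical and not taken from the question.

```cpp
#include <cstdint>

// Hypothetical flag container; the names are illustrative only.
struct Flags {
    bool carry, zero, sign, overflow;
};

// Compute x86-style CF, ZF, SF and OF for the 8-bit addition a + b.
inline Flags add8_flags(std::uint8_t a, std::uint8_t b) {
    unsigned wide = unsigned(a) + unsigned(b);  // up to 9 significant bits
    std::uint8_t r = std::uint8_t(wide);        // truncated 8-bit result
    Flags f;
    f.carry = wide > 0xFFu;                     // carry out of bit 7
    f.zero = r == 0;                            // based on the truncated result
    f.sign = (r & 0x80u) != 0;                  // bit 7 of the truncated result
    // Signed overflow: both operands differ in sign from the result,
    // which can only happen when the operands share a sign.
    f.overflow = (((a ^ r) & (b ^ r)) & 0x80u) != 0;
    return f;
}
```

For example, 0x7F + 0x01 sets OF and SF but not CF, while 0xFF + 0x01 sets CF and ZF but not OF.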

This will probably not compile to very fast code, especially the parity flag. Do you need the ISA you're emulating/designing to have a parity flag? You never said you're emulating an x86, so I assume it's some toy architecture you're designing yourself.

An efficient emulator (esp. one that needs to support a parity flag) would probably benefit a lot from some kind of lazy flag evaluation. Save a value that you can compute flags from if needed, but don't actually compute anything until you get to an instruction that reads flags. Most instructions only write flags without reading them, and they just save the uint16_t result into your architectural state. Flag-reading instructions can either compute just the flag they need from that saved uint16_t, or compute all of them and store that somehow.
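A minimal sketch of that idea, restricted to 8-bit addition: the `LazyFlags` type and its member names are hypothetical. It also saves the two operands, which is an assumption beyond what is described above, because OF cannot be recovered from the widened result alone. A parity accessor could be added in the same style.

```cpp
#include <cstdint>

// Hypothetical lazy-flags state: a flag-writing instruction only records
// its operands and widened result; each flag is computed when it is read.
struct LazyFlags {
    std::uint8_t a = 0, b = 0;  // operands of the last recorded addition
    unsigned wide = 0;          // untruncated result of that addition

    // Called by a flag-writing add instruction: cheap, no flag math yet.
    void record_add(std::uint8_t x, std::uint8_t y) {
        a = x;
        b = y;
        wide = unsigned(x) + unsigned(y);
    }

    std::uint8_t result() const { return std::uint8_t(wide); }

    // Called only by flag-reading instructions (e.g. a conditional jump).
    bool carry() const { return wide > 0xFFu; }
    bool zero() const { return result() == 0; }
    bool sign() const { return (result() & 0x80u) != 0; }
    bool overflow() const {
        return (((a ^ result()) & (b ^ result())) & 0x80u) != 0;
    }
};
```

Supporting other operations would need an extra field recording which operation was last executed, since carry and overflow are derived differently for, say, subtraction.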

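Assuming you can't get the compiler to actually read PF from the result, you might try _mm_popcnt_u32((uint8_t)x) & 1. Note that this is 1 for an odd number of set bits, whereas x86 sets PF when the low byte has an even number of set bits, so invert it if you want x86 semantics. Or, horizontally XOR all the bits together: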

x  = (x&0b00001111) ^ (x>>4);   // x is the 8-bit result
x  = (x&0b00000011) ^ (x>>2);
PF = (x&0b00000001) ^ (x>>1);   // 1 for an odd number of set bits; invert for x86 PF. Tweaking this to produce better asm is probably possible
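As a self-contained version of the XOR-folding approach, here is a sketch that returns the flag with the x86 convention (set for an even number of 1 bits in the low byte). The function name `parity_flag` is hypothetical.

```cpp
#include <cstdint>

// Returns true when the low 8 bits of x contain an even number of 1 bits,
// matching the x86 PF convention. Higher bits are ignored.
inline bool parity_flag(unsigned x) {
    x &= 0xFFu;   // PF only looks at the low byte
    x ^= x >> 4;  // fold the high nibble into the low nibble
    x ^= x >> 2;  // fold into the low two bits
    x ^= x >> 1;  // bit 0 is now the XOR of all eight original bits
    return (x & 1u) == 0;
}
```

Masking only at the start and reading only bit 0 at the end means the intermediate high bits can be left as garbage, which saves a few AND operations compared with masking at every step.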

I doubt any of the major compilers can peephole-optimize a bunch of checks on a result into LAHF + SETO al, or a PUSHF. Compilers can be led into using a flag condition to detect integer overflow to implement saturating addition, for example. But having it figure out that you want all the flags, and actually use LAHF instead of a series of setcc instructions, is probably not possible. The compiler would need a pattern-recognizer for when it can use LAHF, and probably nobody's implemented that because the use-cases are so vanishingly rare.

There's no C/C++ way to directly access flag results of an operation, which makes C a poor choice for implementing something like this. IDK if any other languages do have flag results, other than asm.

I expect you could gain a lot of performance by writing parts of the emulation in asm, but that would be platform-specific. More importantly, it's a lot more work.

Answer 2

I appear to have solved the problem by splitting the argument to update_flags into an unsigned and a signed result, as follows:

void update_flags(int16_t unsigned_result, int16_t signed_result)
{
    // ZF and SF are taken from the truncated 8-bit result, as on x86
    Registers::flags.zero = uint8_t(unsigned_result) == 0;
    Registers::flags.sign = (signed_result & 0x80) != 0;
    Registers::flags.carry = unsigned_result < 0 || unsigned_result > UINT8_MAX;
    Registers::flags.overflow = signed_result < INT8_MIN || signed_result > INT8_MAX;
}

For addition (which should produce the correct result for both signed & unsigned inputs) I would do the following:

int8_t a, b;
int16_t signed_result = int16_t(a) + int16_t(b);
int16_t unsigned_result = int16_t(uint8_t(a)) + int16_t(uint8_t(b));
update_flags(unsigned_result, signed_result);
int8_t c = a + b;

And signed multiplication I would do the following:

int8_t a, b;
int16_t result = int16_t(a) * int16_t(b);
update_flags(result, result);
int8_t c = a * b;

And so on for the other operations that update the flags
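As an illustration of one of those other operations, here is a sketch of subtraction using the same signed/unsigned widening scheme. The `Flags` struct and `sub8_flags` name are hypothetical; ZF and SF are taken from the truncated 8-bit result, which is what x86 does. A negative unsigned-widened difference corresponds to a borrow, which x86 reports in CF.

```cpp
#include <cstdint>

// Hypothetical flag container; the names are illustrative only.
struct Flags {
    bool zero, sign, carry, overflow;
};

// Compute x86-style flags for the 8-bit subtraction a - b.
inline Flags sub8_flags(std::int8_t a, std::int8_t b) {
    int signed_result = int(a) - int(b);                                // sign-extended operands
    int unsigned_result = int(std::uint8_t(a)) - int(std::uint8_t(b));  // zero-extended operands
    Flags f;
    f.zero = std::uint8_t(unsigned_result) == 0;  // truncated result is zero
    f.sign = (unsigned_result & 0x80) != 0;       // bit 7 of the truncated result
    f.carry = unsigned_result < 0;                // a borrow occurred
    f.overflow = signed_result < INT8_MIN || signed_result > INT8_MAX;
    return f;
}
```

For example, 0 - 1 produces a borrow (CF set, SF set, OF clear), while -128 - 1 overflows the signed range without borrowing (OF set, CF clear).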

Note: I am assuming here that int16_t(a) sign extends, and int16_t(uint8_t(a)) zero extends.
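That assumption can be checked at compile time. Converting an integer to a wider type that can represent its value preserves that value, so the sketch below should compile on any conforming implementation; the helper names `sign_extend` and `zero_extend` are hypothetical.

```cpp
#include <cstdint>

// Widening a signed 8-bit value keeps its value, i.e. sign-extends.
constexpr std::int16_t sign_extend(std::int8_t v) {
    return std::int16_t(v);
}

// Going through uint8_t first reinterprets the byte as 0..255,
// so the widening step zero-extends.
constexpr std::int16_t zero_extend(std::int8_t v) {
    return std::int16_t(std::uint8_t(v));
}

static_assert(sign_extend(std::int8_t(-1)) == -1, "int8_t widens by sign extension");
static_assert(zero_extend(std::int8_t(-1)) == 255, "via uint8_t widens by zero extension");
```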

I have also decided against having a parity flag. My _mm_popcnt_u32 solution should work if I change my mind later.

PS: Thank you to everyone who responded, it was very helpful. Also, if anyone can spot any mistakes in my code, that would be appreciated.

2 answers

solution1
1 2016-03-26 06:32:56

solution2
0 ACCEPTED 2016-03-26 12:48:05
