Writing binary data in C++

Question

I am building an assembler for a rather unusual machine that a few other people and I are building. This machine takes 18-bit instructions, and I am writing the assembler in C++.

I have collected all of the instructions into a vector of 32-bit unsigned integers, none of which exceeds what an 18-bit unsigned number can represent.

However, as far as I can tell, there is no way to output such an unusual number of bits to a binary file in C++. Can anyone help me with this?

(I would also be willing to use C's stdio and FILE structures. However, there still does not appear to be any way to output such an arbitrary number of bits.)

Thank you for your help.

Edit: It looks like I didn't specify how the instructions will be stored in memory well enough.

Instructions are contiguous in memory. Say the instructions start at location 0 in memory:

The first instruction will be at bit 0, the second at bit 18, the third at bit 36, and so on.

There are no gaps and no padding between instructions. There can be a few superfluous 0s at the end of the program if needed.

The machine uses big-endian instructions, so an instruction stored as 3 should map to: 000000000000000011
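
To make this layout concrete, here is a small sketch. The helper names `bit_offset` and `to_bits18` are made up for illustration, and it assumes instructions are packed back to back starting at bit 0, most significant bit first.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: bit position at which instruction `index` starts,
// assuming 18-bit instructions packed with no gaps from bit 0.
constexpr std::size_t bit_offset(std::size_t index) {
    return index * 18;
}

// Hypothetical helper: render the low 18 bits of `word` as text,
// most significant bit first.
std::string to_bits18(std::uint32_t word) {
    assert(word < (1u << 18));  // the value must fit in 18 bits
    return std::bitset<18>(word).to_string();
}
```

With these, `to_bits18(3)` gives 000000000000000011, and `bit_offset(2)` is 36, which falls 4 bits into byte 4 of the file (counting bytes from 0).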

Answer 1

Keep an eight-bit accumulator.
Shift bits from the current instruction into the accumulator until either:
- The accumulator is full; or
- No bits remain of the current instruction.
Whenever the accumulator is full:
- Write its contents to the file and clear it.
Whenever no bits remain of the current instruction:
- Move to the next instruction.
When no instructions remain:
- Shift zeros into the accumulator until it is full.
- Write its contents.
- End.

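For n instructions, this leaves 8 - (18n mod 8) zero bits after the last instruction. Note that when 18n is a multiple of 8 (every fourth instruction), the steps as written emit a whole zero byte; skip the final write if the accumulator is empty to avoid that.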
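
The steps above could be sketched in C++ roughly as follows. This is only one possible reading: the names `pack18` and `write_bytes` are invented here, bits are shifted in most significant bit first to match the big-endian layout in the question, and the final padding step is skipped when the accumulator is empty.

```cpp
#include <cassert>
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

// Pack the low 18 bits of each word into bytes, MSB first.
std::vector<std::uint8_t> pack18(const std::vector<std::uint32_t>& instructions) {
    std::vector<std::uint8_t> out;
    std::uint8_t acc = 0;  // the eight-bit accumulator
    int filled = 0;        // how many bits the accumulator currently holds

    for (std::uint32_t word : instructions) {
        assert(word < (1u << 18));  // each instruction must fit in 18 bits
        for (int bit = 17; bit >= 0; --bit) {
            acc = static_cast<std::uint8_t>((acc << 1) | ((word >> bit) & 1u));
            if (++filled == 8) {    // accumulator full: emit it and clear it
                out.push_back(acc);
                acc = 0;
                filled = 0;
            }
        }
    }
    // No instructions remain: pad the partial byte with zeros.
    // Assumption: nothing is written if the accumulator is empty.
    if (filled > 0) {
        out.push_back(static_cast<std::uint8_t>(acc << (8 - filled)));
    }
    return out;
}

// Write the packed bytes to a file opened in binary mode.
bool write_bytes(const std::string& path, const std::vector<std::uint8_t>& bytes) {
    std::ofstream file(path, std::ios::binary);
    file.write(reinterpret_cast<const char*>(bytes.data()),
               static_cast<std::streamsize>(bytes.size()));
    return static_cast<bool>(file);
}
```

For a single instruction with value 3, `pack18` produces the bytes 00 00 C0: sixteen zero bits, the two one bits, then six bits of padding.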

Answer 2

You could maybe represent your data in a std::bitset and then write the bitset to a file. That won't work directly with fstream's write function, but there is a way, described here ...

Answer 3

There are a lot of ways you can achieve the same end result (I am assuming the end result is a tight packing of these 18 bits).

A simple method would be to create a bit-packer class that accepts the 32-bit words and generates a buffer that packs the low 18 bits of each entry. The class would need to do some bit shifting, but I don't expect it to be particularly difficult. The last byte can have a few zero bits at the end if the original vector length is not a multiple of 4. Once you give all your words to this class, you can get a packed data buffer, and write it to a file.
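
A minimal sketch of such a class is shown below. The class name `BitPacker18` and its `push`/`bytes` interface are assumptions made for illustration, and it assumes most-significant-bit-first packing. Instead of moving one bit at a time, it keeps leftover bits in a 32-bit scratch word and emits whole bytes.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical bit-packer; the name and interface are illustrative only.
class BitPacker18 {
public:
    // Append the low 18 bits of `word`, most significant bit first.
    void push(std::uint32_t word) {
        assert(word < (1u << 18));
        pending_ = (pending_ << 18) | word;  // at most 7 + 18 = 25 bits held
        count_ += 18;
        while (count_ >= 8) {                // emit every complete byte
            count_ -= 8;
            buffer_.push_back(static_cast<std::uint8_t>(pending_ >> count_));
        }
        pending_ &= (1u << count_) - 1u;     // keep only the leftover bits
    }

    // Return the packed bytes, zero-padding the last partial byte.
    std::vector<std::uint8_t> bytes() const {
        std::vector<std::uint8_t> out = buffer_;
        if (count_ > 0) {
            out.push_back(static_cast<std::uint8_t>(pending_ << (8 - count_)));
        }
        return out;
    }

private:
    std::vector<std::uint8_t> buffer_;  // completed bytes
    std::uint32_t pending_ = 0;         // bits not yet forming a full byte
    int count_ = 0;                     // number of valid bits in pending_
};
```

Four pushed words fill exactly nine bytes (4 x 18 = 72 bits), which is why padding only appears when the word count is not a multiple of 4. The result of `bytes()` can then be written with a binary-mode `std::ofstream`.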

Answer 4

The short answer: Your C++ program should output the 18-bit values in the format expected by your unusual machine.

We need more information: specifically, the format that your "unusual machine" expects, or more precisely, the format that your assembler should be outputting. Once you understand the format of the output you're generating, the answer should be straightforward.

One possible format (I'm making things up here) would take two of your 18-bit instructions:

         instruction 1       instruction 2     ...
       MSB            LSB  MSB            LSB  ...
bits → ABCDEFGHIJKLMNOPQR  abcdefghijklmnopqr  ...

...and write them in an 8-bits/byte file thus:

KLMNOPQR CDEFGHIJ 000000AB klmnopqr cdefghij 000000ab ...

...this is basically arranging the values in "little-endian" form, with 6 zero bits padding the 18-bit values out to 24 bits.

But I'm making assumptions about the padding, the little-endianness, the number of bits per byte, etc. Without more information, it's hard to say if this answer is even remotely near correct, or if it is exactly what you want.
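
If the padded little-endian layout above did turn out to be the right one, producing it could look like this sketch. The function name `encode_padded_le` is invented here, and the byte order and padding are exactly the assumptions already listed.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Encode each 18-bit instruction as 3 bytes, least significant byte first.
// The top 6 bits of the third byte are zero padding.
std::vector<std::uint8_t> encode_padded_le(const std::vector<std::uint32_t>& instructions) {
    std::vector<std::uint8_t> out;
    out.reserve(instructions.size() * 3);
    for (std::uint32_t word : instructions) {
        assert(word < (1u << 18));  // each instruction must fit in 18 bits
        out.push_back(static_cast<std::uint8_t>(word & 0xFF));          // KLMNOPQR
        out.push_back(static_cast<std::uint8_t>((word >> 8) & 0xFF));   // CDEFGHIJ
        out.push_back(static_cast<std::uint8_t>((word >> 16) & 0x03));  // 000000AB
    }
    return out;
}
```

Each instruction costs a full 24 bits in this format, so it trades file size for byte-aligned instructions that are easy to read back.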

Another possibility is a tight packing:

ABCDEFGH IJKLMNOP QRabcdef ghijklmn opqr0000

or

ABCDEFGH IJKLMNOP abcdefQR ghijklmn 0000opqr

...but I've made assumptions about where the corner cases go here.

Answer 5

Just output them to the file as 32-bit unsigned integers, just as you have in memory, with the endianness that you prefer.

Then, in the loader, EEPROM writer, JTAG tool, or whatever method you use to send the code to the machine, drop the 14 most significant bits of each 32-bit word that is read and send the remaining 18 bits to the target.

Unless, of course, you have written a FAT driver for your machine...
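
As a rough sketch of this approach, the assembler side serializes whole 32-bit words and a hypothetical loader masks off the upper 14 bits. The function names are invented for illustration, and big-endian byte order is an arbitrary choice here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assembler side: store each word as 4 bytes, most significant byte first.
std::vector<std::uint8_t> encode_words_be(const std::vector<std::uint32_t>& instructions) {
    std::vector<std::uint8_t> out;
    out.reserve(instructions.size() * 4);
    for (std::uint32_t word : instructions) {
        out.push_back(static_cast<std::uint8_t>(word >> 24));
        out.push_back(static_cast<std::uint8_t>(word >> 16));
        out.push_back(static_cast<std::uint8_t>(word >> 8));
        out.push_back(static_cast<std::uint8_t>(word));
    }
    return out;
}

// Loader side (hypothetical): rebuild each word, keep only the low 18 bits.
std::vector<std::uint32_t> decode_words_be(const std::vector<std::uint8_t>& bytes) {
    assert(bytes.size() % 4 == 0);  // expect whole 32-bit words
    std::vector<std::uint32_t> out;
    for (std::size_t i = 0; i + 3 < bytes.size(); i += 4) {
        std::uint32_t word = (static_cast<std::uint32_t>(bytes[i]) << 24) |
                             (static_cast<std::uint32_t>(bytes[i + 1]) << 16) |
                             (static_cast<std::uint32_t>(bytes[i + 2]) << 8) |
                             static_cast<std::uint32_t>(bytes[i + 3]);
        out.push_back(word & 0x3FFFFu);  // drop the 14 most significant bits
    }
    return out;
}
```

The file is larger than a tight packing (32 bits per instruction instead of 18), but the assembler needs no bit shifting across byte boundaries.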

5 answers

solution1
3 ACCEPTED 2011-10-16 22:26:10

solution2
2 2011-10-16 22:12:10

solution3
2 2011-10-16 22:13:28

solution4
0 2011-10-16 22:23:23

solution5
0 2011-10-16 22:42:13
