Possible to convert std::vector of std::pairs into a byte array?

Question

I am wondering if it is possible to convert a vector of pairs into a byte array.

Here's a small example of creating the vector of pairs:

int main(int argc, char *argv[])
{
    PBYTE FileData, FileData2, FileData3;
    DWORD FileSize, FileSize2, FileSize3;

    /* Here I read 3 files + their sizes and fill the above variables. */

    //Here I create the vector of std::pairs.
    std::vector<std::pair<PBYTE, DWORD>> DataVector
    {
        { FileData, FileSize }, //Each pair always contains file data + file size.
        { FileData2, FileSize2 },
        { FileData3, FileSize3 }
    };

    std::cin.ignore(2);
    return 0;
}

Is it possible to convert this vector into a byte array (for compressing, and writing to disk, etc)?

Here is what I tried, but I couldn't even get the size right:

PVOID DataVectorArr = NULL;

DWORD DataVectorArrSize = DataVector.size() * sizeof DataVector[0];

if ((DataVectorArr = malloc(DataVectorArrSize)) != NULL)
{
    memcpy(DataVectorArr, &DataVector[0], DataVectorArrSize);
}

std::cout << DataVectorArrSize;

//... Here I tried to write the DataVectorArr to disk, which obviously fails because the size isn't correct. I am also not sure whether DataVectorArr contains the DataVector now.

if (DataVectorArr != NULL) free(DataVectorArr); // memory from malloc must be released with free, not delete

Enough code. Is it even possible, or am I doing it wrong? If I am doing it wrong, what would be the solution?

Regards, Okkaaj

Edit: If it's unclear what I am trying to do, read the following (which I commented earlier):

Yes, I am trying to cast the vector of pairs to a PCHAR or PBYTE so I can store it to disk using WriteFile. After it is stored, I can read it from disk as a byte array and parse it back into a vector of pairs. Is this possible? I got the idea from converting / casting a struct to a byte array and back (read more here: Converting struct to byte and back to struct), but I am not sure if this is possible with std::vector instead of structures.

Answer 1

Get rid of the malloc and make use of RAII for this:

std::vector<BYTE> bytes;
for (auto const& x : DataVector)
    bytes.insert(bytes.end(), x.first, x.first+x.second);

// bytes now contains all the file data butted end-to-end.
std::cout << bytes.size() << '\n';

To avoid repeated reallocations as the vector grows, you can compute the total size first, then .reserve() the space ahead of time:

std::size_t total_len = 0;
for (auto const& x : DataVector)
    total_len += x.second;

std::vector<BYTE> bytes;
bytes.reserve(total_len);
for (auto const& x : DataVector)
    bytes.insert(bytes.end(), x.first, x.first+x.second);

// bytes now contains all the file data butted end-to-end.
std::cout << bytes.size() << '\n';

But if all you want to do is dump these contiguously to disk, then why not simply:

std::ofstream outp("outfile.bin", std::ios::out|std::ios::binary);
for (auto const& x : DataVector)
    outp.write(reinterpret_cast<const char*>(x.first), x.second); // PBYTE is unsigned char*, so reinterpret_cast is required
outp.close();

skipping the middle man entirely.

And honestly, unless there is a good reason to do otherwise, it is highly likely your DataVector would be better off as simply a std::vector< std::vector<BYTE> > in the first place.
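
As a minimal sketch of that suggestion, assuming BYTE is an alias for unsigned char (as in the Windows headers) and using a hypothetical helper name, concatenate, each file can own its bytes in a std::vector<BYTE>. The size then travels with the data and nothing has to be freed by hand:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using BYTE = unsigned char; // assumption: matches the Windows typedef

// Hypothetical helper: joins every buffer end-to-end into one byte vector.
std::vector<BYTE> concatenate(const std::vector<std::vector<BYTE>>& files)
{
    // First pass: add up the sizes so the result is allocated only once.
    std::size_t total_len = 0;
    for (auto const& f : files)
        total_len += f.size();

    std::vector<BYTE> bytes;
    bytes.reserve(total_len);

    // Second pass: append each buffer after the previous one.
    for (auto const& f : files)
        bytes.insert(bytes.end(), f.begin(), f.end());

    assert(bytes.size() == total_len); // sanity check in debug builds
    return bytes;
}
```

With this layout the pair's second member becomes unnecessary, since each inner vector already reports its own size().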

Update

If recovery is needed, you can't just do this as above. The minimal missing piece is a description of the data itself; in this case, the actual length of each pair's segment. To accomplish that, the length must be stored along with the data. Doing that is trivial unless you also need the file to be portable across platforms.

If that last sentence made you raise your brow, consider the problems with doing something as simple as this:

std::ofstream outp("outfile.bin", std::ios::out|std::ios::binary);
for (auto const& x : DataVector)
{
    uint64_t len = static_cast<uint64_t>(x.second); // the length, not the pointer
    outp.write(reinterpret_cast<const char *>(&len), sizeof(len));
    outp.write(reinterpret_cast<const char*>(x.first), x.second);
}
outp.close();

Well, now you can read each file by doing this:

Read a uint64_t to obtain the byte length of the data to follow
Read the data of that length
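
The two read steps above can be sketched as follows. This is only an illustration under assumptions: BYTE is taken to be unsigned char, the names write_records and read_records are hypothetical, the buffers are held as std::vector<BYTE> rather than raw pointers, and the length is stored in the host's native byte order, so it only round-trips when reader and writer share the same platform layout.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sstream>   // std::istream, std::ostream, std::stringstream
#include <utility>
#include <vector>

using BYTE = unsigned char; // assumption: matches the Windows typedef

// Hypothetical writer: [uint64_t length, native byte order][data], repeated.
void write_records(std::ostream& out, const std::vector<std::vector<BYTE>>& files)
{
    for (auto const& f : files)
    {
        std::uint64_t len = f.size();
        out.write(reinterpret_cast<const char*>(&len), sizeof(len));
        out.write(reinterpret_cast<const char*>(f.data()),
                  static_cast<std::streamsize>(f.size()));
    }
}

// Hypothetical reader for the same layout.
std::vector<std::vector<BYTE>> read_records(std::istream& in)
{
    assert(in.good()); // precondition: the caller passes a readable stream

    std::vector<std::vector<BYTE>> records;
    std::uint64_t len = 0;

    // Step 1: read the length prefix; the loop ends when no prefix is left.
    while (in.read(reinterpret_cast<char*>(&len), sizeof(len)))
    {
        // Step 2: read exactly that many bytes of data.
        std::vector<BYTE> data(static_cast<std::size_t>(len));
        if (!in.read(reinterpret_cast<char*>(data.data()),
                     static_cast<std::streamsize>(len)))
            break; // truncated record: drop it and stop
        records.push_back(std::move(data));
    }
    return records;
}
```

Both functions take generic streams, so the same code can be exercised against a std::stringstream in memory or a std::ifstream / std::ofstream opened in binary mode. A production reader would also want to bound len before allocating, since the value comes from the file.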

But this has inherent problems. It isn't portable at all. The endian representation of the reader's platform had better match that of the writer, or this fails outright. To get around this limitation, the length preamble must be written in a platform-independent manner, which is tedious and a foundational reason why serialization libraries and their protocols exist in the first place.
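
One common way to make the length preamble platform-independent is to fix the byte order yourself and emit the value one byte at a time. The following is a minimal sketch, assuming a little-endian on-disk format (an arbitrary choice; any order works as long as reader and writer agree). The helper names encode_u64_le and decode_u64_le are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: rebuilds a 64-bit value from 8 bytes stored
// least-significant byte first, whatever the host's own byte order is.
std::uint64_t decode_u64_le(const std::array<unsigned char, 8>& in)
{
    std::uint64_t value = 0;
    for (std::size_t i = 0; i < 8; ++i)
        value |= static_cast<std::uint64_t>(in[i]) << (8 * i);
    return value;
}

// Hypothetical helper: splits a 64-bit value into 8 bytes,
// least-significant byte first.
std::array<unsigned char, 8> encode_u64_le(std::uint64_t value)
{
    std::array<unsigned char, 8> out{};
    for (std::size_t i = 0; i < 8; ++i)
        out[i] = static_cast<unsigned char>((value >> (8 * i)) & 0xFF);

    assert(decode_u64_le(out) == value); // round-trip check in debug builds
    return out;
}
```

The eight bytes returned by encode_u64_le would be written in place of the raw uint64_t, and the reader would pass the eight bytes it reads to decode_u64_le. Because the shifts operate on values rather than on memory layout, the result does not depend on the endianness of either machine.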

If you haven't second-guessed what you're doing and how you're doing it by this point, you may want to read this again.

solution1
3 ACCPTED 2014-09-29 18:49:10