I expect to handle a huge number of data records: around 20 uint8_t keys, each with millions of <int, struct> pairs associated with it (ordered by int). These pairs are rather lightweight at ~10 bytes each, and need to be allocated dynamically.
Initially, I was using a std::map<uint8_t, std::vector<std::pair<int, struct>>>, but then I studied the overhead associated with vectors, namely the capacity() term in "3 machine words in total + sizeof(element) * capacity()", as seen here; capacity() "typically has room for up to twice the actual number of elements", which seems detrimental at this scale.
Instead of a vector, I could use a std::map, however the overhead of ~32 bytes per node also becomes very expensive for such light weight pairs.
I am unfamiliar with Boost and other C++ libraries, so was wondering whether anyone could advise on a solution where I could avoid manual dynamic memory allocation?
Edit: To clarify following a few questions in the comments, the struct stored will contain 3 shorts (to start with), and no further data structures. I anticipate the length of the vector to be no greater than 1.5*10^8, and understand this comes to ~1.4 GiB (thanks @dyp).
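For reference, the ~1.4 GiB figure is just 1.5*10^8 elements times ~10 bytes. Here is a minimal sketch of that arithmetic; the names Payload, Record, bytes_needed and to_gib are made up for illustration and are not from the question. One assumption worth flagging: on many platforms alignment padding makes an int plus 3 shorts occupy 12 bytes rather than 10, so checking sizeof on the real type is safer than trusting the 10-byte estimate.

```cpp
#include <cstdint>

// Hypothetical payload: three shorts, as described in the edit above.
struct Payload {
    std::int16_t a, b, c;
};

// Hypothetical element: the int the pairs are ordered by, plus the payload.
// The fields add up to 10 bytes, but sizeof(Record) is often 12 because of padding.
struct Record {
    std::int32_t key;
    Payload value;
};

// Bytes needed to store `count` elements of `elem_size` bytes each.
constexpr std::uint64_t bytes_needed(std::uint64_t count, std::uint64_t elem_size) {
    return count * elem_size;
}

// Convert a byte count to GiB (2^30 bytes).
constexpr double to_gib(std::uint64_t bytes) {
    return static_cast<double>(bytes) / (1024.0 * 1024.0 * 1024.0);
}
```

Calling to_gib(bytes_needed(150000000, 10)) gives the estimate quoted above, while to_gib(bytes_needed(150000000, sizeof(Record))) shows what the compiler's actual layout would cost.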
I suppose the question is rather, how to manage vector capacity() such that reallocation through reserve() is kept to a minimum. I am also unsure of the efficiency of shrink_to_fit() (C++11).
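To make the reserve() / shrink_to_fit() trade-off concrete, here is a small sketch, assuming an upper bound on the element count is known before filling the vector. Payload, Element, count_reallocations and trim are hypothetical names used only for this illustration. Note that shrink_to_fit() is a non-binding request, so an implementation is allowed to leave the capacity unchanged.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical payload standing in for the "struct" in the question.
struct Payload {
    short a, b, c;
};

using Element = std::pair<int, Payload>;

// Appends n elements and returns how many times the capacity changed
// during the appends. With reserve_first == true the storage is
// allocated once up front, so no growth happens while appending.
std::size_t count_reallocations(std::size_t n, bool reserve_first) {
    std::vector<Element> v;
    if (reserve_first) {
        v.reserve(n);
    }
    std::size_t reallocations = 0;
    std::size_t last_capacity = v.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(Element(static_cast<int>(i), Payload{0, 0, 0}));
        if (v.capacity() != last_capacity) {
            ++reallocations;
            last_capacity = v.capacity();
        }
    }
    return reallocations;
}

// Asks the vector to release unused capacity (C++11).
// The request is non-binding; size() and the contents are unchanged either way.
void trim(std::vector<Element>& v) {
    v.shrink_to_fit();
}
```

If only a loose upper bound is available, one option is to reserve() to that bound, fill the vector, and call trim() once at the end, which costs at most one extra reallocation.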
Following up on @NielKirk's point about std::vector<> instead of a map for the keys, with only 256 possibilities you could also consider std::array<> (or even a C-style array) for the keys.
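A rough sketch of the std::array idea, assuming the per-key storage stays a vector of pairs. KeyedStore, add and count are hypothetical names for illustration. A uint8_t key can only take values 0 to 255, so indexing a 256-element array with it cannot go out of range.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical payload standing in for the "struct" in the question.
struct Payload {
    short a, b, c;
};

using Element = std::pair<int, Payload>;

// One bucket per possible uint8_t key; unused keys cost only an empty vector.
class KeyedStore {
public:
    // Append an element to the bucket for `key`.
    void add(std::uint8_t key, int id, const Payload& p) {
        buckets_[key].push_back(Element(id, p));
    }

    // Number of elements stored for `key`.
    std::size_t count(std::uint8_t key) const {
        return buckets_[key].size();
    }

private:
    std::array<std::vector<Element>, 256> buckets_;
};
```

Compared with std::map<uint8_t, ...>, the lookup is a plain array index and there is no per-node overhead for the keys.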
As for the std::pair<int, struct> elements, an initial implementation had them as members of a std::vector<std::pair<int, struct>> collection, and you said
Instead of a vector, I could use a std::map, however the overhead of ~32 bytes per node also becomes very expensive for such light weight pairs.
which implies the int part of the element is unique, as you did not mention std::multimap. You could take a look at Google sparsehash (http://code.google.com/p/sparsehash/). From the project home page:
An extremely memory-efficient hash_map implementation. 2 bits/entry overhead! The SparseHash library contains several hash-map implementations, including implementations that optimize for space or speed.
These hashtable implementations are similar in API to SGI's hash_map class and the tr1 unordered_map class, but with different performance characteristics. It's easy to replace hash_map or unordered_map by sparse_hash_map or dense_hash_map in C++ code.
I've used it before, and never had a problem with it. Your uint8_t keys could index into a (std::vector/std::array/C-array) collection KCH of hashmaps. If you wanted to, you could even define KCH as a collection of objects, each containing a hashmap, so each KCH[i] can implement a convenient interface for working with the std::pair<int, struct> objects for that key. You'd have a "bad key" element as a default for non-key elements in the collection, referencing either a) a single empty dummy hashmap or b) a "bad key object" that handles an unexpected key value appropriately.
Something like this:
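#include <array>
#include <utility>
#include <sparsehash/sparse_hash_map>  // <google/sparse_hash_map> in older releases
struct Record { short a, b, c; };  // placeholder for the "struct" in the question
typedef std::pair<int, Record> myPair;
typedef google::sparse_hash_map<int, myPair> myCollectionType;
typedef myCollectionType::iterator myCollectionIter;
myCollectionType dummyHashMap;
std::array<myCollectionType, 256> keyedArray;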
Initialize all keyedArray elements to dummyHashMap, then fill in with different hash maps for valid keys.
Similarly, with containing objects:
class KeyedCollectionHandler {
public:
    virtual bool whatever(parm);
    ...
private:
    myCollectionType collection;
};
class BadKeyHandler : public KeyedCollectionHandler
{
public:
    virtual bool whatever(parm){
        // unknown or unexpected key, handle appropriately
    }
    ...
};
BadKeyHandler badKeyHandler;
Initialize the 256 keyed array elements to badKeyHandler, then fill in KeyedCollectionHandler objects for good key values.
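Here is a compilable sketch of the handler approach, with some assumptions stated up front: std::unordered_map stands in for google::sparse_hash_map so the example needs no external library, and the names Payload, insert, size, Dispatcher and enable are hypothetical. The array holds pointers rather than handler objects, since storing a BadKeyHandler by value in an array of base objects would slice it.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Hypothetical payload standing in for the "struct" in the question.
struct Payload {
    short a, b, c;
};

class KeyedCollectionHandler {
public:
    virtual ~KeyedCollectionHandler() = default;

    // Stores (or overwrites) the payload for `id`; returns true on success.
    virtual bool insert(int id, const Payload& p) {
        collection_[id] = p;
        return true;
    }

    std::size_t size() const { return collection_.size(); }

private:
    // Stand-in for google::sparse_hash_map<int, ...> (an assumption of this sketch).
    std::unordered_map<int, Payload> collection_;
};

class BadKeyHandler : public KeyedCollectionHandler {
public:
    // Unknown or unexpected key: reject the element instead of storing it.
    bool insert(int, const Payload&) override { return false; }
};

class Dispatcher {
public:
    // Every slot starts out pointing at the shared bad-key handler.
    Dispatcher() { handlers_.fill(&bad_); }

    // Copying would leave pointers into the source object's bad_ member.
    Dispatcher(const Dispatcher&) = delete;
    Dispatcher& operator=(const Dispatcher&) = delete;

    // Mark `key` as valid by pointing its slot at a real handler.
    void enable(std::uint8_t key, KeyedCollectionHandler& handler) {
        handlers_[key] = &handler;
    }

    // Forward to whichever handler currently owns `key`.
    bool insert(std::uint8_t key, int id, const Payload& p) {
        return handlers_[key]->insert(id, p);
    }

private:
    BadKeyHandler bad_;
    std::array<KeyedCollectionHandler*, 256> handlers_;
};
```

With this layout the caller never needs to test whether a key is valid; an insert on an unregistered key simply lands in the BadKeyHandler and returns false.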