简体   繁体   中英

C++ correct data structure

I am looking for the correct data structure to store the number of characters of a given type in an incoming stream, letter by letter. I know the size of the alphabet in advance (about 10), but the stream will be about a 1gb. Main criteria is rapid access. Could use a list with an appropriately chosen enum to make things clearer, but is this the best way?

Given the performance requirements, consider a layout that is contiguous in memory ; thus helping reduce the cache misses.

Something like;

const std::size_t SIZE = 10;
int count[SIZE] = {};
// or
std::vector<int> count(SIZE, 0);

If you need to store the count together with the character, then a "pair" may help;

struct Datum {
  Datum() : c('\0'), count(0) {}
  char c; // assuming the "alphabet" is in the char range
  int count;
};

std::vector<Datum> count(SIZE);

Herb Sutter and Bjarne offer some material and empirical evidence as to why std::vector should be favoured. As always, measurements should be made to verify the performance given your data structure, algorithm and associated data accesses etc.

一个简单的数组最有效:

int counters[SIZE_OF_ALPHABET];

For storing you can try make alphabet encoding table and simple char array (char is enough to store 1 of 10 different characters). Like:

map<int, char> m;
m['A'] = 1;
m['B'] = 2;
...
char data[SIZE];

for(int i = 0; i < SIZE; i++){
    int ch = read();
    data[i] = m[ch];
}

Or compress 2 items into one char.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM