
Map of vector of struct vs Vector of struct

I am making a small project program that involves inputting quotes that will later be saved into a database (in this case a .txt file). There are also commands the user can input, such as list (which shows the quotes by author) and random (which displays a random quote).

Here's the structure if I were to use a map (with the author string as the key):

 struct Information{
    string quoteContent;      // the text of the quote
    vector<string> tags;      // tags attached to the quote
 };

and here's the structure if I were to use a vector instead:

 struct Information{
    string author;            // the quote's author
    string quoteContent;      // the text of the quote
    vector<string> tags;      // tags attached to the quote
 };

note: The largest number of quotes I've had in the database is 200 (imported from a file).

I was just wondering which data structure would yield better performance. I'm still pretty new to this C++ thing, so any help would be appreciated!

For your data volumes it obviously doesn't matter from a performance perspective, but multimap will likely let you write shorter, more comprehensible and maintainable code. Regarding the general performance of vector vs map (which is good to know about, but likely only becomes relevant with millions of data elements or low-latency requirements)...

vector doesn't do any automatic sorting for you, so you'd probably push_back quotes as you read them, then do one std::sort once the data's loaded. After that you can find elements very quickly by author with std::binary_search or std::lower_bound, or identify insertion positions for new quotes using e.g. std::lower_bound. But if you want to insert a new quote thereafter, you have to move the existing vector elements from that position onwards out of the way to make room - that's relatively slow. As you're just doing a few ad-hoc insertions based on user input, the time to do that with only a few hundred quotes in the vector will be totally insignificant.

For the purposes of learning programming, though, it's good to understand that a multimap is arranged as a kind of branching binary tree, with pointers linking the data elements, which allows relatively quick insertion (and deletion). For some applications, following all those pointers around can be more expensive (i.e. slower) than vector's contiguous memory (which works better with the CPU cache), but in your case the data elements are all strings and vectors of strings that will likely (unless the Short String Optimisation kicks in) require jumping all over memory anyway.
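To make that concrete, here's a minimal sketch of the load-then-sort vector approach described above (the struct layout, names and sample data are placeholders for illustration, not the asker's actual code):

 #include <algorithm>
 #include <string>
 #include <vector>

 struct Information {
     std::string author;
     std::string quoteContent;
     std::vector<std::string> tags;
 };

 // Comparator that orders quotes by author, so binary searches work.
 bool byAuthor(const Information& a, const Information& b) {
     return a.author < b.author;
 }

 int main() {
     std::vector<Information> quotes;

     // Load phase: push_back everything, then sort once.
     quotes.push_back({"Twain", "...", {}});
     quotes.push_back({"Austen", "...", {}});
     std::sort(quotes.begin(), quotes.end(), byAuthor);

     // Later, an ad-hoc insertion: lower_bound finds the right position,
     // but insert() then has to shift every element after it to make room.
     Information newQuote{"Carroll", "...", {}};
     auto pos = std::lower_bound(quotes.begin(), quotes.end(), newQuote, byAuthor);
     quotes.insert(pos, newQuote);
 }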

In general, if author is naturally a key for your data, just use a multimap... it'll do all your operations in reasonable time - maybe not the fastest, but never particularly slow, unlike vector for post-data-population mid-container insertions (/deletions).
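For comparison, a sketch of the multimap version (again, the names and sample data are made up for illustration):

 #include <iostream>
 #include <map>
 #include <string>
 #include <vector>

 // The "map" layout from the question: the author lives in the key.
 struct Information {
     std::string quoteContent;
     std::vector<std::string> tags;
 };

 int main() {
     std::multimap<std::string, Information> quotes;

     // Insertion stays cheap whenever it happens; the tree keeps itself ordered.
     quotes.insert({"Twain", {"...", {"humour"}}});
     quotes.insert({"Twain", {"...", {"travel"}}});
     quotes.insert({"Austen", {"...", {}}});

     // "list by author": equal_range returns every entry for one key.
     auto range = quotes.equal_range("Twain");
     for (auto it = range.first; it != range.second; ++it)
         std::cout << it->first << ": " << it->second.quoteContent << '\n';
 }

A plain std::map would allow at most one quote per author, which is why multimap (or a map from author to a vector of quotes) fits this data better.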

It depends on how you intend to use it. Both data structures have their pros and cons.

Vectors:

  • Elements are accessed by position index, via at() or operator[]
  • There is no member find function; you have to use an algorithm such as std::find or std::find_if

Maps:

  • Elements can be looked up by key
  • Position indexing is not applicable; elements are stored and found by key

(Use std::unordered_map for better average lookup performance than std::map, if you don't need the keys kept in sorted order.)

Choose your data structure based on what you want to achieve; a sketch of both styles of lookup follows below.
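Here is a small sketch of the two lookup styles those bullet points describe (all names are placeholders; the unordered_map variant keys the quotes by author):

 #include <algorithm>
 #include <iostream>
 #include <string>
 #include <unordered_map>
 #include <vector>

 struct Information {
     std::string author;
     std::string quoteContent;
     std::vector<std::string> tags;
 };

 int main() {
     // Vector: lookup by author means a linear scan with an algorithm such as find_if.
     std::vector<Information> vec = {{"Twain", "...", {}}, {"Austen", "...", {}}};
     auto it = std::find_if(vec.begin(), vec.end(),
                            [](const Information& q) { return q.author == "Austen"; });
     if (it != vec.end())
         std::cout << it->quoteContent << '\n';

     // unordered_map: the key itself is the search handle (average O(1) lookup).
     // (The author is kept in both the key and the struct here only so that one
     // struct definition can serve both containers.)
     std::unordered_map<std::string, Information> byAuthor;
     byAuthor.insert({"Twain", {"Twain", "...", {}}});
     auto hit = byAuthor.find("Twain");
     if (hit != byAuthor.end())
         std::cout << hit->second.quoteContent << '\n';
 }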

The golden rule is: "When in doubt, measure."
i.e. write some tests and do some benchmarking.

Anyway, considering that you have circa 200 items, I don't think there will be a meaningful difference between the two cases on modern PC hardware. Big-O notation matters when N is big (e.g. tens of thousands, hundreds of thousands, millions, etc.).

vector tends to be simpler than map, and I'd use it as the default container of choice (unless your main goal is to access the items using the author's name as a key, in which case map seems more logically suited).

Another option might be to keep the vector sorted by author name, so you can use binary search (which is O(logN)) inside the vector, as sketched below.
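A sketch of that idea, using std::equal_range on a sorted vector to find every quote by one author in O(logN) (the struct and data are placeholders):

 #include <algorithm>
 #include <iostream>
 #include <string>
 #include <vector>

 struct Information {
     std::string author;
     std::string quoteContent;
     std::vector<std::string> tags;
 };

 bool byAuthor(const Information& a, const Information& b) {
     return a.author < b.author;
 }

 int main() {
     std::vector<Information> quotes = {
         {"Austen", "...", {}}, {"Twain", "...", {}}, {"Twain", "...", {}}};
     std::sort(quotes.begin(), quotes.end(), byAuthor);

     // equal_range does the binary search: O(logN) to locate all of one author's quotes.
     Information probe{"Twain", "", {}};
     auto range = std::equal_range(quotes.begin(), quotes.end(), probe, byAuthor);
     for (auto it = range.first; it != range.second; ++it)
         std::cout << it->author << ": " << it->quoteContent << '\n';
 }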
