
Best approach to find a key in more than one (multiple) 'std::map's or 'std::set's?

I will use a std::map example with minimal data.
I have 2 maps as below:

map<string, Object*> map_ShortKey; // keys are single English words
map<string, Object*> map_LongKey; // keys are concatenated English words

map_ShortKey is populated at the beginning of the program with around 50 elements and remains constant throughout. map_LongKey, however, grows continuously throughout the program and may reach 1000-10000 elements.

When I want to search for a word inside these maps, what is the best approach?

(1) Search first in map_ShortKey; if not found, then search in map_LongKey.
(2) Add the contents of map_ShortKey to map_LongKey and then search only map_LongKey.

Do you mean search for a word, or search for a key?

If map_LongKey contains concatenated words, then searching for the first word of a concatenation will be unsuccessful.

If, however, you are searching for something that is actually a key in one of the maps, then whether (2) is worthwhile depends on many things, and more information is needed.

If speed is your concern, then search first in whichever map is most likely to contain the key.
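A minimal sketch of that two-step lookup, assuming the maps have the types shown in the question. The Object struct, the ObjectMap alias and the findInEither name are placeholders made up for illustration; pass whichever map is more likely to hold the key as the first argument.

```cpp
#include <map>
#include <string>

struct Object { int id; };  // placeholder for the question's Object type

using ObjectMap = std::map<std::string, Object*>;

// Looks the key up in `first`, then falls back to `second`.
// Returns nullptr when neither map contains the key.
Object* findInEither(const ObjectMap& first, const ObjectMap& second,
                     const std::string& key) {
    auto it = first.find(key);
    if (it != first.end()) {
        return it->second;
    }
    it = second.find(key);
    return it != second.end() ? it->second : nullptr;
}
```

Calling findInEither(map_ShortKey, map_LongKey, word) corresponds to approach (1); swapping the first two arguments probes the large map first.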

If speed is not your concern, then structure your code for clarity. Whether that means merging the maps or keeping them separate will depend on your situation.

It depends on the likelihood of a successful find in map_ShortKey. If it's quite likely, then you only spend about 6 "steps" in this search [log2(n)], whereas a search in map_LongKey averages 10-13 "steps".
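The step counts above are just log2 of the element count. A small sketch of that arithmetic follows; approxSteps is a made-up helper name, and the figure is only an estimate for a balanced binary tree, since the standard guarantees logarithmic lookup but not an exact number of comparisons.

```cpp
#include <cmath>
#include <cstddef>

// Rough number of comparisons for one lookup in a balanced binary
// search tree holding n elements. An estimate, not a guarantee.
double approxSteps(std::size_t n) {
    return std::log2(static_cast<double>(n));
}
```

For 50 elements this gives a value between 5 and 6, and for 1000 to 10000 elements it gives roughly 10 to a little over 13.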

If, on the other hand, you are unlikely to find what you are looking for in map_ShortKey, then the additional cost of searching among another 50 elements in the large map isn't going to make much of a difference.

Since we don't know the statistics of success, it's hard to say which is the better approach.

If you care about worst-case complexity and know nothing about your searches (e.g. whether the key is more likely to be found in one map than in the other), then I would go for approach 1).

Lookup in a std::map has logarithmic worst-case complexity, so in the first case each lookup costs log(n) + log(m) in the worst case (assuming your maps have n and m elements respectively). Thus, k lookups will cost you k * (log(n) + log(m)).

Insertion in a map also has logarithmic complexity, so in the second case you will perform m insertions from one map into the other and then do each lookup in a map with m + n elements. Thus, for k lookups (provided you do the insertion only once, up front), you get m * log(n) + k * log(n + m) worst-case complexity.
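The merge step of the second approach could look like the sketch below. Object, ObjectMap and mergeInto are illustrative names, not from the question. It uses the range overload of std::map::insert, which leaves existing keys in the target untouched rather than overwriting them.

```cpp
#include <map>
#include <string>

struct Object { int id; };  // placeholder for the question's Object type

using ObjectMap = std::map<std::string, Object*>;

// Copies every entry of `source` into `target`.
// A key that already exists in `target` keeps its current value,
// because std::map::insert never overwrites.
void mergeInto(ObjectMap& target, const ObjectMap& source) {
    target.insert(source.begin(), source.end());
}
```

After mergeInto(map_LongKey, map_ShortKey), a single map_LongKey.find(word) covers both key sets. If the small map is no longer needed afterwards, C++17's std::map::merge is an alternative that moves nodes instead of copying them.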

Thus, if you care about worst-case complexity, approach 1) is preferable as long as:

k * (log(n) + log(m)) < m * log(n) + k * log(n + m) 

You can estimate k based on your workload, and n and m based on the size of the input, then do the math to figure out which is best (and then double-check this by measuring).
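To "do the math", the two sides of the inequality can be evaluated directly. This is only a sketch of the cost model above: the function names are invented here, base-2 logarithms are an arbitrary choice (the base cancels out of the comparison), and constant factors are ignored, so treat the result as a hint to be confirmed by measurement.

```cpp
#include <cmath>

// Approach 1: k lookups, each probing both maps in the worst case.
double costSeparate(double k, double n, double m) {
    return k * (std::log2(n) + std::log2(m));
}

// Approach 2: m one-off insertions, then k lookups in the merged map.
double costMerged(double k, double n, double m) {
    return m * std::log2(n) + k * std::log2(n + m);
}

// True when the model predicts approach 1 is cheaper.
bool separateIsCheaper(double k, double n, double m) {
    return costSeparate(k, n, m) < costMerged(k, n, m);
}
```

Plugging in the question's rough sizes as an assumption (n = 10000 for the large map, m = 50 for the small one), the model favours approach 1 for k = 100 lookups and approach 2 for k = 200, with the break-even at roughly 118 lookups.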
