Best approach to find a key in more than 1 (multiple) 'std::map's or 'std::set's?
I will use a std::map example with minimal data.
I have 2 maps as below:
map<string, Object*> map_ShortKey; // keys are single English words
map<string, Object*> map_LongKey; // keys are concatenated English words
map_ShortKey is populated at the beginning of the program with around 50 elements and remains constant throughout. map_LongKey, by contrast, grows continuously throughout the program and may reach 1,000 to 10,000 elements.
When I want to search for a word in these maps, what is the best approach?
(1) Search first in map_ShortKey; if not found, then search in map_LongKey.
(2) Add the contents of map_ShortKey to map_LongKey, and then search only map_LongKey.
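For reference, the two options might be sketched as follows. This is only an illustration under stated assumptions: Object is a hypothetical stand-in for the real type, the function names find_in_either and merge_short_into_long are made up here, and the if-with-initializer syntax needs C++17.

```cpp
#include <map>
#include <string>

struct Object { int id; };  // hypothetical stand-in for the real Object type

using ObjectMap = std::map<std::string, Object*>;

// Option (1): probe the small, constant map first, then the large one.
// Returns nullptr when the key is in neither map.
Object* find_in_either(const ObjectMap& short_keys,
                       const ObjectMap& long_keys,
                       const std::string& key) {
    if (auto it = short_keys.find(key); it != short_keys.end()) {
        return it->second;
    }
    if (auto it = long_keys.find(key); it != long_keys.end()) {
        return it->second;
    }
    return nullptr;
}

// Option (2): copy the short-key entries into the long-key map once.
// Afterwards a single find() on long_keys covers both sets of keys.
// insert() leaves an existing entry untouched if the key is already present.
void merge_short_into_long(const ObjectMap& short_keys, ObjectMap& long_keys) {
    long_keys.insert(short_keys.begin(), short_keys.end());
}
```

Option (1) keeps the two maps independent, while option (2) pays for the copy once and then needs only one find() per search.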
Do you mean search for a word, or search for a key?
If map_LongKey contains concatenated words, then searching for the first word of a concatenation will be unsuccessful, because a map lookup matches whole keys only.
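To illustrate the difference between the two kinds of search, here is a small sketch. The helper names has_exact_key and has_key_with_prefix are hypothetical, and the prefix variant is just one possible way to look for a leading word, assuming the default lexicographic key ordering; it is not something the question asks for.

```cpp
#include <map>
#include <string>

struct Object {};  // hypothetical placeholder for the real Object type

using ObjectMap = std::map<std::string, Object*>;

// Exact-match lookup: true only if `key` is itself a key of `m`.
bool has_exact_key(const ObjectMap& m, const std::string& key) {
    return m.find(key) != m.end();
}

// Prefix lookup: true if some key of `m` starts with `prefix`.
// lower_bound() returns the first key that is not less than `prefix`;
// with lexicographic ordering, if any key starts with `prefix`,
// that first key does too.
bool has_key_with_prefix(const ObjectMap& m, const std::string& prefix) {
    auto it = m.lower_bound(prefix);
    return it != m.end() &&
           it->first.compare(0, prefix.size(), prefix) == 0;
}
```

With a key such as "catdog" in the map, has_exact_key(m, "cat") is false while has_key_with_prefix(m, "cat") is true.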
If, however, you are searching for something that is actually a key in one of the maps, then whether (2) is the better option depends on many things; more information is needed.
If speed is your concern, then search first in whichever map is most likely to contain the key.
If speed is not your concern, then structure your code for clarity. Whether that involves merging the maps or keeping them separate will depend on your situation.
It depends on the likelihood of a successful find in map_ShortKey. If it's quite likely, then you only spend about 6 "steps" on this search [log2(n)], whereas a search in map_LongKey averages 10-13 "steps".
If, on the other hand, you are unlikely to find what you are looking for in map_ShortKey, then the additional cost of searching among another 50 elements in the large map isn't going to make much of a difference.
Since we don't know the statistics of success, it's hard to say which is the better approach.
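The step counts above can be turned into a rough model. The sketch below assumes one "step" per tree level in a perfectly balanced tree (a real std::map is usually a red-black tree and can be somewhat deeper), and p_short, the fraction of searches that succeed in the short map, is a parameter introduced here for illustration.

```cpp
#include <cmath>

// Rough comparison count for one lookup in a balanced tree of n keys.
double steps(double n) {
    return std::log2(n);
}

// Search the short map first; fall through to the long map only on a
// miss, which happens with probability (1 - p_short).
double expected_steps_separate(double n_short, double n_long, double p_short) {
    return steps(n_short) + (1.0 - p_short) * steps(n_long);
}

// One lookup in a single map that holds both sets of keys.
double expected_steps_merged(double n_short, double n_long) {
    return steps(n_short + n_long);
}
```

With 50 short keys and 10,000 long keys, this model favours searching the short map first when about half of the searches hit it, and favours the merged map when only 30% do. It ignores cache effects and string comparison cost, so treat it as an estimate only.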
If you care about worst-case complexity and know nothing about your searches (e.g. whether the key is more likely to be found in one map than in the other), then I would go for approach (1).
Lookup in a std::map has logarithmic worst-case complexity, so in the first case a single search costs log(n) + log(m) in the worst case (assuming your maps have n and m elements respectively). Thus, k lookups will cost you k * (log(n) + log(m)).
Insertion into a map also has logarithmic complexity, so in the second case you perform m insertions from one map into the other, and then each lookup runs on a map with m + n elements. Thus, for k lookups (provided you do the insertions only once, up front), you get m * log(n) + k * log(n + m) worst-case complexity.
Thus, if you care about worst-case complexity, approach (1) is preferable as long as:
k * (log(n) + log(m)) < m * log(n) + k * log(n + m)
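The inequality can be evaluated directly. A minimal sketch, assuming base-2 logarithms (any base works as long as it is used consistently) and taking m to be the size of the map whose entries get inserted into the other one; the function names are hypothetical:

```cpp
#include <cmath>

// Cost of k lookups with approach (1): k * (log(n) + log(m)).
double cost_separate(double n, double m, double k) {
    return k * (std::log2(n) + std::log2(m));
}

// Cost of approach (2): m insertions, then k lookups in the merged map,
// i.e. m * log(n) + k * log(n + m).
double cost_merged(double n, double m, double k) {
    return m * std::log2(n) + k * std::log2(n + m);
}

// True when the inequality above holds, i.e. approach (1) is preferable.
bool separate_is_cheaper(double n, double m, double k) {
    return cost_separate(n, m, k) < cost_merged(n, m, k);
}
```

For example, with n = 10000 and m = 50, separate_is_cheaper returns true for k = 1 but false for k = 1000, because the one-off insertion cost is amortised over many lookups.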
You can estimate k based on your workload, and n and m based on the size of the input, then do the math to figure out which is best (and then double-check this by measuring).
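For the measuring step, something along these lines could serve as a starting point. It is only a sketch: the map's value type is int rather than Object* to keep it short, Timing, time_separate and time_merged are hypothetical names, and a serious benchmark would need realistic keys, repeated runs, and compiler optimisations enabled.

```cpp
#include <chrono>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

using Map = std::map<std::string, int>;  // int stands in for Object*

struct Timing {
    std::size_t hits;     // number of queries that were found
    double microseconds;  // wall-clock time for all queries
};

// Approach (1): check the short map first, then the long map.
Timing time_separate(const Map& short_keys, const Map& long_keys,
                     const std::vector<std::string>& queries) {
    auto start = std::chrono::steady_clock::now();
    std::size_t hits = 0;
    for (const auto& q : queries) {
        if (short_keys.count(q) != 0 || long_keys.count(q) != 0) {
            ++hits;
        }
    }
    auto stop = std::chrono::steady_clock::now();
    return {hits, std::chrono::duration<double, std::micro>(stop - start).count()};
}

// Approach (2): a single lookup per query in the already merged map.
Timing time_merged(const Map& merged, const std::vector<std::string>& queries) {
    auto start = std::chrono::steady_clock::now();
    std::size_t hits = 0;
    for (const auto& q : queries) {
        if (merged.count(q) != 0) {
            ++hits;
        }
    }
    auto stop = std::chrono::steady_clock::now();
    return {hits, std::chrono::duration<double, std::micro>(stop - start).count()};
}
```

Both functions should report the same hit count for the same queries, which doubles as a sanity check that the merged map really contains every key from both originals.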