Comparing std::vector and std::set for time complexity in this case: which is more efficient?

I currently have a function that returns strings. I need to keep track of the returned strings, and if no action has been taken on a returned string yet, I have to take an action on it.

My first thought is to use a vector (i.e. std::vector).

Here is what a vector-based mechanism would look like:

1 - Check whether the item exists in the vector using std::find:

    std::find(vector.begin(), vector.end(), item) != vector.end()

2 - If the item is not present, do a push_back (amortized constant) and perform an action on it; otherwise ignore the string.
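A minimal sketch of the two vector steps above, assuming a hypothetical helper named `record_if_new_vec` (the name and signature are illustrative, not from the original post):

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: returns true the first time `item` is seen and
// false if it was already recorded.
bool record_if_new_vec(std::vector<std::string>& seen, const std::string& item) {
    // Step 1: linear scan of the unsorted vector, O(n) per call.
    if (std::find(seen.begin(), seen.end(), item) != seen.end()) {
        return false;  // already recorded: ignore the string
    }
    // Step 2: remember the item (amortized constant time).
    seen.push_back(item);
    return true;  // the caller should perform the action now
}
```

The caller would perform the action only when the helper returns true.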

My second thought is to use a std::set.

1 - Call insert, which adds the item only if it is not already in the set, and check the result:

    if (set.insert(somestring).second)
    {
        // The item did not exist and has now been inserted
    }
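The set version can be wrapped the same way. This is only a sketch, and `record_if_new_set` is a hypothetical name; the one fact it relies on is that `std::set::insert` returns a pair whose `second` member is true only when an insertion actually happened:

```cpp
#include <set>
#include <string>

// Hypothetical helper: one O(log n) call both checks for the item and
// records it. Returns true only when `item` was not already in the set.
bool record_if_new_set(std::set<std::string>& seen, const std::string& item) {
    return seen.insert(item).second;
}
```

Unlike the vector version, there is no separate lookup step, so the check and the insert cannot get out of sync.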

Insertion into a set is O(log n). push_back on a vector is amortized constant, and if the vector isn't sorted (in this case it isn't), std::find is O(n). Am I correct in assuming that for maximum efficiency I should be using a set here? Is there anything I might be missing?

I used to work on a foreign exchange pricing system in a bank. Performance was of great interest to us. We used to have long discussions on optimal algorithms... And then one day we measured the performance with a profiling tool... and we found that the actual algorithms used up 5% of the processing time. The remaining 95% was taken up in converting strings to doubles and doubles to strings when the system received messages from, and sent messages to, the message bus.

Why do I write this? Just to illustrate that in almost all cases, the choice of container is probably irrelevant. Your program is very unlikely to spend more than a small fraction of its time finding items in maps, sets or vectors.

Write the code in the most elegant and maintainable way you can, using easily understood algorithms and containers that naturally fit the design (sets and maps for things that need to be ordered, vectors for general storage, unordered sets and maps if order isn't important and your data sets are huge). If you need multiple ordered indexes on the same data, then you probably want a vector for storage with sets of iterators/pointers for indexing (like a database).
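To make the "vector for storage plus sets for indexing" idea concrete, here is a rough sketch. It swaps the iterators/pointers for plain indices into the vector, because indices stay valid when the vector reallocates. `Record`, `IndexedStore` and their members are hypothetical names invented for this example:

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record type, for illustration only.
struct Record {
    std::string name;
    int priority;
};

class IndexedStore {
public:
    IndexedStore()
        : by_name_(ByName{&records_}), by_priority_(ByPriority{&records_}) {}

    // The comparators hold a pointer to records_, so copying is disabled.
    IndexedStore(const IndexedStore&) = delete;
    IndexedStore& operator=(const IndexedStore&) = delete;

    // Store the record once, then register its position in both indexes.
    void add(Record r) {
        records_.push_back(std::move(r));
        const std::size_t i = records_.size() - 1;
        by_name_.insert(i);
        by_priority_.insert(i);
    }

    // Record names in the order given by each index.
    std::vector<std::string> names_by_name() const { return collect(by_name_); }
    std::vector<std::string> names_by_priority() const { return collect(by_priority_); }

private:
    // Each comparator orders indices by one field of the records they refer to.
    struct ByName {
        const std::vector<Record>* store;
        bool operator()(std::size_t a, std::size_t b) const {
            return (*store)[a].name < (*store)[b].name;
        }
    };
    struct ByPriority {
        const std::vector<Record>* store;
        bool operator()(std::size_t a, std::size_t b) const {
            return (*store)[a].priority < (*store)[b].priority;
        }
    };

    template <typename Index>
    std::vector<std::string> collect(const Index& index) const {
        std::vector<std::string> out;
        for (std::size_t i : index) {
            out.push_back(records_[i].name);
        }
        return out;
    }

    std::vector<Record> records_;                         // single copy of the data
    std::multiset<std::size_t, ByName> by_name_;          // ordered view by name
    std::multiset<std::size_t, ByPriority> by_priority_;  // ordered view by priority
};
```

The data lives in one place and each multiset is just an ordered view over it, much like a table with two indexes. This sketch does not support removing or modifying records; that would require keeping the indexes in step with the vector.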

Then, when it's finished, if your users are screaming at you that it's too slow (they won't - they're more concerned about it working reliably), profile the code and measure for bottlenecks. They will almost always be in the I/O.

If, in the incredibly unlikely scenario, your code is spending 90% of its time managing collections of data, then it's time to rethink the algorithm because the design is probably inefficient - or you're writing a protein folding simulator.

If you're sure that the design is optimal, then maybe it's time to reconsider the type of the container.

There are fundamentally only 3 types - you can find the best solution by trial and error in less time than it takes to argue about it.
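A trial-and-error comparison could be as small as the sketch below. It assumes the three types are a sequence (std::vector), an ordered set (std::set) and a hashed set (std::unordered_set); every name in it (`VectorSeen`, `SetSeen`, `HashSeen`, `time_dedup`, `make_inputs`) is hypothetical, and the synthetic workload is a stand-in for your real strings:

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <set>
#include <string>
#include <unordered_set>
#include <vector>

// Each wrapper's add() returns true if the string was newly recorded.
struct VectorSeen {
    std::vector<std::string> items;
    bool add(const std::string& s) {
        if (std::find(items.begin(), items.end(), s) != items.end()) return false;
        items.push_back(s);
        return true;
    }
};
struct SetSeen {
    std::set<std::string> items;
    bool add(const std::string& s) { return items.insert(s).second; }
};
struct HashSeen {
    std::unordered_set<std::string> items;
    bool add(const std::string& s) { return items.insert(s).second; }
};

struct Timing {
    std::size_t unique;  // number of distinct strings recorded
    double millis;       // elapsed wall-clock time in milliseconds
};

// Feed every input through one container type and time the whole loop.
template <typename Seen>
Timing time_dedup(const std::vector<std::string>& inputs) {
    Seen seen;
    std::size_t unique = 0;
    const auto start = std::chrono::steady_clock::now();
    for (const auto& s : inputs) {
        if (seen.add(s)) ++unique;
    }
    const auto stop = std::chrono::steady_clock::now();
    return {unique, std::chrono::duration<double, std::milli>(stop - start).count()};
}

// Hypothetical workload: `count` strings drawn from `distinct` different values.
std::vector<std::string> make_inputs(std::size_t count, std::size_t distinct) {
    std::vector<std::string> inputs;
    inputs.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        inputs.push_back("key" + std::to_string(i % distinct));
    }
    return inputs;
}
```

Calling `time_dedup<VectorSeen>(inputs)`, `time_dedup<SetSeen>(inputs)` and `time_dedup<HashSeen>(inputs)` on the same inputs gives three timings to compare. Build with optimizations on and use data that resembles production, since the ranking can change with the number of distinct strings and their length.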

:-)
