Optimizing a loop in C++

Question

I have this small function that loops through a vector to find a given animal and return its habitat. I need to optimize it, but I'm hitting dead ends on what I could change. Three things that jumped out at me were size_t, i++, and the fact that I'm looping through a vector. I read that size_t is a good fit for arrays with a large index range. I know that pre-increment is better than post-increment since it changes the original value instead of creating a temporary and incrementing, but compilers usually optimize this small difference away anyway. Finally, the last thing I thought of was that this vector could be unsorted, which would hurt performance. I was thinking of sorting the vector by the species variable of the animals, then maybe porting it into a BST to search, since the time complexity would be O(log(n)). Here is the code I'm working with:

string GetAnimalHabitat(vector<Animal> animals, string species)
{
    for (size_t i = 0; i < animals.size(); i++)
    {
        if (animals[i].species == species)

        {
            return animals[i].habitat;
        }
    }
    return "Animal not within records.";
}

Is there anything I'm potentially missing that could improve this function? Any tips would be great. Thanks!

Answer 1

A std::map surely works better than looping over a vector here, since a lookup takes O(log n) comparisons instead of O(n):

string GetAnimalHabitat(const map<string, string>& animals, const string& species)
{
    auto search = animals.find(species);
    if (search != animals.end())
        return search->second;
    return string("Animal not within records.");
}

It requires building a map first, however. But building it once is enough, and afterwards you only have to add new key-value pairs to it:

map<string, string> build_map(const vector<Animal>& animals)
{
    map<string, string> ret;
    for (const auto& x : animals)
        ret[x.species] = x.habitat;
    return ret;
}
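
A minimal sketch of how the two functions above might fit together. The Animal struct here is an assumption (the question does not show its definition); only the species and habitat members are taken from the original code. Note one behavioral difference from the original loop: if a species appears more than once, the map keeps the last habitat seen rather than the first.

```cpp
#include <map>
#include <string>
#include <vector>

// Assumed layout; the question does not show the real Animal type.
struct Animal {
    std::string species;
    std::string habitat;
};

// Build the lookup table once from the existing records.
std::map<std::string, std::string> build_map(const std::vector<Animal>& animals)
{
    std::map<std::string, std::string> ret;
    for (const auto& x : animals)
        ret[x.species] = x.habitat;  // a later duplicate species overwrites an earlier one
    return ret;
}

// O(log n) lookup in the prebuilt map; nothing is copied except the result.
std::string GetAnimalHabitat(const std::map<std::string, std::string>& animals,
                             const std::string& species)
{
    auto search = animals.find(species);
    if (search != animals.end())
        return search->second;
    return "Animal not within records.";
}
```

The intended use is to call build_map once, keep the resulting map around, and pass it to GetAnimalHabitat for every query.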

Answer 2

One thing you could do is reduce the number of times the .size() method gets called, by doing something like this:

size_t vectorSize = animals.size();
for (size_t i = 0; i < vectorSize; i++)
{
    if (animals[i].species == species)

    {
        return animals[i].habitat;
    }
}

Another minor one would be to change i++ to ++i. The purpose of this is to avoid keeping a copy of the old value of i every time it increments, although for a built-in type such as size_t compilers generally generate the same code for both forms.

Answer 3

At a high level, you want to stop making a full copy of the vector and all its elements each time you call this function. If you need to optimize, I assume the vector is large, so why not pass a reference to a const vector?
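
A sketch of what passing by reference to const could look like, assuming an Animal struct with species and habitat string members (its real definition is not shown in the question). The search itself is unchanged; only the parameter types differ, so a call no longer copies the vector or the species string.

```cpp
#include <string>
#include <vector>

// Assumed layout; the question does not show the real Animal type.
struct Animal {
    std::string species;
    std::string habitat;
};

// Same linear search as in the question, but both parameters are taken by
// reference to const, so a call copies neither the vector nor the string.
std::string GetAnimalHabitat(const std::vector<Animal>& animals,
                             const std::string& species)
{
    for (const Animal& animal : animals)
    {
        if (animal.species == species)
            return animal.habitat;  // first match wins, as in the original
    }
    return "Animal not within records.";
}
```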

Second, how many elements of the vector will match your input string? If it's just a few, then scanning the whole vector loads a lot of memory just to look at it and decide you don't need it. Since you return after the first match, it's reasonable to think there is at most one of each species, and in that case an associative container is better.

Some latency-sensitive systems partition data into different groups so that no "filtering" is necessary. Just look at the group of things you care about and process only those.

Another thing to consider: a string comparison is much slower than something like an integer comparison. You could pre-hash the species and store the hash in the class, hash the species parameter before your loop, and compare the hashes. If they are equal, THEN compare the strings to make sure it's a real match.
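
A rough sketch of the pre-hashing idea, under stated assumptions: HashedAnimal and its species_hash member are hypothetical names introduced here for illustration, and std::hash<std::string> is just one possible hash function. Whether this beats a plain string comparison depends on the data, so it is worth measuring.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical variant of Animal that caches the hash of its species name.
struct HashedAnimal {
    std::string species;
    std::string habitat;
    std::size_t species_hash;  // computed once, when the record is created

    HashedAnimal(std::string s, std::string h)
        : species(std::move(s)),
          habitat(std::move(h)),
          species_hash(std::hash<std::string>{}(species)) {}
};

std::string GetAnimalHabitat(const std::vector<HashedAnimal>& animals,
                             const std::string& species)
{
    // Hash the query once, before the loop.
    const std::size_t wanted = std::hash<std::string>{}(species);
    for (const HashedAnimal& animal : animals)
    {
        // Cheap integer comparison first; the string comparison only runs on
        // a hash match, to rule out collisions.
        if (animal.species_hash == wanted && animal.species == species)
            return animal.habitat;
    }
    return "Animal not within records.";
}
```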

But my guess is that most of your time is spent copying your inputs and outputs.

Answer 4

std::unordered_map makes the most sense, since order is not important:

string GetAnimalHabitat(const std::unordered_map<string, string>& animals, const string& species)
{
    auto search = animals.find(species);
    if (search != animals.end())
        return search->second;
    return string("Animal not within records.");
}

It requires building a map first, however. But building it once is enough, and afterwards you only have to add new key-value pairs to it. Note that the first time you pass in an empty unordered_map; after that you pass a vector of new values and the current map:

void build_map(const std::vector<Animal>& animals, std::unordered_map<string, string>* ret)
{
    // ret is a pointer, so dereference it before indexing
    for (const auto& x : animals)
        (*ret)[x.species] = x.habitat;
}
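
A sketch of the incremental use described above, assuming an Animal struct with species and habitat string members (not shown in the question). The map starts empty, is filled from a first batch, and is later extended with a second batch without being rebuilt.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Assumed layout; the question does not show the real Animal type.
struct Animal {
    std::string species;
    std::string habitat;
};

// Adds (or overwrites) entries in an existing map; the caller owns the map.
void build_map(const std::vector<Animal>& animals,
               std::unordered_map<std::string, std::string>* ret)
{
    for (const auto& x : animals)
        (*ret)[x.species] = x.habitat;
}

// Average-case O(1) lookup in the hash table.
std::string GetAnimalHabitat(const std::unordered_map<std::string, std::string>& animals,
                             const std::string& species)
{
    auto search = animals.find(species);
    if (search != animals.end())
        return search->second;
    return "Animal not within records.";
}
```

Passing the map by pointer makes it visible at the call site that build_map modifies it; a non-const reference would work equally well.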

4 answers

Answer 1 6 2017-12-14 00:03:44
Answer 2 1 2017-12-13 23:57:58
Answer 3 1 2017-12-14 00:14:24
Answer 4 0 2018-01-02 18:16:02
