
Why are unordered_map and unordered_set slower?

I was solving a simple problem: counting the unique elements in an array. I used a std::unordered_map to count them, but it gave Time Limit Exceeded on one test case. I then switched to a std::unordered_set, with the same result.

PS: I used std::unordered_map and std::unordered_set because I read that they are much faster than std::map and std::set, respectively.

#include<bits/stdc++.h>
using namespace std;

int main() {
    int n, a;
    cin >> n;
    unordered_set<int> s;

    for(int i = 0; i < n; i++) {
        cin >> a;
        s.insert(a);
    }
    cout << s.size();
}

Test 7 exceeded the time limit.

My question is:

If std::unordered_map and std::unordered_set are faster, why did they give TLE?

std::unordered_set<int> is a node-based container in which each element is stored in a separately allocated list node. The list node contains an int element and two list node pointers, which, on a 64-bit platform, makes each node occupy 24 bytes because of alignment. There is also allocator overhead for each allocated chunk of memory (8 bytes for GNU libc), so there are at least 28 bytes of overhead for each 4-byte int element. The exact node layout varies between standard library implementations.

s.insert(a); makes a memory allocation for each new element, and that is what makes the code slow.
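The arithmetic above can be checked with a small sketch. ListNode is a hypothetical stand-in that mirrors the layout described (one int plus two pointers). It is not the node type of any real standard library. The 8-byte allocator header is the GNU libc figure quoted above, taken here as an assumption.

```cpp
#include <cstddef>

// Hypothetical node mirroring the layout described above:
// one int payload plus two links. Not a real library type.
struct ListNode {
    ListNode* next;
    ListNode* prev;
    int value;
};

// Assumed per-allocation bookkeeping (the GNU libc figure quoted above).
constexpr std::size_t kAllocatorHeader = 8;

// Bytes spent per element beyond the int payload itself.
constexpr std::size_t overhead_per_element() {
    return sizeof(ListNode) + kAllocatorHeader - sizeof(int);
}
```

On a platform with 8-byte pointers and a 4-byte int, sizeof(ListNode) is padded from 20 to 24 bytes, so overhead_per_element() evaluates to 28.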


To solve this problem efficiently, you can use a bitset indexed by the integers, e.g. std::vector<bool>. Set the bit to 1 for each integer read, then count the number of set bits. If the elements are signed, convert them to unsigned so that the bit index is non-negative.

A working example:

#include <iostream>
#include <numeric>
#include <vector>

int main() {
    int n;
    std::cin >> n;
    // One bit per possible value; the input range is 1 ≤ a ≤ 10^9 (about 125 MB).
    std::vector<bool> bitset(1000000001);

    for(int i = 0, a; i < n; ++i) {
        std::cin >> a;
        bitset[static_cast<unsigned>(a)] = true; // mark the value as seen
    }
    // Sum the set bits to get the number of distinct values.
    std::cout << std::accumulate(bitset.begin(), bitset.end(), 0u) << '\n';
}

A version that passes that grader:

#include <bitset>
#include <iostream>

int main() {
    int n;
    std::cin >> n;
    // The input range is 1 ≤ a ≤ 10^9. The bitset takes about 125 MB, so it is
    // given static storage; as a plain local it would overflow a typical stack.
    static std::bitset<1000000001> bitset;
    unsigned unique = 0;
    for(int i = 0, a; i < n; ++i) {
        std::cin >> a;
        if(!bitset.test(a)) { // first time this value is seen
            ++unique;
            bitset.set(a);
        }
    }
    std::cout << unique << '\n';
}

std::unordered_map gives a result in O(1) most of the time, but not always. As for the TLE, the constraints may be larger than 10^18, in which case an O(n) solution will not work. Try an O(log(n)) approach.

The unordered containers are optimized for retrieving values, not for inserting them.

You could take a look at this comparison: https://medium.com/@nonuruzun/stl-container-performance-3ec5a8fbc3be . There, you can see that the unordered containers have an O(N) worst case for insertion, while the ordered ones have O(log(N)).
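One way to see how close a given set is to that worst case is to measure its longest bucket chain, using the standard bucket interface. This is a minimal sketch; longest_chain is a hypothetical helper name, not a library function.

```cpp
#include <cstddef>
#include <unordered_set>

// Length of the longest bucket chain. An insert or lookup that lands
// in that bucket may have to walk this many elements.
std::size_t longest_chain(const std::unordered_set<int>& s) {
    std::size_t longest = 0;
    for (std::size_t b = 0; b < s.bucket_count(); ++b) {
        if (s.bucket_size(b) > longest) {
            longest = s.bucket_size(b);
        }
    }
    return longest;
}
```

A chain length close to s.size() means most elements collide in one bucket, which is the situation behind the O(N) worst case.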

But you can mitigate that by allocating memory first, so that you will have fewer collisions.
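A minimal sketch of that suggestion, assuming the number of inputs is known before insertion starts. count_unique is a hypothetical helper name; reserve() is the standard member function that sizes the bucket array for the requested number of elements.

```cpp
#include <cstddef>
#include <unordered_set>
#include <vector>

// Counts distinct values, reserving buckets up front so that inserting
// up to values.size() elements does not trigger a rehash.
std::size_t count_unique(const std::vector<int>& values) {
    std::unordered_set<int> seen;
    seen.reserve(values.size());
    for (int v : values) {
        seen.insert(v);
    }
    return seen.size();
}
```

Note that reserve() only sizes the bucket array. Each inserted element still gets its own node allocation, so this reduces rehashing but does not remove the per-element cost.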

