
std::unordered_map: multithreading insertions?

I have a bunch of data (a giant list of integers between 0 and ULLONG_MAX) and I want to extract all unique values. My approach is to create an unordered_map, using the integer list values as the keys and a throwaway bool for the map values. I iterate the list and insert throwaway values for each key. At the end I iterate the map to get all unique keys. Pretty straightforward.
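For concreteness, here is a minimal sketch of that serial approach (the function name and the use of std::uint64_t are illustrative, not from the question):

// Minimal sketch of the serial approach described above (illustrative only).
#include <cstdint>
#include <unordered_map>
#include <vector>

std::vector<std::uint64_t> uniqueValues(const std::vector<std::uint64_t>& data) {
    std::unordered_map<std::uint64_t, bool> seen;
    for (auto v : data)
        seen[v] = true;              // throwaway bool; only the key matters

    std::vector<std::uint64_t> result;
    result.reserve(seen.size());
    for (const auto& kv : seen)
        result.push_back(kv.first);  // collect the unique keys
    return result;
}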

However, my list is so large (hundreds of millions of entries) that I'd like to multithread this process. I know a naive approach to threading won't work, because unordered_map insertions modify the underlying data structure, so they're not thread safe. And adding a lock around each and every insertion would be slow and could negate any threading speedup.

However, presumably not every insertion changes the data structure as a whole (only the ones that can't fit in the existing allocated buckets?). Is there a way to check, prior to inserting, whether a particular insertion would require the unordered_map to re-allocate? That way I would only need to lock when the map is being restructured, instead of locking on every insert. Then, prior to each insertion, the threads would merely check if a lock exists... rather than doing a full lock/unlock. Is that possible?
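(As an aside: the standard container does expose enough to predict a rehash. An unordered_map rehashes only when an insertion would push its load factor past max_load_factor(), so a hedged sketch of that check, assuming the key is not already present, might look like the code below. Note that even a non-rehashing insert still mutates a bucket's chain, so this check alone would not make concurrent inserts safe.)

// Sketch: would inserting one *new* key trigger a rehash? (Assumption:
// the key is absent; inserting a duplicate key never changes the table.)
#include <cstdint>
#include <unordered_map>

bool wouldRehash(const std::unordered_map<std::uint64_t, bool>& m) {
    // Rehashing happens when the load factor would exceed max_load_factor().
    return static_cast<float>(m.size() + 1) > m.max_load_factor() * m.bucket_count();
}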

The fundamental rule of parallelization: break the job up, work on the pieces, and then combine the pieces.

Hashing/item lookup is the most expensive part of the whole shebang, so that's what we'll focus on parallelizing.

If you absolutely need the result as a hash table, I have some bad news for you: you'll have to write your own. That being said, let's begin.

First, let's solve the problem serially. This is simple. The function below takes a vector and a callback. We're going to take the vector, convert it into an unordered_set, and give the unordered_set to the callback. Simple? Yes.

Now, because we're going to be doing this on a thread, we can't do it right away. Instead, we'll return a lambda that takes no arguments. When that lambda is invoked, that's when it'll create the unordered_set and give it to the callback. This way, we can give each lambda its own thread, and each thread will run the job by invoking the lambda.

#include <iterator>
#include <thread>
#include <unordered_set>
#include <vector>

template<class Vector, class Callback>
auto lazyGetUnique(Vector& vector, Callback callback) {
    using Iterator = decltype(vector.begin());
    auto begin = vector.begin();
    auto end = vector.end();
    using elem_t = typename std::iterator_traits<Iterator>::value_type;

    //We capture begin, end, and callback; nothing runs until the
    //returned lambda is invoked, so the work can happen on any thread.
    return [begin, end, callback]() {
        callback(std::unordered_set<elem_t>(begin, end));
    };
}

Now - what should this callback do? The answer is simple: the callback should assign the contents of the unordered_set to a vector. Why? Because we're going to be merging the results, and it's a lot faster to merge vectors than it is to merge unordered_sets.

Let's write a function to give us the callback:

template<class Vector>
auto assignTo(Vector& v) {
    //Returns a callback that overwrites v with whatever contents it receives.
    return [&](auto&& contents) {
        v.assign(contents.begin(), contents.end());
    };
}

Suppose we want to get the unique elements of a vector, and assign them back to that vector. This is now really simple to do:

std::vector<int> v = /* stuff */;
auto new_thread = std::thread( lazyGetUnique(v, assignTo(v)) );
// ... do other work on this thread ...
new_thread.join(); // must join (or detach) before new_thread is destroyed

In this example, once new_thread has been joined, v will contain only unique elements.

Let's look at the complete function that does everything.

template<class Iterator>
auto getUnique(Iterator begin, Iterator end) {
    using elem_t = typename std::iterator_traits<Iterator>::value_type;

    std::vector<elem_t> blocks[4];

    //Split things up into blocks based on the last two bits of the
    //number (val & 0x3). This guarantees that no two blocks share
    //numbers, so each block can be deduplicated independently.
    for(; begin != end; ++begin) {
        auto val = *begin; 
        blocks[val & 0x3].push_back(val); 
    }

    //Each thread will run their portion of the problem.
    //Once it's found all unique elements, it'll stick the result in the block used as input
    auto thread_0 = std::thread( lazyGetUnique(blocks[0], assignTo(blocks[0])) );
    auto thread_1 = std::thread( lazyGetUnique(blocks[1], assignTo(blocks[1])) );
    auto thread_2 = std::thread( lazyGetUnique(blocks[2], assignTo(blocks[2])) );

    //We are thread_3, so we can just invoke it directly
    lazyGetUnique(blocks[3], assignTo(blocks[3]))(); //Here, we invoke it immediately

    //Join the other threads
    thread_0.join();
    thread_1.join();
    thread_2.join(); 

    std::vector<elem_t> result;
    result.reserve(blocks[0].size() + blocks[1].size() + blocks[2].size() + blocks[3].size());

    for(int i = 0; i < 4; ++i) {
        result.insert(result.end(), blocks[i].begin(), blocks[i].end());
    }

    return result;
}

This function breaks the input up into 4 blocks, each of which is disjoint from the others. It finds the unique elements in each of the 4 blocks in parallel, then combines the results. The output is a vector.
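To round things off, a possible way to call it (illustrative, not from the original answer):

// Illustrative usage of getUnique.
#include <cstdint>
#include <vector>

int main() {
    std::vector<std::uint64_t> data = {7, 3, 7, 42, 3, 42, 42};
    auto unique_vals = getUnique(data.begin(), data.end());
    // unique_vals now holds 3, 7, and 42, in some unspecified order.
}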
