C++: Difference between 2 arrays

Question

I have two unsorted random-access arrays of a single simple element type (int, string, etc., so it has all the comparison operators, can be hashed, and so on). Neither array should contain duplicate elements.

I am looking for a general algorithm that, given these arrays A and B, will tell me:

What elements are in both A and B
What elements are in A but not B
What elements are in B but not A

I guess I could do this with the set operations as below, but is there a faster solution (e.g. one that doesn't require me to build two sorted sets)?

r1 = std::set_intersection(a,b);
r2 = std::set_difference(a,b);
r3 = std::set_difference(b,a);

Answer 1

Something like the following algorithm will run in O(|A|+|B|) time (assuming O(1) behavior from unordered_map):

Let list onlyA initially contain all of A, and let lists onlyB and bothAB start out empty.
Let hash table Amap associate each element in onlyA with its corresponding iterator in onlyA.
For each element b in B
- If b finds a corresponding iterator ai in Amap
  - Add b to bothAB
  - Remove b from onlyA using ai
- Otherwise, add b to onlyB

At the end of the above algorithm,

onlyA contains elements in A but not in B,
onlyB contains elements in B but not in A,
bothAB contains elements in both A and B.

Below is an implementation of the above. The result is returned as a tuple <onlyA, onlyB, bothAB>.

#include <list>
#include <tuple>
#include <unordered_map>

template <typename C>
auto venn_ify (const C &A, const C &B) ->
    std::tuple<
        std::list<typename C::value_type>,
        std::list<typename C::value_type>,
        std::list<typename C::value_type>
    >
{
    typedef typename C::value_type T;
    typedef std::list<T> LIST;
    // onlyA starts as a copy of A; elements also found in B are erased below.
    LIST onlyA(A.begin(), A.end()), onlyB, bothAB;
    // Map each element of onlyA to its list iterator so it can be erased in O(1).
    std::unordered_map<T, typename LIST::iterator> Amap(2*A.size());
    for (auto a = onlyA.begin(); a != onlyA.end(); ++a) Amap[*a] = a;
    // Relies on B having no duplicates: each list iterator is erased at most once.
    for (const auto &b : B) {
        auto ai = Amap.find(b);
        if (ai == Amap.end()) onlyB.push_back(b);
        else {
            bothAB.push_back(b);
            onlyA.erase(ai->second);
        }
    }
    return std::make_tuple(onlyA, onlyB, bothAB);
}

Answer 2

First, it's not clear from your question whether you mean std::set when you speak of sorted sets. If so, then your first reaction should be to use std::vector instead, working on the original vectors if you can. Just sort them, and then:

std::vector<T> r1;
std::set_intersection( a.cbegin(), a.cend(), b.cbegin(), b.cend(), std::back_inserter( r1 ) );

And the same for r2 and r3, using std::set_difference.
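Putting the three calls together might look like the sketch below. The `SetSplit` struct and the `split_sorted` function are hypothetical names introduced here for illustration, and the function takes its arguments by value so the caller's arrays stay unsorted; none of that comes from the answer itself.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Hypothetical result type; the field names are illustrative only.
template <typename T>
struct SetSplit {
    std::vector<T> both;    // r1: elements in A and in B
    std::vector<T> only_a;  // r2: elements in A but not in B
    std::vector<T> only_b;  // r3: elements in B but not in A
};

// Takes copies so that the caller's vectors are left in their original order.
template <typename T>
SetSplit<T> split_sorted(std::vector<T> a, std::vector<T> b) {
    // The std::set_* algorithms require sorted input ranges.
    std::sort(a.begin(), a.end());
    std::sort(b.begin(), b.end());

    SetSplit<T> r;
    std::set_intersection(a.cbegin(), a.cend(), b.cbegin(), b.cend(),
                          std::back_inserter(r.both));
    std::set_difference(a.cbegin(), a.cend(), b.cbegin(), b.cend(),
                        std::back_inserter(r.only_a));
    std::set_difference(b.cbegin(), b.cend(), a.cbegin(), a.cend(),
                        std::back_inserter(r.only_b));

    // Sanity check: every element of A landed in exactly one of two outputs.
    assert(r.both.size() + r.only_a.size() == a.size());
    return r;
}
```

Each output comes back in sorted order. This makes three passes over the sorted data, which is simple but does more comparisons than strictly necessary.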

Beyond that, I doubt that there's much you can do. Doing everything in a single loop might improve things somewhat:

// a and b are the original std::vector<T> inputs; they are sorted in place.
std::vector<T> onlyA, onlyB, both;
std::sort( a.begin(), a.end() );
std::sort( b.begin(), b.end() );
onlyA.reserve( a.size() );
onlyB.reserve( b.size() );
both.reserve( std::min( a.size(), b.size() ) );
auto ita = a.cbegin();
auto enda = a.cend();
auto itb = b.cbegin();
auto endb = b.cend();
// Merge-style walk over both sorted ranges.
while ( ita != enda && itb != endb ) {
    if ( *ita < *itb ) {
        onlyA.push_back( *ita );
        ++ ita;
    } else if ( *itb < *ita ) {
        onlyB.push_back( *itb );
        ++ itb;
    } else {
        both.push_back( *ita );
        ++ ita;
        ++ itb;
    }
}
// Whatever is left in either range has no counterpart in the other.
onlyA.insert( onlyA.end(), ita, enda );
onlyB.insert( onlyB.end(), itb, endb );

The reserve calls could make a difference and, unless most of the elements end up in the same vector, probably won't cost much extra memory.

Answer 3

You can do this in linear time by putting the elements of A into an unordered_map where the elements from A are the keys. Then check whether each element of B is among the keys in the map.
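A minimal sketch of this idea follows. Note that it swaps in std::unordered_set for the unordered_map mentioned above, since only the keys are needed; that substitution, the `HashSplit` struct, and the `split_hashed` function are assumptions made for illustration. It also relies on the question's guarantee that neither array contains duplicates.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Hypothetical result type; the field names are illustrative only.
template <typename T>
struct HashSplit {
    std::vector<T> both;    // elements in A and in B
    std::vector<T> only_a;  // elements in A but not in B
    std::vector<T> only_b;  // elements in B but not in A
};

// Assumes neither input contains duplicate elements.
template <typename T>
HashSplit<T> split_hashed(const std::vector<T>& a, const std::vector<T>& b) {
    // Start with every element of A; matched elements are removed below.
    std::unordered_set<T> remaining(a.begin(), a.end());

    HashSplit<T> r;
    for (const T& x : b) {
        // erase returns the number of elements removed (0 or 1 for a set).
        if (remaining.erase(x) > 0) r.both.push_back(x);
        else r.only_b.push_back(x);
    }
    // Whatever is still in the set was never matched by B.
    // Walking A again keeps only_a in A's original order.
    for (const T& x : a) {
        if (remaining.count(x) > 0) r.only_a.push_back(x);
    }

    // Sanity check: every element of B landed in exactly one of two outputs.
    assert(r.both.size() + r.only_b.size() == b.size());
    return r;
}
```

With average O(1) hash operations this runs in O(|A|+|B|) time, and unlike the sorting approaches it returns the elements in the order they appeared in the inputs.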

3 answers

Answer 1: score 3, 2014-08-21 16:10:09
Answer 2: score 3, accepted, 2014-08-21 17:16:15
Answer 3: score -1, 2014-08-21 15:32:37
