C++: Differences between 2 arrays
I have two unsorted random-access arrays of a single simple element type (int, string, etc., so it has all the comparison operators, can be hashed, and so on). There should not be duplicate elements in either array.
I am looking for a general algorithm that, given these arrays A and B, will tell me:
1. Which elements are in both A and B
2. Which elements are only in A
3. Which elements are only in B
I guess I could do this with the set operations as below, but is there a faster solution (e.g. one that doesn't require me to build two sorted sets)?
r1 = std::set_intersection(a,b);
r2 = std::set_difference(a,b);
r3 = std::set_difference(b,a);
Something like the following algorithm will run in O(|A|+|B|) (assuming O(1) behavior from unordered_map):
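1. Let list onlyA initially contain all of A, and let lists onlyB and bothAB start out empty.
2. Let unordered_map Amap associate each element in onlyA with its corresponding iterator in onlyA.
3. For each element b in B:
   - If b has a corresponding iterator ai in Amap, add b to bothAB and remove b from onlyA using ai.
   - Otherwise, add b to onlyB.
At the end of the above algorithm, onlyA contains the elements that are in A but not in B, onlyB contains the elements that are in B but not in A, and bothAB contains the elements that are in both A and B.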
Below is an implementation of the above. The result is returned as a tuple <onlyA, onlyB, bothAB>.
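#include <list>
#include <tuple>
#include <unordered_map>

// Splits A and B into (only in A, only in B, in both).
// Assumes neither container holds duplicate elements.
template <typename C>
auto venn_ify (const C &A, const C &B) ->
    std::tuple<
        std::list<typename C::value_type>,
        std::list<typename C::value_type>,
        std::list<typename C::value_type>
    >
{
    typedef typename C::value_type T;
    typedef std::list<T> LIST;
    LIST onlyA(A.begin(), A.end()), onlyB, bothAB;
    // Map each element of onlyA to its list iterator; 2*A.size() is the initial bucket count.
    std::unordered_map<T, typename LIST::iterator> Amap(2*A.size());
    for (auto a = onlyA.begin(); a != onlyA.end(); ++a) Amap[*a] = a;
    for (const auto &b : B) {
        auto ai = Amap.find(b);
        if (ai == Amap.end()) onlyB.push_back(b);
        else {
            bothAB.push_back(b);
            onlyA.erase(ai->second);  // O(1) erase; other list iterators stay valid
        }
    }
    return std::make_tuple(onlyA, onlyB, bothAB);
}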
First, it's not clear from your question whether you mean std::set when you speak of sorted sets. If so, then your first reaction should be to use std::vector instead, if you can, working on the original vectors. Just sort them, and then:
std::vector<T> r1;
std::set_intersection( a.cbegin(), a.cend(), b.cbegin(), b.cend(), std::back_inserter( r1 ) );
And the same for r2 and r3.
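Put together, the sort-then-set-operations approach might look like the sketch below. The struct SetOpsResult and the function sorted_set_ops are hypothetical names introduced only for illustration; r1, r2 and r3 follow the question's naming. The inputs are taken by value, on the assumption that the caller's arrays should stay unsorted.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Hypothetical result holder; member names mirror r1, r2, r3 from the question.
template <typename T>
struct SetOpsResult {
    std::vector<T> r1;  // elements in both a and b
    std::vector<T> r2;  // elements only in a
    std::vector<T> r3;  // elements only in b
};

// Hypothetical helper: sorts private copies of the inputs, then runs the
// three standard set algorithms, which all require sorted ranges.
template <typename T>
SetOpsResult<T> sorted_set_ops(std::vector<T> a, std::vector<T> b) {
    std::sort(a.begin(), a.end());
    std::sort(b.begin(), b.end());

    SetOpsResult<T> out;
    std::set_intersection(a.cbegin(), a.cend(), b.cbegin(), b.cend(),
                          std::back_inserter(out.r1));
    std::set_difference(a.cbegin(), a.cend(), b.cbegin(), b.cend(),
                        std::back_inserter(out.r2));
    std::set_difference(b.cbegin(), b.cend(), a.cbegin(), a.cend(),
                        std::back_inserter(out.r3));
    return out;
}
```

Each of the three calls walks both sorted ranges once, so the cost is dominated by the two sorts. The results come back in sorted order rather than in the original order of the inputs.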
Beyond that, I doubt that there's much you can do. Doing everything in a single loop might improve things somewhat:
std::vector<T> onlyA, onlyB, both;  // result vectors (declaration assumed; not shown in the original)
std::sort( a.begin(), a.end() );
std::sort( b.begin(), b.end() );
onlyA.reserve( a.size() );
onlyB.reserve( b.size() );
both.reserve( std::min( a.size(), b.size() ) );
auto ita = a.cbegin();
auto enda = a.cend();
auto itb = b.cbegin();
auto endb = b.cend();
// Merge-style walk over both sorted ranges.
while ( ita != enda && itb != endb ) {
    if ( *ita < *itb ) {
        onlyA.push_back( *ita );
        ++ ita;
    } else if ( *itb < *ita ) {
        onlyB.push_back( *itb );
        ++ itb;
    } else {
        both.push_back( *ita );
        ++ ita;
        ++ itb;
    }
}
// At most one of the two ranges still has elements left.
onlyA.insert( onlyA.end(), ita, enda );
onlyB.insert( onlyB.end(), itb, endb );
The reserve calls could make a difference and, unless most of the elements end up in the same vector, probably won't cost much extra memory.
You can do this in linear time by putting the elements of A into an unordered_map, where the elements from A are the keys. Then check whether each element of B is among the keys in the map.
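A minimal sketch of that idea follows. It swaps in std::unordered_set for the unordered_map mentioned above, since no mapped value is needed here. The names Partition and partition_by_hash are hypothetical, and the code assumes, as the question states, that neither input contains duplicates.

```cpp
#include <unordered_set>
#include <vector>

// Hypothetical result type for illustration.
template <typename T>
struct Partition {
    std::vector<T> onlyA;  // elements in a but not in b
    std::vector<T> onlyB;  // elements in b but not in a
    std::vector<T> both;   // elements in both a and b
};

// Hypothetical helper. Assumes neither a nor b contains duplicates.
template <typename T>
Partition<T> partition_by_hash(const std::vector<T>& a, const std::vector<T>& b) {
    // Elements of a that have not been matched by an element of b yet.
    std::unordered_set<T> unmatched(a.begin(), a.end());

    Partition<T> out;
    for (const T& x : b) {
        // erase returns the number of elements removed (0 or 1 for a set).
        if (unmatched.erase(x) > 0) {
            out.both.push_back(x);
        } else {
            out.onlyB.push_back(x);
        }
    }
    // Walk a again so that onlyA keeps the original order of a.
    for (const T& x : a) {
        if (unmatched.count(x) > 0) {
            out.onlyA.push_back(x);
        }
    }
    return out;
}
```

With average O(1) hash lookups this does O(|A|+|B|) work and leaves both inputs unsorted. If hashing the element type is expensive or collisions are frequent, the sort-based approach may still be faster in practice.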