
What's the difference between std::merge and std::set_union?

The question is clear, my google- and cplusplus.com/reference-fu is failing me.

std::set_union will contain elements that are present in both sets only once. std::merge will contain them twice.

For example, with A = {1, 2, 5}; B = {2, 3, 4}:

  • union will give C = {1, 2, 3, 4, 5}
  • merge will give D = {1, 2, 2, 3, 4, 5}

Both work on sorted ranges, and return a sorted result.

Short example:

#include <algorithm>
#include <iostream>
#include <iterator>  // std::back_inserter, std::begin, std::end
#include <set>
#include <vector>

int main()
{
  std::set<int> A = {1, 2, 5};
  std::set<int> B = {2, 3, 4};

  std::vector<int> out;
  std::set_union(std::begin(A), std::end(A), std::begin(B), std::end(B),
                 std::back_inserter(out));
  for (auto i : out)
  {
    std::cout << i << " ";
  }
  std::cout << '\n';

  out.clear();
  std::merge(std::begin(A), std::end(A), std::begin(B), std::end(B),
             std::back_inserter(out));
  for (auto i : out)
  {
    std::cout << i << " ";
  }
  std::cout << '\n';
}

Output:

1 2 3 4 5 
1 2 2 3 4 5

std::merge keeps all elements from both ranges, with equivalent elements from the first range preceding equivalent elements from the second range in the output. Where equivalent elements appear in both ranges, std::set_union takes only the element from the first range; otherwise each element is merged in order, as with std::merge.

References: ISO/IEC 14882:2003 25.3.4 [lib.alg.merge] and 25.3.5.2 [lib.set.union].

This is the verification I suggested in my comment on the accepted answer: if an element is present N times in one of the input ranges, it will appear N times in the output of set_union. So set_union does not remove duplicate equivalent items in the way we would 'naturally' or 'mathematically' expect. If, however, both input ranges contain a common item only once, then set_union does appear to remove the duplicate.

#include <vector>
#include <algorithm>
#include <iostream>
#include <cassert>

using namespace std;

void printer(int i) { cout << i << ", "; }

int main() {
    int mynumbers1[] = { 0, 1, 2, 3, 3, 4 }; // this is sorted, 3 is dupe
    int mynumbers2[] = { 5 };                // this is sorted


    vector<int> union_result(10);
    set_union(mynumbers1, mynumbers1 + sizeof(mynumbers1)/sizeof(int),
              mynumbers2, mynumbers2 + sizeof(mynumbers2)/sizeof(int),
              union_result.begin());
    for_each(union_result.begin(), union_result.end(), printer);

    return 0;
}

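This will print: 0, 1, 2, 3, 3, 4, 5, 0, 0, 0, (the three trailing zeros are the unused, value-initialised slots of the ten-element union_result vector, not part of the union itself).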

To add to the previous answers: beware that the complexity bound of std::set_union is twice that of std::merge (at most 2 * (N1 + N2) - 1 comparisons, versus at most N1 + N2 - 1). In practice, this means the comparator in std::set_union may be applied to an element after it has been dereferenced, while with std::merge this is never the case.

Why may this be important? Consider something like:

std::vector<Foo> lhs, rhs;

And you want to produce a union of lhs and rhs:

// result: the output std::vector<Foo>
std::set_union(std::cbegin(lhs), std::cend(lhs),
               std::cbegin(rhs), std::cend(rhs),
               std::back_inserter(result));

But now suppose Foo is not copyable, or is very expensive to copy, and you don't need the originals. You may think to use:

// result: the output std::vector<Foo>
std::set_union(std::make_move_iterator(std::begin(lhs)),
               std::make_move_iterator(std::end(lhs)),
               std::make_move_iterator(std::begin(rhs)),
               std::make_move_iterator(std::end(rhs)),
               std::back_inserter(result));

But this is undefined behaviour, as there is a possibility of a moved-from Foo being compared. The correct solution is therefore:

// result: the output std::vector<Foo>
std::merge(std::make_move_iterator(std::begin(lhs)),
           std::make_move_iterator(std::end(lhs)),
           std::make_move_iterator(std::begin(rhs)),
           std::make_move_iterator(std::end(rhs)),
           std::back_inserter(result));
// Drop adjacent duplicates left over from the merge.
result.erase(std::unique(std::begin(result), std::end(result)), std::end(result));

Which has the same complexity as std::set_union.

std::merge merges all elements without eliminating duplicates, while std::set_union eliminates the duplicates. That is, the latter applies the rule of the union operation from set theory.
