
Why does vector comparison with < operator compare each item twice?

In this example, comparing two vectors with the < operator results in operator<, defined on the Integer class, being called twice for each element. However, this doesn't happen when comparing two vectors with the == operator.

#include <iostream>
#include <vector>

class Integer {
public:
    Integer(int value) : m_value(value) {}
    friend bool operator<(const Integer& lhs, const Integer& rhs);
    friend bool operator==(const Integer& lhs, const Integer& rhs);

private:
    int m_value;
};

// Logs each call so the number of comparisons is visible in the output.
bool operator<(const Integer& lhs, const Integer& rhs) {
    std::cout << lhs.m_value << " < " << rhs.m_value << '\n';
    return lhs.m_value < rhs.m_value;
}

bool operator==(const Integer& lhs, const Integer& rhs) {
    std::cout << lhs.m_value << " == " << rhs.m_value << '\n';
    return lhs.m_value == rhs.m_value;
}

int main()
{
    std::vector<Integer> ivec1 {1,2,3};
    std::vector<Integer> ivec2 {1,2,3};
    std::cout << (ivec1 < ivec2) << '\n';
    std::cout << (ivec1 == ivec2) << std::endl;
    return 0;
}

This code prints:

1 < 1
1 < 1
2 < 2
2 < 2
3 < 3
3 < 3
0
1 == 1
2 == 2
3 == 3
1

Why is that so?

If a < b returns false, that alone doesn't tell you whether b < a, so you have to test that as well. That's because the element-by-element ordering of std::vector can have three outcomes for one pair of elements a, b:

  • a < b: the vector comparison returns true.
  • b < a: the vector comparison returns false.
  • Neither of the above: the next pair of elements must be tested.

So it has to compare in both directions. You can see this more clearly by adding identification data to your class:

#include <iostream>
#include <vector>

class Integer {
public:
    Integer(int value, char tag) : m_value(value), m_tag(tag) {}
    friend bool operator<(const Integer& lhs, const Integer& rhs);
    friend bool operator==(const Integer& lhs, const Integer& rhs);

private:
    int m_value;
    char m_tag;  // identifies which vector the element belongs to
};

// The tag is only printed; it does not take part in the comparison.
bool operator<(const Integer& lhs, const Integer& rhs) {
    std::cout << lhs.m_value << ' ' << lhs.m_tag << " < " << rhs.m_value << ' ' << rhs.m_tag << '\n';
    return lhs.m_value < rhs.m_value;
}

bool operator==(const Integer& lhs, const Integer& rhs) {
    std::cout << lhs.m_value << ' ' << lhs.m_tag << " == " << rhs.m_value << ' ' << rhs.m_tag << '\n';
    return lhs.m_value == rhs.m_value;
}

int main()
{
    std::vector<Integer> ivec1 {{1, 'a'}, {2, 'a'}, {3, 'a'}};
    std::vector<Integer> ivec2 {{1, 'b'}, {2, 'b'}, {3, 'b'}};
    std::cout << (ivec1 < ivec2) << '\n';
    std::cout << (ivec1 == ivec2) << std::endl;
    return 0;
}

This produces:

1 a < 1 b
1 b < 1 a
2 a < 2 b
2 b < 2 a
3 a < 3 b
3 b < 3 a
0
1 a == 1 b
2 a == 2 b
3 a == 3 b
1


In order to find the lexicographical ordering of ivec1 and ivec2, the implementation looks for the first index i for which ivec1[i] < ivec2[i] or ivec2[i] < ivec1[i], as that determines the order.

Note that this needs two comparisons whenever ivec1[i] < ivec2[i] is false. That result leaves two possibilities: either ivec1[i] and ivec2[i] compare equivalent, or ivec2[i] < ivec1[i]. The second comparison is needed to decide between them.


As soon as such an index i is found, the implementation can stop the comparison; but since all entries compare equivalent in your example, two comparisons have to be performed for each pair of entries.
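The search described above can be sketched by hand. This is only an illustration, not the code of any particular standard library: Counted, less_calls, lex_less and count_less_calls are hypothetical names introduced here so that the number of operator< calls can be observed.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical instrumentation: counts every call to operator<.
static int less_calls = 0;

struct Counted {
    int value;
};

bool operator<(const Counted& lhs, const Counted& rhs) {
    ++less_calls;
    return lhs.value < rhs.value;
}

// Sketch of a lexicographical "less than" built only on the element's operator<.
bool lex_less(const std::vector<Counted>& a, const std::vector<Counted>& b) {
    const std::size_t n = a.size() < b.size() ? a.size() : b.size();
    for (std::size_t i = 0; i < n; ++i) {
        if (a[i] < b[i]) return true;   // a orders first at index i
        if (b[i] < a[i]) return false;  // b orders first at index i
        // Neither is less: the elements are equivalent, try the next pair.
    }
    return a.size() < b.size();         // a proper prefix orders first
}

// Resets the counter, runs one comparison, and reports how many
// operator< calls it needed.
int count_less_calls(const std::vector<Counted>& a, const std::vector<Counted>& b) {
    less_calls = 0;
    lex_less(a, b);
    return less_calls;
}
```

With two equal three-element vectors, count_less_calls reports 6, matching the six lines printed in the question. If the first elements already differ, the loop stops after one or two calls.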

This is due to a flaw in the design of how C++ currently handles comparison. They are fixing it in C++20; I don't know if it will get to vector, but the fundamental problem will be fixed.

First, the problem.

std::vector's < is based on each element's <. But < is a poor tool for this job.

If you have two elements a and b, to lexicographically order the tuple a,b you need to do:

if (self.a < other.a)
  return true;
if (other.a < self.a)
  return false;
return self.b < other.b;

In general, this requires 2N-1 calls to < if you want to lexicographically order a collection of N elements.

This has been known for a long time, and is why strcmp returns an integer with 3 kinds of values: -1 for less, 0 for equal and +1 for greater (or, in general, a value less than, equal to or greater than zero).

With that you can do:

auto val = compare( self.a, other.a );
if (val != 0) return val;
return compare( self.b, other.b );

This requires at most N calls to compare for a collection of N elements, one call per element.
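The same idea can be applied to a whole sequence. The following is a minimal sketch under stated assumptions: Item, compare, compare_calls and lex_compare are hypothetical names chosen for this illustration, and compare follows the strcmp convention of returning a negative, zero or positive int.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical instrumentation: counts every call to compare().
static int compare_calls = 0;

struct Item {
    int value;
};

// strcmp-style three-way comparison: returns -1, 0 or +1.
int compare(const Item& lhs, const Item& rhs) {
    ++compare_calls;
    return (lhs.value > rhs.value) - (lhs.value < rhs.value);
}

// Lexicographical three-way comparison: one compare() call per element pair.
int lex_compare(const std::vector<Item>& a, const std::vector<Item>& b) {
    const std::size_t n = a.size() < b.size() ? a.size() : b.size();
    for (std::size_t i = 0; i < n; ++i) {
        const int r = compare(a[i], b[i]);
        if (r != 0) return r;           // the first difference decides
    }
    // All shared positions are equal: the shorter sequence orders first.
    return (a.size() > b.size()) - (a.size() < b.size());
}
```

For two equal three-element vectors, lex_compare returns 0 after three calls to compare, where the < based version needs six.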

Now, the fix.

C++20 adds the spaceship comparison operator <=>. It returns a type which can be compared as greater than or less than zero, and whose exact type advertises what guarantees the operation provides.

This acts like C's strcmp, but works on any type that supports it. Further, there are std functions which use <=> if available and otherwise use < and == and the like to emulate it.

Assuming vector's requirements are rewritten to use <=>, types with <=> will avoid the double compare and just be <=>'d at most once each to do the lexicographic ordering of std::vector when < is called.
