
Why does vector comparison with < operator compare each item twice?

In this example, comparing two vectors with the < operator calls the operator< defined on the Integer class twice for each element. This does not happen when the two vectors are compared with the == operator.

#include<iostream>
#include<vector>

class Integer {
    public:
        Integer(int value) : m_value(value) {}
        friend bool operator<(const Integer& lhs, const Integer& rhs);
        friend bool operator==(const Integer& lhs, const Integer& rhs);

    private:
        int m_value;

}; 
bool operator<(const Integer& lhs, const Integer& rhs) {
            std::cout << lhs.m_value << " < " << rhs.m_value << '\n';
            return lhs.m_value < rhs.m_value;
}
bool operator==(const Integer& lhs, const Integer& rhs) {
            std::cout << lhs.m_value << " == " << rhs.m_value << '\n';
            return lhs.m_value == rhs.m_value;
}


int main()
{
    std::vector<Integer> ivec1 {1,2,3};
    std::vector<Integer> ivec2 {1,2,3};
    std::cout << (ivec1 < ivec2) << '\n';
    std::cout << (ivec1 == ivec2) << std::endl;
    return 0;
}

This code prints:

1 < 1
1 < 1
2 < 2
2 < 2
3 < 3
3 < 3
0
1 == 1
2 == 2
3 == 3
1

Why is that so?

If a < b returns false, that doesn't tell you whether b < a, so you have to test that as well. That's because the element-by-element ordering of std::vector can have three outcomes for one pair of elements a, b:

  • a < b: the vector comparison returns true.
  • b < a: the vector comparison returns false.
  • Neither of the above: the next pair of elements must be tested.

So it has to compare both directions. You could see this more clearly by adding identification data to your class:

#include<iostream>
#include<vector>

class Integer {
    public:
        Integer(int value, char tag) : m_value(value), m_tag(tag) {}
        friend bool operator<(const Integer& lhs, const Integer& rhs);
        friend bool operator==(const Integer& lhs, const Integer& rhs);

    private:
        int m_value;
        char m_tag;

}; 
bool operator<(const Integer& lhs, const Integer& rhs) {
            std::cout << lhs.m_value << ' ' << lhs.m_tag << " < " << rhs.m_value << ' ' << rhs.m_tag << '\n';
            return lhs.m_value < rhs.m_value;
}
bool operator==(const Integer& lhs, const Integer& rhs) {
            std::cout << lhs.m_value << ' ' << lhs.m_tag << " == " << rhs.m_value << ' ' << rhs.m_tag << '\n';
            return lhs.m_value == rhs.m_value;
}


int main()
{
    std::vector<Integer> ivec1 {{1, 'a'} ,{2, 'a'}, {3, 'a'}};
    std::vector<Integer> ivec2 {{1, 'b'} ,{2, 'b'}, {3, 'b'}};
    std::cout << (ivec1 < ivec2) << '\n';
    std::cout << (ivec1 == ivec2) << std::endl;
    return 0;
}

This produces:

1 a < 1 b    
1 b < 1 a    
2 a < 2 b    
2 b < 2 a    
3 a < 3 b    
3 b < 3 a    
0    
1 a == 1 b    
2 a == 2 b    
3 a == 3 b    
1


In order to find the lexicographical ordering of ivec1 and ivec2, the implementation looks for the first index i for which ivec1[i] < ivec2[i] or ivec2[i] < ivec1[i], as that would determine the order.

Note how this needs two comparisons if ivec1[i] < ivec2[i] is false. That case leaves two possibilities, namely "ivec1[i] and ivec2[i] compare equivalent" and "ivec2[i] < ivec1[i]". The second comparison is needed to decide between them.


As soon as such an index i is found, the implementation can stop the comparison; but as all entries compare equivalent in your example, two comparisons have to be performed for each pair of entries.
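
The search described above can be sketched as a hand-written loop. This is only an illustration of the idea, not the code of any particular standard library; Counted, less_calls and lexicographic_less are names made up for this example.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical element type; the global counter records calls to operator<.
struct Counted {
    int value;
};

static int less_calls = 0;

bool operator<(const Counted& lhs, const Counted& rhs) {
    ++less_calls;
    return lhs.value < rhs.value;
}

// Lexicographical "less than" built only on the elements' operator<.
bool lexicographic_less(const std::vector<Counted>& a,
                        const std::vector<Counted>& b) {
    const std::size_t n = a.size() < b.size() ? a.size() : b.size();
    for (std::size_t i = 0; i < n; ++i) {
        if (a[i] < b[i]) return true;   // first difference found: a is smaller
        if (b[i] < a[i]) return false;  // first difference found: b is smaller
        // Neither is smaller, so the pair is equivalent: try the next index.
    }
    // One vector is a prefix of the other: the shorter one orders first.
    return a.size() < b.size();
}
```

For two vectors of three equivalent elements, as in the question, this loop calls operator< six times. If the second pair already differs, it stops after three calls.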

This is due to a flaw in how C++ currently handles comparison. It is being fixed in C++20; I don't know if the fix will reach vector, but the fundamental problem will be fixed.

First, the problem.

std::vector's < is based on each element's <. But < is a poor tool for this job.

If you have two elements a and b, to lexicographically order the tuple a,b you need to do:

if (self.a < other.a)
  return true;
if (other.a < self.a)
  return false;
return self.b < other.b;

In general, this requires up to 2N-1 calls to < if you want to lexicographically order a collection of N elements.

This has been known for a long time, and is why strcmp returns an integer with 3 kinds of values: -1 for less, 0 for equal and +1 for greater (or, in general, a value less than, equal to or greater than zero).

With that you can do:

auto val = compare( self.a, other.a );
if (val != 0) return val;
return compare( self.b, other.b );

This requires at most N calls to compare for a collection of N elements: one call per element.
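
A compilable version of that idea might look like the sketch below. Pair, compare_calls and less_than are hypothetical names introduced for illustration, and compare is a hand-written strcmp-style helper, not a standard function.

```cpp
// Hypothetical record with two fields, ordered lexicographically by a, then b.
struct Pair {
    int a;
    int b;
};

static int compare_calls = 0;

// strcmp-style three-way comparison for ints: returns -1, 0, or +1.
int compare(int lhs, int rhs) {
    ++compare_calls;
    return (lhs > rhs) - (lhs < rhs);
}

// One compare() call per field is enough to decide whether to stop or go on.
int compare(const Pair& self, const Pair& other) {
    const int val = compare(self.a, other.a);
    if (val != 0) return val;
    return compare(self.b, other.b);
}

// "Less than" falls out of the three-way result.
bool less_than(const Pair& self, const Pair& other) {
    return compare(self, other) < 0;
}
```

Comparing Pair{1, 2} with Pair{1, 3} this way calls the int-level compare twice. The <-only version above needs three calls to < for the same input.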

Now, the fix.

C++20 adds the spaceship comparison operator <=>. It returns a type which can be compared as greater than or less than zero, and whose exact type advertises what guarantees the operation provides.

This acts like C's strcmp, but works on any type that supports it. Further, there are std functions which use <=> if available and otherwise use < and == and the like to emulate it.

Assuming vector's requirements are rewritten to use <=>, types with <=> will avoid the double compare and be <=>'d at most once each to do the lexicographic ordering of std::vector when < is called.
