
More silent behaviour changes with C++20 three-way comparison

To my surprise, I ran into another snag like "C++20 behaviour breaking existing code with equality operator?".

Consider a simple case-insensitive key type, to be used with, e.g., std::set or std::map:

// Represents case insensitive keys
struct CiKey : std::string {
    using std::string::string;
    using std::string::operator=;

    bool operator<(CiKey const& other) const {
        return boost::ilexicographical_compare(*this, other);
    }
};

Simple tests:

using KeySet   = std::set<CiKey>;
using Mapping  = std::pair<CiKey, int>; // Same with std::tuple
using Mappings = std::set<Mapping>;

int main()
{
    KeySet keys { "one", "two", "ONE", "three" };
    Mappings mappings {
        { "one", 1 }, { "two", 2 }, { "ONE", 1 }, { "three", 3 }
    };

    assert(keys.size() == 3);
    assert(mappings.size() == 3);
}
  • Using C++17, both asserts pass (Compiler Explorer).

  • Switching to C++20, the second assert fails (Compiler Explorer):

    output.s: ./example.cpp:28: int main(): Assertion `mappings.size() == 3' failed.


Obvious Workaround

An obvious work-around is to conditionally supply operator<=> in C++20 mode (Compiler Explorer):

#if defined(__cpp_lib_three_way_comparison)
    std::weak_ordering operator<=>(CiKey const& other) const {
        if (boost::ilexicographical_compare(*this, other)) {
            return std::weak_ordering::less;
        } else if (boost::ilexicographical_compare(other, *this)) {
            return std::weak_ordering::greater;
        }
        return std::weak_ordering::equivalent;
    }
#endif

Question

It surprises me that I ran into another case of breaking changes, where C++20 changes the behaviour of code without a diagnostic.

On my reading of std::tuple::operator< it should have worked:

3-6) Compares lhs and rhs lexicographically by operator<, that is, compares the first elements, if they are equivalent, compares the second elements, if those are equivalent, compares the third elements, and so on. For non-empty tuples, (3) is equivalent to

if (std::get<0>(lhs) < std::get<0>(rhs)) return true;
if (std::get<0>(rhs) < std::get<0>(lhs)) return false;
if (std::get<1>(lhs) < std::get<1>(rhs)) return true;
if (std::get<1>(rhs) < std::get<1>(lhs)) return false;
...
return std::get<N - 1>(lhs) < std::get<N - 1>(rhs);

I understand that technically these don't apply since C++20, and it gets replaced by:

Compares lhs and rhs lexicographically by synthesized three-way comparison (see below), that is, compares the first elements, if they are equivalent, compares the second elements, if those are equivalent, compares the third elements, and so on

Together with

The <, <=, >, >=, and != operators are synthesized from operator<=> and operator== respectively. (since C++20)

The thing is,

  • my type doesn't define operator<=> nor operator==,

  • and as this answer points out, providing operator< in addition would be fine and should be used when evaluating simple expressions like a < b.

  1. Is the behavior change in C++20 correct/on purpose?
  2. Should there be a diagnostic?
  3. Can we use other tools to spot silent breakage like this? It feels like scanning entire code-bases for usage of user-defined types in tuple / pair doesn't scale well.
  4. Are there other types, besides tuple / pair, that could manifest similar changes?

The basic problem comes from the facts that your type is incoherent and the standard library didn't call you on it until C++20. That is, your type was always kind of broken, but things were narrowly enough defined that you could get away with it.

Your type is broken because its comparison operators make no sense. It advertises that it is fully comparable, with all of the available comparison operators defined. This happens because you publicly inherited from std::string, so your type inherits those operators by implicit conversion to the base class. But the behavior of this slate of comparisons is incorrect because you replaced only one of them with a comparison that doesn't work like the rest.

And since the behavior is inconsistent, what could happen is up for grabs once C++ actually cares about you being consistent.

A larger problem, however, is an inconsistency with how the standard treats operator<=>.

The C++ language is designed to give priority to explicitly defined comparison operators before employing synthesized operators. So your type inherited from std::string will use your operator< if you compare two such objects directly.
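A minimal sketch of that priority rule, and of the incoherence it hides. The helper iless is a hypothetical, ASCII-only stand-in for boost::ilexicographical_compare so the snippet needs no Boost, and the function names are made up for illustration. The <=> part is guarded the same way as the work-around in the question, so it only exists in C++20 mode.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>
#include <version>
#if defined(__cpp_lib_three_way_comparison)
#include <compare>
#endif

// Hypothetical stand-in for boost::ilexicographical_compare (ASCII only).
inline bool iless(std::string const& a, std::string const& b) {
    return std::lexicographical_compare(
        a.begin(), a.end(), b.begin(), b.end(),
        [](unsigned char x, unsigned char y) {
            return std::tolower(x) < std::tolower(y);
        });
}

struct CiKey : std::string {
    using std::string::string;

    bool operator<(CiKey const& other) const { return iless(*this, other); }
};

// Direct use of `<` selects the user-provided member operator,
// so "one" and "ONE" are equivalent: neither is less than the other.
inline bool direct_less_is_case_insensitive() {
    CiKey const a("one"), b("ONE");
    return !(a < b) && !(b < a);
}

#if defined(__cpp_lib_three_way_comparison)
// CiKey declares no operator<=>, so `a <=> b` resolves to std::string's
// operator through the derived-to-base conversion and is case-sensitive.
inline bool spaceship_is_case_sensitive() {
    CiKey const a("one"), b("ONE");
    return (a <=> b) != 0;
}
#endif

inline void demo() {
    assert(direct_less_is_case_insensitive());
#if defined(__cpp_lib_three_way_comparison)
    assert(spaceship_is_case_sensitive());
#endif
}
```

The two functions disagree about whether "one" and "ONE" are equivalent, which is exactly the inconsistency that pair and tuple run into in C++20.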

The C++ library, however, sometimes tries to be clever.

Some types attempt to forward the operators provided by a given type, like optional<T>. It is designed to behave identically to T in its comparability, and it succeeds at this.

However, pair and tuple try to be a bit clever. In C++17, these types never actually forwarded comparison behavior; instead, they synthesized comparison behavior based on existing operator< and operator== definitions on the types.
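For reference, the C++17 rule for pair can be sketched as follows. The names Key and pair_less_cxx17 are hypothetical, and Key is an ASCII-only wrapper written just for this illustration; the point is that only the elements' operator< is ever called.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>
#include <utility>

// Hypothetical key with a single, case-insensitive (ASCII) operator<.
struct Key {
    std::string value;

    bool operator<(Key const& other) const {
        return std::lexicographical_compare(
            value.begin(), value.end(), other.value.begin(), other.value.end(),
            [](unsigned char x, unsigned char y) {
                return std::tolower(x) < std::tolower(y);
            });
    }
};

// Sketch of the C++17 definition of pair's operator<: the first elements are
// compared with operator< in both directions, then the second elements.
template <class T1, class T2>
bool pair_less_cxx17(std::pair<T1, T2> const& x, std::pair<T1, T2> const& y) {
    return x.first < y.first || (!(y.first < x.first) && x.second < y.second);
}

inline void demo() {
    std::pair<Key, int> const lower{Key{"one"}, 1};
    std::pair<Key, int> const upper{Key{"ONE"}, 1};
    // The keys are equivalent and the ints are equal: neither pair is less.
    assert(!pair_less_cxx17(lower, upper));
    assert(!pair_less_cxx17(upper, lower));
    // With equivalent keys, the int decides.
    assert(pair_less_cxx17(lower, std::pair<Key, int>{Key{"ONE"}, 2}));
}
```

Under this rule a std::set of such pairs treats { "one", 1 } and { "ONE", 1 } as duplicates, which is why the C++17 build of the question's test passes.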

So it's no surprise that their C++20 incarnations continue that fine tradition of synthesizing comparisons. Of course, since the language got in on that game, the C++20 versions decided that it was best to just follow the language's rules.

Except... they couldn't follow them exactly. There's no way to detect whether a < comparison is synthesized or user-provided. So there's no way to implement the language behavior in one of these types. However, you can detect the presence of three-way comparison behavior.

So they make an assumption: if your type is three-way comparable, then your type is relying on synthesized operators (if it isn't, it uses an improved form of the old method). Which is the right assumption; after all, since <=> is a new feature, old types can't possibly get one.

Unless of course an old type inherits from a new type that gained three-way comparability. And there's no way for a type to detect that either; it either is three-way comparable or it isn't.
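That assumption can be sketched roughly like this. It is a simplified, single-type approximation of the exposition-only synth-three-way helper, not the library's actual code; synth_three_way_sketch, LeakyKey, WrappedKey and iless are hypothetical names, and the whole thing only exists in C++20 mode.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>
#include <version>

#if defined(__cpp_lib_three_way_comparison)
#include <compare>

// Prefer <=> when the type offers one; otherwise build a weak ordering
// out of two calls to operator<.
template <class T>
constexpr auto synth_three_way_sketch(T const& a, T const& b) {
    if constexpr (std::three_way_comparable<T>) {
        return a <=> b;
    } else {
        if (a < b) return std::weak_ordering::less;
        if (b < a) return std::weak_ordering::greater;
        return std::weak_ordering::equivalent;
    }
}

// Hypothetical ASCII-only case-insensitive "less".
inline bool iless(std::string const& a, std::string const& b) {
    return std::lexicographical_compare(
        a.begin(), a.end(), b.begin(), b.end(),
        [](unsigned char x, unsigned char y) {
            return std::tolower(x) < std::tolower(y);
        });
}

// Inherits <=> from std::string, so its own operator< is never consulted.
struct LeakyKey : std::string {
    using std::string::string;
    bool operator<(LeakyKey const& o) const { return iless(*this, o); }
};

// Holds the string as a member, so operator< is the only ordering on offer.
struct WrappedKey {
    std::string value;
    bool operator<(WrappedKey const& o) const { return iless(value, o.value); }
};

inline void demo() {
    // Inherited <=> wins: "one" and "ONE" are ordered case-sensitively.
    assert(synth_three_way_sketch(LeakyKey("one"), LeakyKey("ONE")) != 0);
    // No <=> available: falls back to the case-insensitive operator<.
    assert(synth_three_way_sketch(WrappedKey{"one"}, WrappedKey{"ONE"}) == 0);
}
#endif
```

The first branch is the one the question's CiKey falls into, because the derived type is three-way comparable through its std::string base.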

Now fortunately, the synthesized three-way comparison operators of pair and tuple are perfectly capable of mimicking the C++17 behavior if your type doesn't offer three-way comparison functionality. So you can get back the old behavior by explicitly dis-inheriting the three-way comparison operator in C++20 by deleting the operator<=> overload.

Alternatively, you could use private inheritance and publicly expose, through using-declarations, only the specific APIs you want.

Is the behavior change in C++20 correct/on purpose?

That depends on what you mean by "on purpose".

Publicly inheriting from types like std::string has always been somewhat morally dubious. Not so much because of the slicing/destructor problem, but more because it is kind of a cheat. Inheriting from such types directly opens you up to changes in the API that you didn't expect and that may not be appropriate for your type.

The new comparison versions of pair and tuple are doing their jobs, and doing them as well as C++ permits. It's just that your type inherited something it didn't want. If you had privately inherited from std::string and only using-exposed the functionality you wanted, your type would likely be fine.

Should there be a diagnostic?

This can't be diagnosed outside of some compiler-intrinsic.

Can we use other tools to spot silent breakage like this?

Search for cases where you're publicly inheriting from standard library types.
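Once such types are found, one low-tech guard is a compile-time check that a type meant to be ordered by operator< alone has not picked up a three-way comparison. This is only a sketch for types you already know about, not a general diagnostic; orders_by_less_only, LeakyKey and WrappedKey are hypothetical names, and the by-length ordering is a placeholder because the body of operator< is irrelevant to the check.

```cpp
#include <cassert>
#include <string>
#include <version>
#if defined(__cpp_lib_three_way_comparison)
#include <compare>
#endif

// Publicly derived: picks up std::string's comparison operators.
struct LeakyKey : std::string {
    using std::string::string;
    bool operator<(LeakyKey const& o) const { return size() < o.size(); }
};

// Composition: exposes only the operators it declares itself.
struct WrappedKey {
    std::string value;
    bool operator<(WrappedKey const& o) const {
        return value.size() < o.value.size();
    }
};

#if defined(__cpp_lib_three_way_comparison)
// Hypothetical trait: true when T has no usable <=>, so pair and tuple
// have to fall back to T's operator<.
template <class T>
inline constexpr bool orders_by_less_only = !std::three_way_comparable<T>;

static_assert(orders_by_less_only<WrappedKey>,
              "WrappedKey must be ordered by operator< only");
// The same assertion on LeakyKey would fail to compile, because <=> leaks in
// from std::string:
// static_assert(orders_by_less_only<LeakyKey>);
#endif
```

Placing such a static_assert next to each key type turns the silent behaviour change into a build error when the code is moved to C++20.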

Ah! @StoryTeller nailed it with their comment:

"my type doesn't define operator<=> nor operator==" - but std::string does, making it a candidate due to the d[e]rived-to-base conversion. I believe all standard library types that support comparison had their members overhauled.

Indeed, a much quicker work-around is:

#if defined(__cpp_lib_three_way_comparison)
    std::weak_ordering operator<=>(
        CiKey const&) const = delete;
#endif

Success! (Compiler Explorer)

Better Ideas

A better solution, as hinted by StoryTeller's second comment:

I guess non-virtual destructors are no longer the sole compelling reason to avoid inheriting from standard library containers :/

would be to avoid inheritance here:

// represents case insensitive keys
struct CiKey {
    std::string _value;

    bool operator<(CiKey const& other) const {
        return boost::ilexicographical_compare(_value, other._value);
    }
};

Of course this requires (some) downstream changes to the using code, but it's conceptually purer and insulates against this type of "standard creep" in the future.

Compiler Explorer

#include <boost/algorithm/string.hpp>
#include <cassert>
#include <iostream>
#include <set>
#include <string>
#include <tuple>
#include <version>

// represents case insensitive keys
struct CiKey {
    std::string _value;

    bool operator<(CiKey const& other) const {
        return boost::ilexicographical_compare(_value, other._value);
    }
};

using KeySet   = std::set<CiKey>;
using Mapping  = std::tuple<CiKey, int>;
using Mappings = std::set<Mapping>;

int main()
{
    KeySet keys { { "one" }, { "two" }, { "ONE" }, { "three" } };
    Mappings mappings { { { "one" }, 1 }, { { "two" }, 2 }, { { "ONE" }, 1 },
        { { "three" }, 3 } };

    assert(keys.size() == 3);
    assert(mappings.size() == 3);
}

Remaining Questions

How can we diagnose problems like these? They're so subtle they will escape code review. The situation is exacerbated by there being two decades of standard C++ where this worked perfectly fine and predictably.

I guess as a sidenote, we can expect any "lifted" operators (thinking of std::variant/std::optional) to have similar pitfalls when used with user-defined types that inherit too much from standard library types.

This is not really an answer on the different behaviors of std::string::operator=(), but I must point out that creating case insensitive strings should be done via the customization template parameter Traits.

Example:

// definition of basic_string:
template<
    class CharT,
    class Traits = std::char_traits<CharT>,   // <- this is the customization point.
    class Allocator = std::allocator<CharT>
> class basic_string;

The example of a case-insensitive string comes almost straight from cppreference (https://en.cppreference.com/w/cpp/string/char_traits). I've added using aliases for the case-insensitive string types.

#include <cctype>
#include <cwctype>
#include <iostream>
#include <locale>
#include <string>
#include <version>

template <typename CharT> struct ci_traits : public std::char_traits<CharT>
{
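    // CICE expands to constexpr only when the library's char_traits is constexpr;
    // otherwise it must still be defined (as nothing) for the code to compile.
    #ifdef __cpp_lib_constexpr_char_traits
    #define CICE constexpr
    #else
    #define CICE
    #endif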

private:
    using base = std::char_traits<CharT>;
    using int_type = typename base::int_type;

    static CICE CharT to_upper(CharT ch)
    {
        if constexpr (sizeof(CharT) == 1)
            return std::toupper(static_cast<unsigned char>(ch));
        else
            return std::toupper(CharT(ch & 0xFFFF), std::locale{});
    }

public:
    using base::to_int_type;
    using base::to_char_type;

    static CICE bool eq(CharT c1, CharT c2)
    {
        return to_upper(c1) == to_upper(c2);
    }
    static CICE bool lt(CharT c1, CharT c2)
    {
        return to_upper(c1) < to_upper(c2);
    }
    static CICE bool eq_int_type(const int_type& c1, const int_type& c2)
    {
        return to_upper(to_char_type(c1)) == to_upper(to_char_type(c2));
    }
    static CICE int compare(const CharT *s1, const CharT *s2, std::size_t n)
    {
        while (n-- != 0)
        {
            if (to_upper(*s1) < to_upper(*s2))
                return -1;
            if (to_upper(*s1) > to_upper(*s2))
                return 1;
            ++s1;
            ++s2;
        }
        return 0;
    }
    static CICE const CharT *find(const CharT *s, std::size_t n, CharT a)
    {
        auto const ua(to_upper(a));
        while (n-- != 0) {
            if (to_upper(*s) == ua)
                return s;
            s++;
        }
        return nullptr;
    }
    #undef CICE
};

using ci_string = std::basic_string<char, ci_traits<char>>;
using ci_wstring = std::basic_string<wchar_t, ci_traits<wchar_t>>;

// TODO consider constexpr support
template <typename CharT, typename Alloc>
inline std::basic_string<CharT, std::char_traits<CharT>, Alloc> string_cast(
    const std::basic_string<CharT, ci_traits<CharT>, Alloc> &src)
{
    return std::basic_string<CharT, std::char_traits<CharT>, Alloc>{
        src.begin(), src.end(), src.get_allocator()};
}

template <typename CharT, typename Alloc>
inline std::basic_string<CharT, ci_traits<CharT>, Alloc> ci_string_cast(
    const std::basic_string<CharT, std::char_traits<CharT>, Alloc> &src)
{
    return std::basic_string<CharT, ci_traits<CharT>, Alloc>{
        src.begin(), src.end(), src.get_allocator()};
}

int main(int argc, char**) {
    if (argc<=1)
    {
        std::cout << "char\n";
        ci_string hello = "hello";
        ci_string Hello = "Hello";

        // convert a ci_string to a std::string
        std::string x = string_cast(hello);

        // convert a std::string to a ci_string
        auto ci_hello = ci_string_cast(x);

        if (hello == Hello)
            std::cout << string_cast(hello) << " and " << string_cast(Hello)
                    << " are equal\n";

        if (hello == "HELLO")
            std::cout << string_cast(hello) << " and "
                    << "HELLO"
                    << " are equal\n";
    }
    else
    {
        std::cout << "wchar_t\n";
        ci_wstring hello = L"hello";
        ci_wstring Hello = L"Hello";

        // convert a ci_wstring to a std::wstring
        std::wstring x = string_cast(hello);

        // convert a std::wstring to a ci_wstring
        auto ci_hello = ci_string_cast(x);

        if (hello == Hello)
            std::wcout << string_cast(hello) << L" and " << string_cast(Hello) << L" are equal\n";

        if (hello == L"HELLO")
            std::wcout << string_cast(hello) << L" and " << L"HELLO" << L" are equal\n";
    }
}

You can play with it here: https://godbolt.org/z/5ec5sz
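To tie this back to the original test, here is a cut-down sketch of the same idea used as a key in a std::set of pairs. ci_char_traits, ci_key and count_distinct_mappings are hypothetical names, the traits handle char only, and the case folding is ASCII-oriented. Because every comparison on the string type goes through the traits, there is no second, case-sensitive ordering for pair to find.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <set>
#include <string>
#include <utility>

// Minimal char-only traits that fold case before comparing.
struct ci_char_traits : std::char_traits<char> {
    static int fold(char c) {
        return std::toupper(static_cast<unsigned char>(c));
    }
    static bool eq(char a, char b) { return fold(a) == fold(b); }
    static bool lt(char a, char b) { return fold(a) < fold(b); }
    static int compare(char const* s1, char const* s2, std::size_t n) {
        for (; n != 0; --n, ++s1, ++s2) {
            if (fold(*s1) < fold(*s2)) return -1;
            if (fold(*s1) > fold(*s2)) return 1;
        }
        return 0;
    }
    static char const* find(char const* s, std::size_t n, char a) {
        for (; n != 0; --n, ++s)
            if (fold(*s) == fold(a)) return s;
        return nullptr;
    }
};

using ci_key = std::basic_string<char, ci_char_traits>;

// Same data as the question's Mappings test, keyed by ci_key.
inline std::size_t count_distinct_mappings() {
    std::set<std::pair<ci_key, int>> mappings{
        {"one", 1}, {"two", 2}, {"ONE", 1}, {"three", 3}};
    return mappings.size();
}

inline void demo() {
    // { "one", 1 } and { "ONE", 1 } collapse into a single element.
    assert(count_distinct_mappings() == 3);
    assert(ci_key("Hello") == ci_key("hELLO"));
}
```

The trade-off is that ci_key is a different type from std::string, so conversions such as the string_cast helpers above are needed at the boundaries.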
