Why does std::ends cause string comparison to fail?

I spent about 4 hours yesterday trying to fix this issue in my code. I simplified the problem to the example below.

The idea is to store a string in a stringstream ending with std::ends, then retrieve it later and compare it to the original string.

#include <sstream>
#include <iostream>
#include <string>

int main( int argc, char** argv )
{
    const std::string HELLO( "hello" );

    std::stringstream testStream;

    testStream << HELLO << std::ends;

    std::string hi = testStream.str();

    if( HELLO == hi )
    {
        std::cout << HELLO << "==" << hi << std::endl;
    }

    return 0;
}

As you can probably guess, the above code prints nothing when executed.

Although, if printed out, or looked at in the debugger (VS2005), HELLO and hi look identical, their .length() in fact differs by 1. That's what I am guessing is causing the "==" operator to fail.

My question is why. I do not understand why std::ends is an invisible character added to string hi, making hi and HELLO different lengths even though they have identical content. Moreover, this invisible character does not get trimmed with boost trim. However, if you use strcmp to compare .c_str() of the two strings, the comparison works correctly.

The reason I used std::ends in the first place is because I've had issues in the past with stringstream retaining garbage data at the end of the stream. std::ends solved that for me.

std::ends inserts a null character into the stream. Getting the content as a std::string will retain that null character and create a string with that null character at the respective position.

So indeed a std::string can contain embedded null characters. The following std::string contents are different:

ABC
ABC\0

A binary zero is not whitespace. But it's also not printable, so you won't see it (unless your terminal displays it specially).
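To make the difference between those two contents visible, here is a minimal sketch. The function names abc_with_null and show_nulls are made up for this illustration, and the code assumes a C++11 compiler for the range-based for loop.

```cpp
#include <string>

// Builds "ABC" followed by one embedded null character.
// The (pointer, count) constructor copies exactly `count` chars,
// so the '\0' becomes part of the string's contents.
std::string abc_with_null()
{
    return std::string("ABC\0", 4);
}

// Hypothetical helper: renders a string with every null character
// replaced by the two visible characters backslash and zero.
std::string show_nulls(const std::string& s)
{
    std::string out;
    for (char c : s)
    {
        if (c == '\0')
            out += "\\0";
        else
            out += c;
    }
    return out;
}
```

Calling abc_with_null().size() gives 4 while std::string("ABC").size() gives 3, and passing the first string through show_nulls before printing makes the extra character show up.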

Comparing using strcmp will interpret the content of a std::string as a C string when you pass .c_str(). It will say

Hmm, characters before the first \0 (terminating null character) are ABC, so I take it the string is ABC

And thus, it will not see any difference between the two above. You are probably having this issue:

std::stringstream s;
s << "hello";
s.seekp(0);
s << "b";
assert(s.str() == "b"); // will fail!

The assert will fail, because the sequence that the stringstream uses is still the old one that contains "hello". All you did was overwrite the first character. You want to do this:

std::stringstream s;
s << "hello";
s.str(""); // reset the sequence
s << "b";
assert(s.str() == "b"); // will succeed!

Also read this answer: How to reuse an ostringstream
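The two snippets above can be folded into one small sketch. The names overwrite_demo, reset_stream and reuse_demo are hypothetical. The extra clear() call is an addition that is not in the snippets above; it is only needed if earlier reads left error flags such as eofbit set on the stream.

```cpp
#include <sstream>
#include <string>

// Seeking back and writing only overwrites characters in place,
// so the rest of the old sequence is still there.
std::string overwrite_demo()
{
    std::stringstream s;
    s << "hello";
    s.seekp(0);
    s << "b";
    return s.str();   // "bello", not "b"
}

// Hypothetical helper: empties a stringstream so it can be reused.
void reset_stream(std::stringstream& s)
{
    s.str("");   // replace the underlying sequence with an empty one
    s.clear();   // reset eofbit/failbit left over from earlier reads
}

std::string reuse_demo()
{
    std::stringstream s;
    s << "hello";
    reset_stream(s);
    s << "b";
    return s.str();   // "b"
}
```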

std::ends simply inserts a null character. Traditionally, strings in C and C++ are terminated with a null (ASCII 0) character, however it turns out that std::string doesn't really require this. Anyway, stepping through your code point by point, we see a few interesting things going on:

int main( int argc, char** argv )
{

The string literal "hello" is a traditional zero-terminated string constant. Its characters, without the terminating null, are copied into the std::string HELLO, whose length is therefore 5.

    const std::string HELLO( "hello" );

    std::stringstream testStream;

We now put the five characters of the string HELLO into the stream, followed by a single null character, which is put there by std::ends.

    testStream << HELLO << std::ends;

We extract a copy of what we put into the stream (the characters of "hello" plus the one null character).

    std::string hi = testStream.str();

We then compare the two strings using operator== on the std::string class. This operator compares the lengths of the string objects as well as their characters, and the length counts any trailing null characters. Note that std::string does not rely on a trailing null to mark the end of its contents. Put another way, it allows the string to contain null characters, so the null written by std::ends is treated as part of the string hi.

Since hi carries one trailing null character and HELLO carries none, the comparison fails.

    if( HELLO == hi )
    {
        std::cout << HELLO << "==" << hi << std::endl;
    }

    return 0;
}

Although, if printed out, or looked at in the debugger (VS2005), HELLO and hi look identical, their .length() in fact differs by 1. That's what I am guessing is causing the "==" operator to fail.

The reason is that the lengths differ by one trailing null character.
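A quick way to check this is to wrap the stream round trip in a function and look at the result. This is only a sketch; with_ends is a hypothetical name, not something from the question.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Returns the stream contents after writing `text` followed by std::ends.
std::string with_ends(const std::string& text)
{
    std::stringstream stream;
    stream << text << std::ends;   // std::ends inserts one '\0'
    return stream.str();           // str() keeps that '\0'
}
```

For the question's input, with_ends("hello") has size 6 and its last character is '\0', while its first five characters still match "hello".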

My question is why. I do not understand why std::ends is an invisible character added to string hi, making hi and HELLO different lengths even though they have identical content. Moreover, this invisible character does not get trimmed with boost trim. However, if you use strcmp to compare .c_str() of the two strings, the comparison works correctly.

strcmp is different from std::string: it dates back to the early days when strings were terminated with a null, so when it gets to the first trailing null in hi it stops looking.
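The two notions of equality can be put side by side in a short sketch. Both function names are made up for this illustration.

```cpp
#include <cstring>
#include <string>

// Compares the way strcmp does: only up to the first '\0' in each buffer.
bool equal_as_c_strings(const std::string& a, const std::string& b)
{
    return std::strcmp(a.c_str(), b.c_str()) == 0;
}

// Compares the way std::string does: length and every character,
// embedded or trailing nulls included.
bool equal_as_std_strings(const std::string& a, const std::string& b)
{
    return a == b;
}
```

Given std::string("hello") and std::string("hello\0", 6), the first function reports them equal and the second does not, which matches what the question observed.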

The reason I used std::ends in the first place is because I've had issues in the past with stringstream retaining garbage data at the end of the stream. std::ends solved that for me.

Sometimes it is a good idea to understand the underlying representation.

I think a good way to compare strings is to use the std::find method. Do not mix C functions with std::string ones!

You're appending a null character to the stream after HELLO with std::ends. When you initialize hi with str(), that null character is kept, so the strings are different. strcmp doesn't compare std::strings, it compares char* (it's a C function).

std::ends adds a null terminator, (char)'\0'. You'd use it with the deprecated strstream classes, to add the null terminator.

You don't need it with stringstream, and in fact it screws things up, because to stringstream the null terminator isn't "the special null terminator that ends a string"; it's just another character, the zeroth character. stringstream just adds it, and that increases the character count (in your case) to six, and makes the comparison to "hello" fail.
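In practice the simplest fix is to leave std::ends out. If a string that already carries trailing nulls has to be cleaned up, something along these lines should work; round_trip and strip_trailing_nulls are hypothetical helpers sketched here, not part of the standard library or Boost.

```cpp
#include <sstream>
#include <string>

// Writes `text` to a stringstream without std::ends and reads it back.
std::string round_trip(const std::string& text)
{
    std::stringstream stream;
    stream << text;
    return stream.str();
}

// Hypothetical helper: drops any '\0' characters at the end of `s`.
std::string strip_trailing_nulls(std::string s)
{
    const std::string::size_type last = s.find_last_not_of('\0');
    if (last == std::string::npos)
        s.clear();           // empty, or nothing but nulls
    else
        s.erase(last + 1);   // keep everything up to the last non-null
    return s;
}
```

With this, round_trip("hello") compares equal to "hello", and stripping a string built with std::ends brings it back to five characters.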
