stringstream, string, and char* conversion confusion

My question boils down to this: where does the string returned from stringstream.str().c_str() live in memory, and why can't it be assigned to a const char*?

This code example will explain it better than I can:

#include <string>
#include <sstream>
#include <iostream>
#include <cstdlib>   // for system()

using namespace std;

int main()
{
    stringstream ss("this is a string\n");

    string str(ss.str());

    const char* cstr1 = str.c_str();

    const char* cstr2 = ss.str().c_str();

    cout << cstr1   // Prints correctly
        << cstr2;   // ERROR, prints out garbage

    system("PAUSE");

    return 0;
}

The assumption that stringstream.str().c_str() could be assigned to a const char* led to a bug that took me a while to track down.

For bonus points, can anyone explain why replacing the cout statement with

cout << cstr            // Prints correctly
    << ss.str().c_str() // Prints correctly
    << cstr2;           // Prints correctly (???)

prints the strings correctly?

I'm compiling in Visual Studio 2008.

stringstream.str() returns a temporary string object that's destroyed at the end of the full expression. If you get a pointer to a C string from that (stringstream.str().c_str()), it will point to a string which is deleted where the statement ends. That's why your code prints garbage.
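One way to observe this rule without ever dereferencing a dangling pointer is to count live objects. The sketch below is only an illustration: `Counted`, `make_counted`, `live_during_expression`, `live_after_statement` and `demo` are made-up names, and `Counted` merely stands in for the std::string that str() returns.

```cpp
#include <cassert>

// Hypothetical stand-in for the std::string returned by str():
// it only tracks how many instances are currently alive.
struct Counted {
    static int alive;
    Counted() { ++alive; }
    Counted(const Counted&) { ++alive; }
    ~Counted() { --alive; }
    // Plays the role of c_str(): callable only while *this exists.
    int live_now() const { return alive; }
};
int Counted::alive = 0;

// Returns by value, like stringstream::str().
Counted make_counted() { return Counted(); }

// Live objects while the temporary is still inside its full expression.
int live_during_expression() {
    return make_counted().live_now();
}

// Live objects after the statement that created the temporary has ended.
int live_after_statement() {
    int during = make_counted().live_now();  // temporary dies at this ';'
    (void)during;
    return Counted::alive;
}

// Usage sketch.
void demo() {
    assert(live_during_expression() == 1);  // still alive inside the expression
    assert(live_after_statement() == 0);    // gone once the statement is over
}
```

A pointer obtained from the temporary inside the first function would be fine to use there; the same pointer kept until the second function's return statement would dangle.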

You could copy that temporary string object to some other string object and take the C string from that one:

const std::string tmp = stringstream.str();
const char* cstr = tmp.c_str();

Note that I made the temporary string const, because any changes to it might cause it to re-allocate and thus render cstr invalid. It is therefore safer not to store the result of the call to str() at all and to use cstr only until the end of the full expression:

use_c_str( stringstream.str().c_str() );
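As a concrete sketch of that pattern, std::strlen can play the role of the placeholder use_c_str. The helper name `c_length` and the `demo` function are assumptions made for this example only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <sstream>
#include <string>

// Hypothetical helper: std::strlen stands in for use_c_str().
// The temporary string returned by str() lives until the end of the
// full expression, so the pointer is valid for the whole strlen call.
std::size_t c_length(const std::stringstream& ss) {
    return std::strlen(ss.str().c_str());
}

// Usage sketch: the length includes the trailing '\n'.
void demo() {
    std::stringstream ss("this is a string\n");
    assert(c_length(ss) == 17);
}
```

The pointer never outlives the expression that produced it, so nothing can dangle.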

Of course, keeping every use of the pointer within a single expression might not be easy, and copying might be too expensive. What you can do instead is to bind the temporary to a const reference. This will extend its lifetime to the lifetime of the reference:

{
  const std::string& tmp = stringstream.str();   
  const char* cstr = tmp.c_str();
}

IMO that's the best solution. Unfortunately it's not very well known.
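A minimal sketch of the const-reference approach in use, assuming a made-up helper `first_word` that needs the C string across several statements:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <sstream>
#include <string>

// Hypothetical helper: returns the first space-delimited word of the
// stream's contents, going through a const char* on the way.
std::string first_word(const std::stringstream& ss) {
    const std::string& tmp = ss.str();          // temporary now lives as long as tmp
    const char* cstr = tmp.c_str();             // valid until tmp goes out of scope
    std::size_t len = std::strcspn(cstr, " ");  // a later statement, pointer still valid
    return std::string(cstr, len);
}

// Usage sketch.
void demo() {
    std::stringstream ss("this is a string\n");
    assert(first_word(ss) == "this");
}
```

Without the reference (that is, with `const char* cstr = ss.str().c_str();`), the strcspn call would read from a destroyed string.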

What you're doing is creating a temporary. That temporary lives only as long as the language requires: until the end of the full expression in which it was created.

As soon as the statement const char* cstr2 = ss.str().c_str(); is complete, the compiler sees no reason to keep the temporary string around, so it is destroyed, and your const char* is left pointing to freed memory.

Your statement string str(ss.str()); means that the temporary is used in the constructor of the string variable str that you've put on the local stack, and that variable stays around as long as you'd expect: until the end of the block or function you've written. Therefore the const char* obtained from it still points to valid memory when you reach the cout.

In this line:

const char* cstr2 = ss.str().c_str();

ss.str() will make a copy of the contents of the stringstream. When you call c_str() on the same line, you'll be referencing legitimate data, but after that line the string will be destroyed, leaving your char* pointing to unowned memory.

The std::string object returned by ss.str() is a temporary object whose lifetime is limited to the expression. So you cannot keep a pointer into that temporary beyond the expression without getting garbage.

Now, there is one exception: if you bind the temporary object to a const reference, its lifetime is extended to that of the reference, and it is legal to keep using it. For example, you could write:

#include <string>
#include <sstream>
#include <iostream>
#include <cstdlib>   // for system()

using namespace std;

int main()
{
    stringstream ss("this is a string\n");

    string str(ss.str());

    const char* cstr1 = str.c_str();

    const std::string& resultstr = ss.str();
    const char* cstr2 = resultstr.c_str();

    cout << cstr1       // Prints correctly
        << cstr2;       // Also prints correctly: the const reference keeps resultstr alive

    system("PAUSE");

    return 0;
}

That way the string stays alive for longer.

Now, you should know that there is an optimisation called RVO (return value optimisation): if the compiler sees an object being initialized from a function call, and that function returns a temporary, it does not make a copy but constructs the result directly in the object being initialized. So you don't actually need a reference; it is only necessary if you want to be sure that no copy is made. So doing:

 std::string resultstr = ss.str();
 const char* cstr2 = resultstr.c_str();

would be better and simpler.
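The claim about the elided copy can be checked with a small counter. This is a sketch under assumptions: `CopyCounter`, `make_value`, `copies_for_named_object` and `demo` are hypothetical names, not part of the code above. C++17 guarantees the elision in this situation; earlier standards only permit it, although mainstream compilers perform it by default.

```cpp
#include <cassert>

// Hypothetical type that counts copy constructions.
struct CopyCounter {
    static int copies;
    CopyCounter() {}
    CopyCounter(const CopyCounter&) { ++copies; }
};
int CopyCounter::copies = 0;

// Returns a temporary by value, like ss.str().
CopyCounter make_value() { return CopyCounter(); }

// Initializes a named object from the returned temporary and reports
// how many copy constructions that took.
int copies_for_named_object() {
    CopyCounter::copies = 0;
    CopyCounter named = make_value();  // constructed in place when the copy is elided
    (void)named;
    return CopyCounter::copies;
}

// Usage sketch: assumes the compiler elides the copy (guaranteed since C++17).
void demo() {
    assert(copies_for_named_object() == 0);
}
```

With the copy gone, a named std::string costs no more than the const reference and is arguably easier to read.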

The ss.str() temporary is destroyed after the initialization of cstr2 is complete. So when you print it with cout, the C-string that was associated with that std::string temporary has long been destroyed. You are lucky if it crashes or asserts, and unlucky if it prints garbage or appears to work.

const char* cstr2 = ss.str().c_str();

The C-string that cstr1 points to, however, is associated with a string that still exists at the time you do the cout, so it correctly prints the result.

In the following code, the first cstr is correct (I assume it is cstr1 in the real code?). The second prints the C-string associated with the temporary string object ss.str(). The object is destroyed at the end of evaluating the full-expression in which it appears. The full-expression is the entire cout << ... expression, so while the C-string is output, the associated string object still exists. For cstr2, it is pure badness that it succeeds. Most likely the implementation chooses the same storage location for the new temporary that it already chose for the temporary used to initialize cstr2. It could just as well crash.

cout << cstr            // Prints correctly
    << ss.str().c_str() // Prints correctly
    << cstr2;           // Prints correctly (???)
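The well-defined middle line can be sketched on its own. To make the result inspectable, this example writes to a std::ostringstream instead of cout; `stream_c_str` and `demo` are made-up names.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper mirroring `cout << ss.str().c_str()`.
std::string stream_c_str(const std::stringstream& ss) {
    std::ostringstream out;
    // The temporary from ss.str() is destroyed only after the whole
    // `out << ...` full expression, i.e. after its characters were written.
    out << "[" << ss.str().c_str() << "]";
    return out.str();
}

// Usage sketch.
void demo() {
    std::stringstream ss("this is a string\n");
    assert(stream_c_str(ss) == "[this is a string\n]");
}
```

Nothing comparable can be written for cstr2: reading through it is undefined behaviour, so no particular output can be relied on.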

The return value of c_str() will usually just point to the internal string buffer, but that's not a requirement. The string could build a separate buffer if, for example, its internal implementation is not contiguous (that's well possible, but in the next C++ Standard, C++11, strings need to be contiguously stored).

In GCC, strings use reference counting and copy-on-write. Thus, you will find that the following holds true (it does, at least on my GCC version):

string a = "hello";
string b(a);
assert(a.c_str() == b.c_str());

The two strings share the same buffer here. As soon as you change one of them, the buffer is copied and each string holds its own copy. Other string implementations do things differently, though.
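Whether the buffers are shared is implementation-specific, so a portable check should not depend on it. As far as I know, C++11 tightened the requirements on std::string in a way that rules out copy-on-write, and newer libstdc++ versions no longer share buffers by default. The sketch below (function names are made up for illustration) separates the part that varies from the part that holds everywhere.

```cpp
#include <cassert>
#include <string>

// Implementation-specific: true with copy-on-write strings,
// false where every string owns its buffer.
bool copy_shares_buffer() {
    std::string a = "hello";
    std::string b(a);
    return a.c_str() == b.c_str();
}

// Holds on every implementation: once one string is modified,
// the two must use distinct buffers and keep distinct contents.
bool buffers_differ_after_write() {
    std::string a = "hello";
    std::string b(a);
    b[0] = 'j';  // non-const access; a copy-on-write string unshares here
    return a.c_str() != b.c_str() && a == "hello" && b == "jello";
}

// Usage sketch: only the portable property is asserted.
void demo() {
    assert(buffers_differ_after_write());
}
```

Either way, a pointer obtained from c_str() should be treated as invalid after any non-const operation on the string.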
