
C++: Is "my text" a std::string, a char* or a C-string?

I have just made what appears to be a common newbie mistake:

First, we read one of the many tutorials that go like this:

 #include <fstream>
 int main() {
      using namespace std;
      ifstream inf("file.txt");
      // (...)
 }  

Secondly, we try to use something similar in our code, which goes something like this:

#include <fstream>
#include <string>
int main() {
    using namespace std;
    std::string file = "file.txt"; // Or get the name of the file 
                                   // from a function that returns std::string.
    ifstream inf(file);
    // (...)
}

Thirdly, the newbie developer is perplexed by some cryptic compiler error message.

The problem is that ifstream takes a const char* as its constructor argument.

The solution is to convert the std::string to a const char*.

Now, the real problem is that, for a newbie, "file.txt" or similar examples given in almost all the tutorials look very much like a std::string.

So, is "my text" a std::string, a c-string or a char*, or does it depend on the context?

Can you provide examples of how "my text" would be interpreted differently according to context?

[Edit: I thought the example above would have made it obvious, but I should have been more explicit nonetheless: what I mean is the type of any string enclosed within double quotes, i.e. "myfilename.txt", not the meaning of the word 'string'.]

Thanks.

So, is "string" a std::string, a c-string or a char*, or does it depend on the context?

  • Neither C nor C++ has a built-in string data type, so any double-quoted string in your code is essentially a const char * (or a const char [] to be exact). "C string" usually refers to this: a character array with a null terminator.
  • In C++, std::string is a convenience class that wraps a raw string in an object. By using it, you can avoid doing (messy) pointer arithmetic and memory reallocation yourself.
  • Many standard library functions still take only char * (or const char *) parameters.
  • You can implicitly convert a char * into a std::string because the latter has a constructor to do that.
  • You must explicitly convert a std::string into a const char * by using the c_str() method.

Thanks to Clark Gaebel for pointing out constness, and to jalf and GMan for mentioning that it is actually an array.

"myString" is a string literal, and has the type const char[9], an array of 9 constant char. Note that it has enough space for the null terminator. So "Hi" is a const char[3], and so forth.

This is pretty much always true, with no ambiguity. However, whenever necessary, a const char[9] will decay into a const char* that points to its first element. And std::string has an implicit constructor that accepts a const char*. So while it always starts as an array of char, it can become the other types if you need it to.

Note that string literals have the unique property that const char[N] can also decay into char*, but this behavior is deprecated (and C++11 removed it). If you try to modify the underlying string this way, you end up with undefined behavior. It's just not a good idea.

std::string file = "file.txt"; 

The right-hand side of the = contains a plain string literal (i.e. a null-terminated byte string). Its effective type is array of const char.

The = is a tricky pony here: no assignment happens. The std::string class has a constructor that takes a pointer to char as an argument. It is called to create a temporary std::string, which is then used to copy-construct (using the copy ctor of std::string) the object file of type std::string.

The compiler is free to elide the copy ctor and instantiate file directly, though.

However, note that std::string is not the same thing as a C-style null-terminated string. It is not even required to be null-terminated.

ifstream inf("file.txt");

The std::ifstream class has a ctor that takes a const char *, and the string literal passed to it decays to a pointer to the first element of the string.

The thing to remember is this: std::string provides (almost seamless) conversion from C-style strings. You have to look up the signature of the function to see if you are passing in a const char * or a std::string (the latter because of implicit conversions).

So, is "string" a std::string, a c-string or a char*, or does it depend on the context?

It depends entirely on the context. :-) Welcome to C++.

A C string is a null-terminated string, which is almost always the same thing as a char*.

Depending on the platforms and frameworks you are using, there might be even more meanings of the word "string" (for example, it is also used to refer to QString in Qt or CString in MFC).

Neither C nor C++ has a built-in string data type.

When the compiler encounters a double-quoted string that is used implicitly through a pointer (see the code below), it stores the string itself in the program image and generates a character array for it:

  • The array is created in static storage because it must persist so that it can be referred to later.
  • The array is made constant because it must always contain the original data (Hello).

So in the end, what you have is a const char * to this constant static character array.

const char* v()
{
    const char* text = "Hello";  // points at the literal's static array
    return text;
    // The code above can be reduced to:
    // return "Hello";
}

At run time, when control reaches the opening brace, the pointer text is created on the stack. The constant array of 6 elements (including the null terminator '\0' at the end) already exists in the static memory area. On the next line (the one that initializes text), the starting address of the 6-element array is assigned to text. On the line after that (return text;), the function returns the value of text. At the closing brace, text disappears from the stack, but the array is still in the static memory area.

You are not strictly required to make the return type const. But if you try to change a value in the static array through a non-const char*, the behavior is undefined and the program will typically fail at run time, because the array is constant. So it is always good to make the return type const, to make sure the array cannot be reached through a non-const pointer.

But if the compiler finds that a double-quoted string is explicitly used to initialize an array, it assumes that the programmer is going to handle it (smartly). See the following wrong example:

const char* v()
{
    char text[] = "Hello";  // local array on the stack, copied from the literal
    return text;            // WRONG: returns a pointer to a local array
}

During compilation, the compiler checks the quoted text and stores it as-is in the program so that it can fill the generated array at run time. It also calculates the array size, in this case again 6.

At run time, at the opening brace, the array text[] with 6 elements is created on the stack, but not yet initialized. When control reaches (char text[] = "Hello";), the array is initialized with the text stored in the compiled code. So the array is now on the stack. When control reaches (return text;), the function returns the starting address of the array text. At the closing brace, the array disappears from the stack, so the returned pointer no longer refers to a valid object.

Most standard library functions still take only char * (or const char *) parameters.

The Standard C++ library has a powerful class called string for manipulating text. The internal data structure of a string is a character array. The Standard C++ string class is designed to take care of (and hide) all the low-level manipulations of character arrays that were previously required of the C programmer. Note that std::string is a class:

  • You can implicitly convert a char * into a std::string because the latter has a constructor to do that.
  • You can explicitly convert a std::string into a const char * by using the c_str() method.

The C++ standard library provides a std::string class to manage and represent character sequences. It encapsulates the memory management and is most of the time implemented as a C-string, but that is an implementation detail. It also provides manipulation routines for common tasks.

The std::string type will always be that (it doesn't have a conversion operator to char*, for example, which is why you have the c_str() method), but it can be initialized from or assigned a C-string (char*).

On the other hand, if you have a function that takes a std::string or a const std::string& as a parameter, you can pass a c-string (char*) to that function and the compiler will construct a std::string in place for you. That would be a differing interpretation according to context, as you put it.

As often as possible it should mean std::string (or an alternative such as wxString, QString, etc., if you're using a framework that supplies one). Sometimes you have no real choice but to use a NUL-terminated byte sequence, but you generally want to avoid it when possible.

Ultimately, there simply is no clear, unambiguous terminology. Such is life.

To use the proper wording (as found in the C++ language standard): a string is one of the varieties of std::basic_string (including std::string) from section 21.3 "String classes" (as in C++0x N3092), while the argument of ifstream's constructor is an NTBS (null-terminated byte string).

To quote C++0x N3092, 27.9.1.4/2:

basic_filebuf* open(const char* s, ios_base::openmode mode);

...

opens a file, if possible, whose name is the NTBS s
