
Why do we need a null terminator in C++ strings?

I'm new to programming and very new to C++, and I recently came across strings.

Why do we need a null terminator at the end of a character list?

I've read answers saying that, since we might not use all the elements of an array, we need the null terminator so the program knows where the string ends, e.g. char john[100] = "John". But why can't the program just loop through the array to check how many elements are filled and decide the length from that?

And if only four characters of the array are filled for the word "John", what are the other elements filled with?

The other characters in the array char john[100] = "John" would be filled with zeros, each of which is a null character. In general, when you initialize an array and don't provide enough elements to fill it up, the remaining elements are value-initialized, which for arithmetic types such as int and char means they are set to zero:

int foo[3] {5};           // this is {5, 0, 0}
int bar[3] {};            // this is {0, 0, 0}

char john[5] = "John";    // this is {'J', 'o', 'h', 'n', 0}
char peter[5] = "Peter";  // ERROR, initializer string too long
                          // (one null-terminator is mandatory)

Also see cppreference on Array initialization.

To find the length of such a string, we just loop through the characters until we find 0 and exit.

The motivation behind null-terminating strings in C++ is to ensure compatibility with C libraries, which use null-terminated strings. Also see "What's the rationale for null terminated strings?"

Containers like std::string don't rely on a null terminator to find the end of the string and can even store a string containing null characters. This is because they store the size of the string separately. However, the characters of a std::string are null-terminated in practice anyway (since C++11, c_str() and data() must return a null-terminated array), so std::string::c_str() doesn't require a modification of the underlying array.

C++-only libraries will rarely, if ever, pass C-strings between functions.

The existence of a null terminator is a design decision. The purpose it serves is marking the end of the string. There are other ways to do this: for example, in Pascal the first element of a string is its size, so no null terminator is needed.

In the example you give, only the first 5 elements of the array are initialized from the literal; the rest are zero-initialized. Notice that I said 5 elements, not four: the fifth element is the null terminator.

Sure, the program can loop through the string to find out its length, but how will it know when to stop looping?

The nul terminator is what tells you which elements are filled. Everything up to and including the nul terminator has been filled. Everything after it has not.

There is no general notion of which elements of an array have been filled. An array holds some number of elements; its size is determined when it is created. All of its elements have some value initially; there's no way, in general, to determine which ones have been assigned a value and which ones have not from looking at the values of the elements.

Strings are arrays of char plus a coding convention: the "end" of the string is marked by a nul character. Most of the string manipulation functions rely on this convention.

A string literal, such as "John", is an array of char. "John" has 5 elements in the array: 'J', 'o', 'h', 'n', '\0'. The function strcpy, for example, copies characters until it sees that nul terminator:

char result[100]; // no meaningful values here
strcpy(result, "John");

After the call to strcpy, the first five elements of result are 'J', 'o', 'h', 'n', and '\0'. The rest of the array elements have no meaningful values.

I would be remiss if I didn't mention that this style of string comes from C, and is often referred to as C-style strings. C++ supports all of the C string stuff, but it also has a more sophisticated notion of a string, std::string, which is completely different. In general, you should be using C++-style strings and not C-style strings.


 