Difference between string and char[] types in C++

I know a little C and now I'm taking a look at C++. I'm used to char arrays for dealing with C strings, but when I look at C++ code I see examples using both the string type and char arrays:

#include <iostream>
#include <string>
using namespace std;

int main () {
  string mystr;
  cout << "What's your name? ";
  getline (cin, mystr);
  cout << "Hello " << mystr << ".\n";
  cout << "What is your favorite team? ";
  getline (cin, mystr);
  cout << "I like " << mystr << " too!\n";
  return 0;
}

and

#include <iostream>
using namespace std;

int main () {
  char name[256], title[256];

  cout << "Enter your name: ";
  cin.getline (name,256);

  cout << "Enter your favourite movie: ";
  cin.getline (title,256);

  cout << name << "'s favourite movie is " << title;

  return 0;
}

(both examples from http://www.cplusplus.com)

I suppose this is a widely asked and answered (obvious?) question, but it would be nice if someone could tell me exactly what the difference is between those two ways of dealing with strings in C++ (performance, API integration, where each one is better, ...).

Thank you.

A char array is just that, an array of characters:

  • If allocated on the stack (like in your example), it will always occupy e.g. 256 bytes no matter how long the text it contains is.
  • If allocated on the heap (using malloc() or new char[]), you're responsible for releasing the memory afterwards and you will always have the overhead of a heap allocation.
  • If you copy more text into the array than fits (for a 256-byte array, more than 255 characters plus the terminating \0), it might crash, produce ugly assertion messages or cause unexplainable (mis-)behavior somewhere else in your program.
  • To determine the text's length, the array has to be scanned, character by character, for a \0 character.

A string is a class that contains a char array, but automatically manages it for you. Many string implementations have a small built-in buffer (on the order of 16 characters, so short strings don't fragment the heap) and use the heap for longer strings.

You can access a string's char array like this:

std::string myString = "Hello World";
const char *myStringChars = myString.c_str();

C++ strings can contain embedded \0 characters, know their length without counting, are faster than heap-allocated char arrays for short texts and grow as needed, which protects you from most buffer overruns. Plus they're more readable and easier to use.

However, C++ strings are not (very) suitable for use across DLL boundaries, because this would require every user of such a DLL function to use the exact same compiler and C++ runtime implementation, or risk the string class behaving differently.

Normally, a string class would also release its heap memory on the calling heap, so it will only be able to free memory again if you're using a shared (.dll or .so) version of the runtime.

In short: use C++ strings in all your internal functions and methods. If you ever write a .dll or .so, use C strings in your public (dll/so-exposed) functions.
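A minimal sketch of that advice, assuming a hypothetical exported function named greet: std::string is used internally, while only C types cross the library boundary and the caller owns the output buffer. Platform-specific export attributes (such as __declspec(dllexport)) are left out.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Internal code: free to use std::string.
static std::string make_greeting(const std::string& name) {
    return "Hello " + name + ".";
}

// Hypothetical public function: only C types appear in the signature.
// The caller supplies the buffer, so no memory is allocated in one module
// and freed in another. Returns the number of characters the full text
// needs (excluding '\0'); a result >= out_size means it was truncated.
extern "C" std::size_t greet(const char* name, char* out, std::size_t out_size) {
    const std::string text = make_greeting(name != nullptr ? name : "");
    if (out != nullptr && out_size > 0) {
        const std::size_t n =
            text.size() < out_size - 1 ? text.size() : out_size - 1;
        std::memcpy(out, text.data(), n);
        out[n] = '\0';  // always leave a valid C string behind
    }
    return text.size();
}
```

A caller would pass something like a char buf[64] together with sizeof(buf) and compare the return value against the buffer size to detect truncation.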

Arkaitz is correct that string is a managed type. What this means for you is that you never have to worry about how long the string is, nor do you have to worry about freeing or reallocating the memory of the string.

On the other hand, the char[] notation in the case above has restricted the character buffer to exactly 256 characters. If you try to write more than 256 characters into that buffer, at best you will overwrite other memory that your program "owns". At worst, you will try to overwrite memory that you do not own, and your OS will kill your program on the spot.
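The 256-character limit can be observed safely with the same calls the question uses. In the sketch below (helper names are invented for illustration), istream::getline given the buffer size stops after 255 characters and sets failbit, while std::getline lets the std::string grow to hold the whole line.

```cpp
#include <cstddef>
#include <cstring>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: reads one line into a fixed 256-byte buffer and
// returns how many characters were actually stored.
std::size_t read_into_array(std::istream& in, bool& failed) {
    char name[256];
    in.getline(name, sizeof(name));  // stores at most 255 chars plus '\0'
    failed = in.fail();              // failbit is set when the line did not fit
    return std::strlen(name);
}

// Hypothetical helper: std::string grows to hold the whole line.
std::size_t read_into_string(std::istream& in) {
    std::string name;
    std::getline(in, name);
    return name.size();
}
```

Feeding both helpers a 300-character line from a std::istringstream stores 255 characters (with failed set) in the first case and all 300 in the second.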

Bottom line? Strings are a lot more programmer friendly, char[]s are a lot more efficient for the computer.

Well, the string type is a completely managed class for character strings, while char[] is still what it was in C: a byte array representing a character string.

In terms of API, the C++ standard library is implemented in terms of strings rather than char[], but there are still lots of functions from libc that take char[], so you may need to use it for those. Apart from that, I would always use std::string.

In terms of efficiency, a raw buffer of unmanaged memory will of course almost always be faster for lots of things, but take comparing strings into account, for example: std::string always has the size available to check first, while with char[] you need to compare character by character.
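The comparison point might look like this in code. Whether a given standard library actually checks the sizes before comparing characters is an implementation detail, so treat this as a sketch of the two interfaces rather than a performance claim; the function names are hypothetical.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// std::string carries its length, so == can compare complete contents,
// including any embedded '\0' characters.
bool same_text(const std::string& a, const std::string& b) {
    return a == b;
}

// C strings have no stored length: strcmp walks both arrays character by
// character and stops at the first difference or the first '\0'.
bool same_c_text(const char* a, const char* b) {
    assert(a != nullptr && b != nullptr);
    return std::strcmp(a, b) == 0;
}
```

One visible consequence: two buffers that differ only after an embedded \0 compare equal with strcmp, but unequal as std::string objects built with an explicit length.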

I personally do not see any reason why one would like to use char* or char[] except for compatibility with old code. std::string is no slower than using a c-string, except that it will handle re-allocation for you. You can set its size when you create it, and thus avoid re-allocation if you want. Its indexing operator ([]) provides constant time access (and is in every sense of the word the exact same thing as using a c-string indexer). Using the at method gives you bounds-checked safety as well, something you don't get with c-strings unless you write it yourself. Your compiler will most often optimize out the indexer call in release mode. It is easy to make mistakes with c-strings: things such as delete vs delete[], exception safety, even how to reallocate a c-string.
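A short sketch of the two features mentioned here, reserving capacity up front and bounds-checked access with at(). The helper names repeat and char_at_or are invented for this example.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: builds a string of n copies of c, asking for the
// capacity once so that push_back does not need to reallocate.
std::string repeat(char c, std::size_t n) {
    std::string s;
    s.reserve(n);  // capacity is now at least n
    for (std::size_t i = 0; i < n; ++i) {
        s.push_back(c);
    }
    return s;
}

// Hypothetical helper: bounds-checked read that returns a fallback
// character instead of touching memory outside the string.
char char_at_or(const std::string& s, std::size_t i, char fallback) {
    try {
        return s.at(i);  // at() throws std::out_of_range when i >= size()
    } catch (const std::out_of_range&) {
        return fallback;
    }
}
```

With a plain char array, the equivalent of char_at_or would need the array length passed in separately, since the array itself does not know it.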

And when you have to deal with advanced concepts like COW (copy-on-write) strings, non-COW strings for MT (multithreading) etc., you will need std::string.

If you are worried about copies, as long as you use references, and const references wherever you can, you will not have any overhead due to copies, and it's the same thing as you would be doing with the c-string.
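For instance, a function that only reads a string can take it by const reference, much as a C function would take a const char*. The function names below are hypothetical.

```cpp
#include <cstddef>
#include <string>

// Takes the string by const reference: the caller's object is used
// directly and no copy of the characters is made.
std::size_t count_spaces(const std::string& text) {
    std::size_t n = 0;
    for (char c : text) {
        if (c == ' ') {
            ++n;
        }
    }
    return n;
}

// Reports whether a parameter taken by const reference refers to the
// caller's own object rather than to a copy.
bool is_same_object(const std::string& param, const std::string& original) {
    return &param == &original;
}
```

Passing a string literal instead of an existing std::string still constructs a temporary std::string, so the no-copy benefit applies when the caller already holds one.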

Think of (char *) as string.begin(). The essential difference is that (char *) is an iterator and std::string is a container. If you stick to basic strings, a (char *) will give you what std::string::iterator does. You could use (char *) when you want the benefit of an iterator and also compatibility with C, but that's the exception and not the rule. As always, be careful of iterator invalidation. When people say (char *) isn't safe, this is what they mean. It's as safe as any other C++ iterator.
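To make the iterator analogy concrete, the sketch below passes a pair of char pointers and a pair of std::string iterators to the same standard algorithm. The wrapper function names are assumptions made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// A pair of char pointers forms an iterator range over a C string.
std::size_t count_in_c_string(const char* text, char wanted) {
    assert(text != nullptr);
    const char* first = text;
    const char* last = text + std::strlen(text);  // one past the last character
    return static_cast<std::size_t>(std::count(first, last, wanted));
}

// The same algorithm applied to the container's own iterators.
std::size_t count_in_string(const std::string& text, char wanted) {
    return static_cast<std::size_t>(
        std::count(text.begin(), text.end(), wanted));
}
```

In both cases the range becomes invalid if the underlying storage is freed or reallocated while it is still in use.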

Strings have helper functions and manage char arrays automatically. You can concatenate strings directly, whereas for a char array you would need to copy it into a new, larger array. Strings can also change their length at runtime. A char array is harder to manage than a string, and certain functions may only accept a string as input, requiring you to convert the array to a string. It's better to use strings; they were made so that you don't have to use arrays. If arrays were objectively better we wouldn't have strings.
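The concatenation difference could be sketched as follows. join_strings and join_c_strings are hypothetical names, and the C version assumes the caller provides the destination buffer.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// std::string: operator+ allocates whatever the result needs.
std::string join_strings(const std::string& a, const std::string& b) {
    return a + b;
}

// char arrays: the caller must supply a destination that is large enough.
// Returns false, leaving out untouched, when the result would not fit.
bool join_c_strings(char* out, std::size_t out_size,
                    const char* a, const char* b) {
    if (out == nullptr || a == nullptr || b == nullptr) {
        return false;
    }
    const std::size_t la = std::strlen(a);
    const std::size_t lb = std::strlen(b);
    if (la + lb + 1 > out_size) {  // +1 for the terminating '\0'
        return false;
    }
    std::memcpy(out, a, la);
    std::memcpy(out + la, b, lb + 1);  // also copies b's '\0'
    return true;
}
```

The size check is the part std::string does for you; leaving it out is how the buffer overruns described earlier tend to happen.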

One difference is null termination (\0).

In C and C++, functions that take a char* or char[] receive a pointer to a single char and walk along memory from there until a 0 value is reached (often called the null terminator).

C++ strings can contain embedded \0 characters and know their length without counting.

#include <cstdio>
#include <cstring>
#include <iostream>
#include <string>

using namespace std;

void NullTerminatedString(string str){
   int null_pos = 3;
   str[null_pos] = '\0';   // the '\0' is stored as an ordinary character; the rest of the string is kept
   cout << str << endl << endl << endl;
}

void NullTerminatedChar(char *str){
   int null_pos = 3;
   str[null_pos] = 0;      // the C string now ends here; everything after it is ignored
   cout << str << endl;
}

int main(){
  string str = "Feels Happy";
  printf("string = %s\n", str.c_str());
  printf("strlen = %zu\n", strlen(str.c_str()));   // %zu is the format for size_t
  printf("size = %zu\n", str.size());
  printf("sizeof = %zu\n", sizeof(str)); // size of the std::string object itself; compiler dependent
  NullTerminatedString(str);


  char str1[12] = "Feels Happy";
  printf("char[] = %s\n", str1);
  printf("strlen = %zu\n", strlen(str1));
  printf("sizeof = %zu\n", sizeof(str1));    // size of the whole char array, including '\0'
  NullTerminatedChar(str1);
  return 0;
}

Output:

string = Feels Happy
strlen = 11
size = 11
sizeof = 32  
Fee s Happy


char[] = Feels Happy
strlen = 11
sizeof = 12
Fee
