String Literal Differences Between C and C++

As far as I can tell, before C++11, string literals were handled in almost exactly the same way between C and C++.

Now, I acknowledge that there are differences between C and C++ in the handling of wide string literals.

The only differences that I have been able to find are in the initialization of an array by string literal.

char str[3] = "abc"; /* OK in C but not in C++ */
char str[4] = "abc"; /* OK in C and in C++. Terminating zero at str[3] */

And a technical difference that only matters in C++: in C++ "abc" is const char [4] while in C it is char [4]. However, to retain C compatibility, C++ had a special rule that allowed a string literal to be converted to char *. That conversion was deprecated before C++11 and no longer applies from C++11 onward.

And a difference in allowed lengths of literals. However, as a practical matter any compiler that compiles both C and C++ code will not enforce the lower C limit.

I have some interesting links that apply:

Are there any other differences?

Raw strings

A noticeable difference is that C++'s string literals are a superset of C's. Specifically, C++11 added raw strings (not supported in C), defined in §2.14.5 and typically used for text such as HTML and XML, where " is often encountered.

Raw strings allow you to specify your own delimiter (up to 16 characters) in the form:

R"delimiter(char sequence)delimiter"

This is particularly useful for avoiding escape sequences, since with your own delimiter the literal ends only where you say it does. The following two examples show literals containing " and parentheses written without any escapes:

std::cout << R"(a"b"c")";      // empty delimiter
std::cout << '\n';
std::cout << R"aa(a("b"))aa";  // aa delimiter
// a"b"c"
// a("b")

char vs const char

Another difference, pointed out in the comments, is that string literals have type char [n] in C, as specified at §6.4.5/6:

For character string literals, the array elements have type char, and are initialized with the individual bytes of the multibyte character sequence.

while in C++ they have type const char [n] , as defined in §2.14.5/8:

Ordinary string literals and UTF-8 string literals are also referred to as narrow string literals. A narrow string literal has type “array of n const char”, where n is the size of the string as defined below, and has static storage duration (3.7).

This doesn't change the fact that in both standards (at §6.4.5/7 and §2.14.5/13 for C and C++ respectively) attempting to modify a string literal results in undefined behavior.


Unspecified vs Implementation defined

Another subtle difference is that in C, whether the character arrays of string literals are distinct is unspecified, as per §6.4.5/7:

It is unspecified whether these arrays are distinct provided their elements have the appropriate values.

while in C++ this is implementation defined, as per §2.14.5/13:

Whether all string literals are distinct (that is, are stored in nonoverlapping objects) is implementation-defined.

The best way to answer your question is to rewrite it as a Program that can be given to both a "C" and a "C++" Compiler. I will assume you are using GCC, but other (correctly written) Compiler Toolchains should provide similar results.

First I will address each point you posed then I will give a Program that provides the answer (and Proof).

  • As far as I can tell, before C++11, string literals were handled in almost exactly the same way between C and C++.

They can still be handled the same way using various Command Line Parameters; in this example I will use "-fpermissive" (a Cheat). You are better off finding out why you are getting Warnings and writing NEW Code to avoid ANY Warning; only use Command Line Parameter 'cheats' to compile OLD Code.

Write new Code correctly (no cheats and no Warnings, that there be no Errors goes without saying).

  • Now, I acknowledge that there are differences between C and C++ in the handling of wide string literals.

There do not have to be (many) differences, since you can cheat most or all of them away depending on the circumstances. Cheating is wrong: learn to program correctly and follow modern Standards, not the mistakes (or awkwardness) of the past. Things are done a certain way to be helpful both to you and, in some cases, to the Compiler (remember YOU are not the only one who 'sees' your Code).

In this case the Compiler wants enough space allocated to terminate the String with a '\0' (zero byte). That permits the use of print (and some other) Functions without specifying the length of the String.

IF you are simply trying to compile an existing Program you obtained from somewhere and do not want to re-write it, you simply want to compile it and run it, then use the cheats (if you must) to get past the Warnings and force the compilation to an executable.

  • The rest of what you wrote ...

No.

See this example Program. I slightly modified your question to make it into a Program. The result of compiling this Program with a "C" or "C++" Compiler is identical.

Copy-and-Paste the example Program text below to a File called "test.c", then follow the instructions at the start. Simply 'cat' the File so you can backscroll it (and see it) without opening a Text Editor, then Copy-and-Paste each Line beginning with the Compiler Commands (the next three).

Note that, as pointed out in the Comments, running the Line "g++ -S -o test_c++.s test.c" produces an Error (using a modern g++ Compiler) since the container is not long enough to hold the String.

You should be able to read this Program and not actually need to compile it to see the Answer but it will compile and produce the Output for you to examine should you desire to do so.

As you can see, the Variable "str1" is not long enough to hold the String when it is null terminated; that produces an Error on a modern (and correctly written) g++ Compiler.


/* Answer for: http://stackoverflow.com/questions/23145793/string-literal-differences-between-c-and-c
 *
 * cat test.c
 * gcc -S -o test_c.s test.c
 * g++ -S -o test_c++.s test.c
 * g++ -S -fpermissive -o test_c++.s test.c
 *
 */

char str1[3] = "1ab";
char str2[4] = "2ab";
char str3[]  = "3ab";

int main(void){return 0;}


/* Comment: Executing "g++ -S -o test_c++.s test.c" produces this Error:
 *
 * test.c:10:16: error: initializer-string for array of chars is too long [-fpermissive]
 * char str1[3] = "1ab";
 *                ^
 *
 */


/* Resulting Assembly Language Output */

/*      .file   "test.c"
 *      .globl  _str1
 *      .data
 * _str1:
 *      .ascii "1ab"
 *      .globl  _str2
 * _str2:
 *      .ascii "2ab\0"
 *      .globl  _str3
 * _str3:
 *      .ascii "3ab\0"
 *      .def    ___main;    .scl    2;  .type   32; .endef
 *      .text
 *      .globl  _main
 *      .def    _main;  .scl    2;  .type   32; .endef
 * _main:
 * LFB0:
 *      .cfi_startproc
 *      pushl   %ebp
 *      .cfi_def_cfa_offset 8
 *      .cfi_offset 5, -8
 *      movl    %esp, %ebp
 *      .cfi_def_cfa_register 5
 *      andl    $-16, %esp
 *      call    ___main
 *      movl    $0, %eax
 *      leave
 *      .cfi_restore 5
 *      .cfi_def_cfa 4, 4
 *      ret
 *      .cfi_endproc
 * LFE0:
 *      .ident  "GCC: (GNU) 4.8.2"
 *
 */
