Two string literals have the same pointer value?

When I run this program using MinGW, I get "=" as the output:

#include<iostream>

using namespace std;

int main()
{
    char *str1 = "Hello";
    char *str2 = "Hello";

    if (str1 == str2)
        cout << "=";
    else
        cout << "!=";

    return 0;
}

However, logically it should be "!=", because these are pointers and they point to different memory locations. When I run this code in Turbo C++, I get "!=".

You are right in that they are pointers. However, whether they are pointing to different locations or not depends on the implementation. It is perfectly valid for a compiler to store a string literal just once and use its address wherever it's used in code.

There is no guarantee that the two pointers point to different memory locations. Whether the sharing comes from an optimization or from the compiler's own rules, the behavior is implementation-defined.

According to the standard (C++11 §2.14.5 String Literals):

Whether all string literals are distinct (that is, are stored in nonoverlapping objects) is implementation defined.

This is an expected result. You can verify this by looking at the underlying assembly. For example, if I build with:

g++ -S ptr.c

then you can see the following in the file output (ptr.s):

        .file   "ptr.c"
        .def    ___main;        .scl    2;      .type   32;     .endef
        .section .rdata,"dr"
LC0:
        .ascii "Hello\0"               ; Note - "Hello" only appears once in
                                       ; this data section!
LC1:
        .ascii "=\0"
LC2:
        .ascii "!=\0"
        .text
.globl _main
        .def    _main;  .scl    2;      .type   32;     .endef
_main:
        [... some stuff deleted for brevity ...]
LCFI5:
        call    ___main
        movl    $LC0, -12(%ebp)        ; This sets str1
        movl    $LC0, -8(%ebp)         ; This sets str2
        movl    -12(%ebp), %eax

I've commented the two key bits: 'Hello' appears only once in the rdata section of the underlying code, and you can see str1 and str2 being set towards the end, both pointing to the same label, LC0. This is because 'Hello' is a string literal and, importantly, is constant.

As others have pointed out - this is perfectly legal under the standards.

The type of a string literal like "Hello" is array of const char, so you are directing two pointers at something that is never allowed to change.

The C++ standard gives compilers the freedom to merge identical constant values together (note that compilers are not required to do so).

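Relatedly, the declarations in the question are ill-formed as of C++11 (the implicit conversion from a string literal to char * was deprecated in C++03 and removed in C++11) and should be changed to: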

const char *str1 = "Hello";
const char *str2 = "Hello";

or if you want

char const *str1 = "Hello";
char const *str2 = "Hello";

which reads nicely right-to-left: str1 is a pointer to const char.

char *str1 = "Hello"; is accepted by many compilers, but it is a bad idea to write. It is basically only permitted for backward compatibility with C, and writing through str1 results in undefined behavior. I would recommend finding the compiler setting that warns you when you do this, and if your compiler lacks such a warning, finding a new compiler.

The C++ standard gives compilers and execution environments ridiculous amounts of freedom about where string literals are stored. They could literally use a pointer to the "literal" part of "String literals" as the pointer value for "literal", and storing them in memory where an attempted write segfaults is not unexpected.

Note that char buf1[] = "Hello"; does something fundamentally different from char* str1 = "Hello";: it actually initializes the buffer buf1 with the characters {'H','e','l','l','o','\0'}.
