
Two string literals have the same pointer value?

When I run this program using MinGW, I get "=" as output:

#include <iostream>

using namespace std;

int main()
{
    char *str1 = "Hello";
    char *str2 = "Hello";

    if (str1 == str2)
        cout << "=";
    else
        cout << "!=";

    return 0;
}

However, logically, it should be !=, because these are pointers and they point to different memory locations. When I run this code in Turbo C++, I get !=.

You are right in that they are pointers. However, whether they point to different locations or not depends on the implementation. It is perfectly valid for a compiler to store a string literal just once and use its address wherever the literal appears in the code.

There is no guarantee that the two pointers point to different memory locations. Maybe it is because of optimizations, or because the compiler follows its own rules; the behavior is implementation-defined.

According to the standard (C++11 §2.14.5 String Literals):

Whether all string literals are distinct (that is, are stored in nonoverlapping objects) is implementation defined.

This is an expected result. You can verify this by looking at the generated assembly. For example, if I build with:

g++ -S ptr.c

then you can see the following in the output file (ptr.s):

        .file   "ptr.c"
        .def    ___main;        .scl    2;      .type   32;     .endef
        .section .rdata,"dr"
LC0:
        .ascii "Hello\0"               ; Note - "Hello" only appears once in
                                       ; this data section!
LC1:
        .ascii "=\0"
LC2:
        .ascii "!=\0"
        .text
.globl _main
        .def    _main;  .scl    2;      .type   32;     .endef
_main:
        [... some stuff deleted for brevity ...]
LCFI5:
        call    ___main
        movl    $LC0, -12(%ebp)        ; This sets str1
        movl    $LC0, -8(%ebp)         ; This sets str2
        movl    -12(%ebp), %eax

I've commented the two key bits: 'Hello' appears only once in the rdata section of the generated code, and you can see str1 and str2 being set towards the end, both pointing to the same label, LC0. This is because 'Hello' is a string literal and, importantly, is constant.

As others have pointed out, this is perfectly legal under the standard.

The type of a string literal like "Hello" is array of const char; therefore, you are pointing two pointers at something that is never allowed to change.

The C++ standard gives compilers the freedom to merge identical constant values together (note that compilers are not required to do so).

Related: because the literal is an array of const char, the declarations in the question are invalid and must be changed to:

const char *str1 = "Hello";
const char *str2 = "Hello";

or, if you prefer:

char const *str1 = "Hello";
char const *str2 = "Hello";

which reads nicely from right to left:

str1 is a pointer to const char.

char *str1 = "Hello"; is allowed by many compilers, but it is a bad idea to actually write it. It is basically only permitted for backward compatibility with C, and writing to *str1 results in undefined behavior. I would recommend finding the compiler setting that warns you when you do this, and if your compiler lacks such a warning, finding a new compiler.

The C++ standard gives compilers and execution environments a ridiculous amount of freedom about where string literals are stored. They could literally use a pointer to the "literals" part of "String literals" as the pointer value for "literals", and it is not unexpected for them to be stored in memory where an attempt to modify them causes a segfault.

Note that char buf1[] = "Hello"; does something fundamentally different from char* str1 = "Hello";: it actually initializes the buffer buf1 with the characters {'H','e','l','l','o','\0'}.
