
Why this simple program in C crashes (array VS pointer)

I have two files:

In file 1.c I have the following array:

char p[] = "abcdefg";

In file 0.c I have the following code:

#include <stdio.h>

extern char *p; /* declared as 'char p[] = "abcdefg";' in 1.c file */

int main()
{
    printf("%c\n", p[3]);   /* crash */
    return 0;
}

And this is the command line:

gcc  -Wall -Wextra     0.c  1.c

I know that extern char *p should have been extern char p[]; but I just want an explanation of why it doesn't work in this particular case, while it works here:

int main()
{
    char a[] = "abcdefg";
    char *p = a;

    printf("%c\n", p[3]);   /* d */
    return 0;
}

Your two examples are not comparable.

In your second example, you have

char a[] = "abcdefg";
char *p = a;

So a is an array, and p is a pointer. Drawn as a picture, it looks like

      +---+---+---+---+---+---+---+---+
   a: | a | b | c | d | e | f | g | \0|
      +---+---+---+---+---+---+---+---+
        ^
        |
   +----|----+
p: |    *    |
   +---------+

And this is all fine; no problems with that code.

But in your first example, in file 1.c you define an array named p:

   +---+---+---+---+---+---+---+---+
p: | a | b | c | d | e | f | g | \0|
   +---+---+---+---+---+---+---+---+

You can name an array "p" if you want (the compiler certainly doesn't care), but then, over in file 0.c, you change your mind and declare that p is a pointer. You also declare (with the "extern" keyword) that p is defined somewhere else. So the compiler takes your word for it, and emits code that goes to location p and expects to find a pointer there. In pictures, it expects to find a box, containing an arrow, that points somewhere else. But what it actually finds there is your string "abcdefg", only it doesn't realize it. It will probably end up trying to interpret the bytes 0x61 0x62 0x63 0x64 (that is, the bytes making up the first part of the string "abcdefg") as a pointer. Obviously that doesn't work.

You can see this clearly if you change the printf call in 0.c to

printf("%p\n", p);

This prints the value of the pointer p as a pointer. (Well, of course, p isn't really a pointer, but you lied to the compiler and told it that it was, so what you'll see is the result when the compiler treats it as a pointer, which is what we're trying to understand here.) On my system this prints

0x67666564636261

That's all 8 bytes of the string "abcdefg\0", in reverse order; the terminating \0 ends up as the most significant byte, so it is not printed as a leading zero. (From this we can infer that I'm on a machine which (a) uses 64-bit pointers and (b) is little-endian.) So if I tried to print

printf("%c\n", p[3]);

it would try to fetch a character from location 0x67666564636264 (that is, 0x67666564636261 + 3) and print it. Now, my machine has a fair amount of memory, but it doesn't have that much, so location 0x67666564636264 doesn't exist, and therefore the program crashes when it tries to fetch from there.

Two more things.

If arrays are not the same as pointers, how did you get away with saying

char *p = a;

in your second example, the one I said was "all fine; no problems"? How can you assign an array on the right-hand side to a pointer on the left? The answer is the famous (infamous?) "equivalence between arrays and pointers in C": what actually happens is just as if you had said

char *p = &a[0];

Whenever you use an array in an expression, what you get is actually a pointer to the array's first element, just as I showed in the first picture in this answer.

And when you asked, "why it doesn't work, while it works here?", there were two other ways you could have asked it. Suppose we have the two functions

void print_char_pointer(char *p)
{
    printf("%s\n", p);
}

void print_char_array(char a[])
{
    printf("%s\n", a);
}

And then suppose we go back to your second example, with

char a[] = "abcdefg";
char *p = a;

and suppose that we call

print_char_pointer(a);

or

print_char_array(p);

If you try it, you'll find that there are no problems with either of them. But how can this be? How can we pass an array to a function that expects a pointer, when we call print_char_pointer(a)? And how can we pass a pointer to a function that expects an array, when we call print_char_array(p)?

Well, remember, whenever we mention an array in an expression, what we get is a pointer to the array's first element. So when we call

print_char_pointer(a);

what we get is just as if we had written

print_char_pointer(&a[0]);

What actually gets passed to the function is a pointer, which is what the function expects, so we're fine.

But what about the other case, where we pass a pointer to a function that's declared as if it accepts an array? Well, there's actually another tenet to the "equivalence between arrays and pointers in C". When we wrote

void print_char_array(char a[])

the compiler treated it just as if we had written

void print_char_array(char *a)

Why would the compiler do such a thing? Because it knows that no array will ever be passed to a function, so no function will ever actually receive an array; the function will receive a pointer instead. So that's the way the compiler treats it.

(And, to be very clear, when we talk about the "equivalence between arrays and pointers in C", we are not saying that pointers and arrays are equivalent, just that there is this special equivalence relationship between them. I've mentioned two of the tenets of that equivalence already. Here are all three of them, for reference: (1) Whenever you mention the name of an array in an expression, what you automatically get is a pointer to the array's first element. (2) Whenever you declare a function that seems to accept an array, what it actually accepts is a pointer. (3) Whenever you use the "array" subscripting operator, [], on a pointer, as in p[i], what you actually get is just as if you had written *(p + i). And, in fact, if you think about it carefully, due to tenet (1), even when you use the array subscripting operator on something that looks like an array, you're actually using it on a pointer. But that's a pretty strange notion, which you don't have to worry about if you don't want to, because it just works.)

Because arrays are not pointers. You tell the program "elsewhere I have a char pointer", but you actually don't have one; you have an array.

An array will decay into a pointer when used in an expression, but that doesn't mean that an array is a pointer. For more info, see "Is an array name a pointer?".

In your second example you have both an array and a pointer, two separate variables, so it is a different case.

Let me explain it in reverse:

In the second case, you have an array and then a pointer which points to that array.

Accessing via the pointer involves an indirect memory access ("print the byte at index 3 from where this pointer points to" vs. "print the byte at index 3 of this array").

In the first case, you have an array somewhere else, but tell the compiler there is a pointer at that place. So it tries to read that pointer and then read the data from where it points to. But there is no pointer; the data itself sits there, so the "pointer" points to "anywhere and nowhere" (at least, quite likely). This constitutes undefined behaviour (often abbreviated as UB).
