In C, are arrays pointers or used as pointers?

My understanding was that arrays were simply constant pointers to a sequence of values, and that when you declared an array in C, you were declaring a pointer and allocating space for the sequence it points to.

But this confuses me. The following code:

char y[20];
char *z = y;

printf("y size is %zu\n", sizeof(y));
printf("y is %p\n", (void *) y);
printf("z size is %zu\n", sizeof(z));
printf("z is %p\n", (void *) z);

when compiled with Apple GCC, gives the following result:

y size is 20
y is 0x7fff5fbff930
z size is 8
z is 0x7fff5fbff930

(My machine is 64-bit; pointers are 8 bytes long.)

If 'y' is a constant pointer, why does it have a size of 20, like the sequence of values it points to? Is the variable name 'y' replaced by a memory address at compile time whenever it is appropriate? Are arrays, then, some sort of syntactic sugar in C that is just translated to pointer operations when compiled?

Here's the exact language from the C standard (n1256):

6.3.2.1 Lvalues, arrays, and function designators
...
3 Except when it is the operand of the sizeof operator or the unary & operator, or is a string literal used to initialize an array, an expression that has type ''array of type'' is converted to an expression with type ''pointer to type'' that points to the initial element of the array object and is not an lvalue. If the array object has register storage class, the behavior is undefined.

The important thing to remember here is that there is a difference between an object (in C terms, meaning something that takes up memory) and the expression used to refer to that object.

When you declare an array such as

int a[10];

the object designated by the expression a is an array (i.e., a contiguous block of memory large enough to hold 10 int values), and the type of the expression a is "10-element array of int", or int [10]. If the expression a appears in a context other than as the operand of the sizeof or & operators, then its type is implicitly converted to int *, and its value is the address of the first element.

In the case of the sizeof operator, if the operand is an expression of type T [N], then the result is the number of bytes in the array object, not in a pointer to that object: N * sizeof (T).

In the case of the & operator, the value is the address of the array, which is the same as the address of the first element of the array, but the type of the expression is different: given the declaration T a[N];, the type of the expression &a is T (*)[N], or pointer to N-element array of T. The value is the same as a or &a[0], but the difference in types matters. For example, given the code

int a[10];
int *p = a;
int (*ap)[10] = &a;

printf("p = %p, ap = %p\n", (void *) p, (void *) ap);
p++;
ap++;
printf("p = %p, ap = %p\n", (void *) p, (void *) ap);

you'll see output on the order of

p = 0xbff11e58, ap = 0xbff11e58
p = 0xbff11e5c, ap = 0xbff11e80

In other words, advancing p adds sizeof (int) (4) to the original value, whereas advancing ap adds 10 * sizeof (int) (40).

More standard language:

6.5.2.1 Array subscripting

Constraints

1 One of the expressions shall have type ''pointer to object type'', the other expression shall have integer type, and the result has type ''type''.

Semantics

2 A postfix expression followed by an expression in square brackets [] is a subscripted designation of an element of an array object. The definition of the subscript operator [] is that E1[E2] is identical to (*((E1)+(E2))). Because of the conversion rules that apply to the binary + operator, if E1 is an array object (equivalently, a pointer to the initial element of an array object) and E2 is an integer, E1[E2] designates the E2-th element of E1 (counting from zero).

Thus, when you subscript an array expression, what happens under the hood is that the offset from the address of the first element in the array is computed and the result is dereferenced. The expression

a[i] = 10;

is equivalent to

*((a)+(i)) = 10;

which is equivalent to

*((i)+(a)) = 10;

which is equivalent to

 i[a] = 10;

Yes, array subscripting in C is commutative; for the love of God, never do this in production code.

Since array subscripting is defined in terms of pointer operations, you can apply the subscript operator to expressions of pointer type as well as array type:

int *p = malloc(sizeof *p * 10);
int i;
for (i = 0; i < 10; i++)
  p[i] = some_initial_value(); 

Here's a handy table to remember some of these concepts:

Declaration: T a[N];

Expression    Type    Converts to     Value
----------    ----    ------------    -----
         a    T [N]   T *             Address of the first element in a;
                                        identical to writing &a[0]
        &a    T (*)[N]                Address of the array; value is the same
                                        as above, but the type is different
  sizeof a    size_t                  Number of bytes contained in the array
                                        object (N * sizeof T)
        *a    T                       Value at a[0]
      a[i]    T                       Value at a[i]
     &a[i]    T *                     Address of a[i] 

Declaration: T a[N][M];

Expression     Type        Converts to     Value
----------     ----        ------------    -----
          a    T [N][M]    T (*)[M]        Address of the first subarray (&a[0])
         &a    T (*)[N][M]                 Address of the array (same value as
                                             above, but different type)
   sizeof a    size_t                      Number of bytes contained in the
                                             array object (N * M * sizeof T)
         *a    T [M]       T *             Value of a[0], which is the address
                                             of the first element of the first subarray
                                             (same as &a[0][0])
       a[i]    T [M]       T *             Value of a[i], which is the address
                                             of the first element of the i'th subarray
      &a[i]    T (*)[M]                    Address of the i-th subarray; same value as
                                             above, but different type
sizeof a[i]    size_t                      Number of bytes contained in the i'th subarray
                                             object (M * sizeof T)
      *a[i]    T                           Value of the first element of the i'th 
                                             subarray (a[i][0])
    a[i][j]    T                           Value at a[i][j]
   &a[i][j]    T *                         Address of a[i][j]

Declaration: T a[N][M][O];

Expression        Type             Converts to
----------        ----             -----------
         a        T [N][M][O]      T (*)[M][O]
        &a        T (*)[N][M][O]
        *a        T [M][O]         T (*)[O]
      a[i]        T [M][O]         T (*)[O]
     &a[i]        T (*)[M][O]
     *a[i]        T [O]            T *
   a[i][j]        T [O]            T *
  &a[i][j]        T (*)[O]
  *a[i][j]        T 
a[i][j][k]        T

From here, the pattern for higher-dimensional arrays should be clear.

So, in summary: arrays are not pointers. In most contexts, array expressions are converted to pointer types.

Arrays are not pointers, though in most expressions an array name evaluates to a pointer to the first element of the array. So it is very, very easy to use an array name as a pointer. You will often see the term 'decay' used to describe this, as in "the array decayed to a pointer".

One exception is as the operand to the sizeof operator, where the result is the size of the array (in bytes, not elements).

A couple of additional issues related to this:

An array parameter to a function is a fiction: the compiler really passes a plain pointer (this doesn't apply to reference-to-array parameters in C++), so you cannot determine the actual size of an array passed to a function. You must pass that information some other way (for example with an explicit additional parameter, or with a sentinel element, as C strings do).

Also, a common idiom to get the number of elements in an array is to use a macro like:

#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))

This has the problem of accepting either an array name, where it will work, or a pointer, where it will give a nonsense result without warning from the compiler. There exist safer versions of the macro (particularly for C++) that will generate a warning or error when it's used with a pointer instead of an array.


Note: C99 VLAs (variable length arrays) might not follow all of these rules (in particular, they can be passed as parameters with the array size known by the called function). I have little experience with VLAs, and as far as I know they're not widely used. However, I do want to point out that the above discussion might apply differently to VLAs.

sizeof is evaluated at compile time (C99 variable length arrays aside), and the compiler knows whether the operand is an array or a pointer. For arrays it gives the number of bytes occupied by the array. Your array is a char[] (and sizeof(char) is 1), so sizeof happens to give you the number of elements. To get the number of elements in the general case, a common idiom is (here for int):

int y[20];
printf("number of elements in y is %zu\n", sizeof(y) / sizeof(int));

For pointers, sizeof gives the number of bytes occupied by the pointer itself.

In addition to what others have said, perhaps this article can help: http://en.wikipedia.org/wiki/C_%28programming_language%29#Array-pointer_interchangeability

Compare

char hello[] = "hello there";
int i;

and

char* hello = "hello there";
int i;

In the first instance (ignoring alignment), 12 bytes are reserved for hello itself and initialised with the characters of "hello there" plus the terminating null character. In the second, the string "hello there" is stored elsewhere (possibly in static storage), and hello is a pointer initialised to point to it.

However, hello[2] and *(hello + 2) both yield 'l' in either case.

If 'y' is a constant pointer, why does it have a size of 20, like the sequence of values it points to?

Because y is an array, sizeof(y) reports the size of the whole array, 20 bytes. z, by contrast, is a pointer that holds the address of the array, so sizeof(z) will always be 8 on your machine. You need the dereference operator (*) to get at the contents a pointer refers to; & is the address-of operator.

EDIT: A good discussion of the distinction between the two: http://www.cs.cf.ac.uk/Dave/C/node10.html
