
2D array difference in C and C++

I have these two lines of code, which I thought would compile as both C and C++.

int a[3][3] = {{10,20,30},{40,50,60},{70,80,90}};
int *p[3] = {a+0, a+1, a+2};

A C compiler compiles it fine. With the Visual Studio C++ compiler I get this error: error C2440: 'initializing': cannot convert from 'int (*)[3]' to 'int *'

I'm trying to understand the difference between these two cases.

Arrays vs. pointers is probably one of the harder topics of C (and of C++, which inherited it from C). It's actually easy once you have understood the underlying concept, but that concept may be unexpected for beginners. I have never seen anything similar in other programming languages.

Borgleader wrote in a comment that int a[3][3] decays to int*, but that's false! (If it were true, the OP's issue wouldn't exist.)

The truth is:

  1. a is of type int [3][3].
  2. a may decay to int (*)[3] (a pointer to array 3 of int).

Hence, the OP's definition has a type mismatch:

int *p[3] = {a+0, a+1, a+2};

The elements of p have type int*, but a+0 (as well as a+1 and a+2) is an expression of type int (*)[3].

This type mismatch is exactly what clang reports in Bob__'s Live Demo on Wandbox:

prog.c:7:18: warning: incompatible pointer types initializing 'int *' with an expression of type 'int (*)[3]' [-Wincompatible-pointer-types]
    int *p[3] = {a+0, a+1, a+2};
                 ^~~

Bob__ used C with -std=c11 and -pedantic.

I changed it to C++ with -std=c++17 and no -pedantic. C++ reports this as an error because it is, by default, much stricter about type compatibility.


Actually, I was confused by a comment on Quora which had this example.

Considering that C has always been quite tolerant of non-matching types, the example might well have worked. To illustrate this, I made a slightly extended example on godbolt.org:

#include <stdio.h>

int main()
{
  int a[3][3] = {{10,20,30},{40,50,60},{70,80,90}};
  int *p[3] = { a + 0, a + 1, a + 2 };
  int *p1[3] = { *(a + 0), *(a + 1), *(a + 2) };
  int *p2[3] = { a[0], a[1], a[2] };
  return 0;
}

For int *p[3] = { a + 0, a + 1, a + 2 }; it compiled to:

  mov rcx, qword ptr [rbp - 168] # load rcx with address of a
  mov qword ptr [rbp - 80], rcx  # store rcx to p[0] 
  mov rcx, qword ptr [rbp - 168] # load rcx with address of a
  add rcx, 12                    # add 12 to rcx (1 * 3 * sizeof (int))
  mov qword ptr [rbp - 72], rcx  # store rcx to p[1]
  mov rcx, qword ptr [rbp - 168] # load rcx with address of a
  add rcx, 24                    # add 24 to rcx (2 * 3 * sizeof (int))
  mov qword ptr [rbp - 64], rcx  # store rcx to p[2]

For int *p1[3] = { *(a + 0), *(a + 1), *(a + 2) };:

  mov rcx, qword ptr [rbp - 168] # load rcx with address of a
  mov qword ptr [rbp - 112], rcx # store rcx to p1[0]  
  add rcx, 12                    # add 12 to rcx (1 * 3 * sizeof (int))
  mov qword ptr [rbp - 104], rcx # store rcx to p1[1] 
  mov rcx, qword ptr [rbp - 168] # load rcx with address of a
  add rcx, 24                    # add 24 to rcx (2 * 3 * sizeof (int))
  mov qword ptr [rbp - 96], rcx  # store rcx to p1[2]

For int *p2[3] = { a[0], a[1], a[2] };:

  mov rcx, qword ptr [rbp - 168] # load rcx with address of a
  mov qword ptr [rbp - 144], rcx # store rcx to p2[0]  
  add rcx, 12                    # add 12 to rcx (1 * 3 * sizeof (int))
  mov qword ptr [rbp - 136], rcx # store rcx to p2[1] 
  mov rcx, qword ptr [rbp - 168] # load rcx with address of a
  add rcx, 24                    # add 24 to rcx (2 * 3 * sizeof (int))
  mov qword ptr [rbp - 128], rcx # store rcx to p2[2]

Live Demo on godbolt

Without going into too much depth: nearly the same code has been produced for all three lines. (The only differences are the offsets in mov qword ptr [rbp - ...], as the initialized variables are stored at different addresses on the stack, of course.)

It's not that surprising that *(a + 0) and a[0] result in equivalent code because, according to cppreference: Subscript:

By definition, the subscript operator E1[E2] is exactly identical to *((E1)+(E2)).

But even the initialization with pointers of the wrong type didn't make a difference.

IMHO, this is good for two lessons:

  1. Using correct types by introducing the necessary dereference operators prevents warnings (in C) and errors (in C++).

  2. Leaving out the dereference operators (at the cost of warnings) doesn't improve the generated binary code.


In another comment, the OP stated that

To my understanding "a" is a pointer to the whole array...

That's wrong. a is an array. It may decay to a pointer if required.

That's a difference, and it's easy to illustrate by an example:

#include <stdio.h>

void printSizes(int a[3][3], int (*p)[3])
{
  puts("when a and p passed to a function:");
  printf("sizeof a: %u\n", (unsigned)sizeof a);
  printf("sizeof p: %u\n", (unsigned)sizeof p);
}

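int main()
{
  int a[3][3] = {{10,20,30},{40,50,60},{70,80,90}};
  int (*p)[3] = a; /* a decays to int (*)[3], a pointer to its first row */
  printf("sizeof a: %u\n", (unsigned)sizeof a);
  printf("sizeof p: %u\n", (unsigned)sizeof p);
  printSizes(a, p); /* inside the function, both parameters are pointers */
  return 0;
}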

Output:

sizeof a: 36
sizeof p: 8
when a and p passed to a function:
sizeof a: 8
sizeof p: 8

Live Demo on ideone

The confusion about arrays and pointers probably comes from the fact that arrays decay to pointers in most cases. Even the subscript operator (operator[]) is defined for pointers but not for arrays. The sizeof operator is one of the few exceptions and shows the difference.

As arrays cannot be passed as arguments, that difference no longer exists inside the function printSizes(). Even when the parameter is declared with an array type, the compiler uses the pointer type that results from array decay.

