
Pointers in C with typecasting

#include<stdio.h> 
int main() 
{ 
    int a; 
    char *x; 
    x = (char *) &a; 
    a = 512; 
    x[0] = 1; 
    x[1] = 2; 
    printf("%d\n",a); 
    return 0; 
}

I'm not able to grasp how the output is 513, or why it is even machine-dependent. I can sense that typecasting is playing a major role, but what is happening behind the scenes? Can someone help me visualise this problem?

The int a is stored in memory as 4 bytes. The number 512 is represented on your machine as:

0 2 0 0

When you assign to x[0] and x[1], it changes this to:

1 2 0 0

which is the number 513.

This is machine-dependent, because the order of bytes in a multi-byte number is not specified by the C language.
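
If you want to see which ordering your own machine uses, here is a minimal sketch (my own addition, not part of the question's program) that checks the byte order at runtime by looking at where the least significant byte of a known value ends up:

#include <stdio.h>

int main(void)
{
    int n = 1;                              /* 0x00000001 */
    unsigned char *p = (unsigned char *) &n;

    /* On a little-endian machine the least significant byte (0x01)
       is stored first; on a big-endian machine it is stored last. */
    if (p[0] == 1)
        printf("little-endian\n");
    else
        printf("big-endian\n");
    return 0;
}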

For simplicity, assume the following:

  • size of int is 4 (in bytes)
  • size of any pointer type is 8
  • size of char is 1 byte

In x = (char *) &a; x now references a as a char; that is, x thinks it is pointing to a char (it has no idea that a is actually an int).

The assignment a = 512; is meant to confuse you. Don't let it.

With x[0] = 1;, since x thinks it is pointing to a char, only the first byte of a is changed.

Likewise, x[1] = 2; changes just the second byte of a.

Note that the values written by x[0] = 1; and x[1] = 2; overwrite the low two bytes of the value stored by a = 512;. The value of a is now 0...0000 0010 0000 0001 (513).

Now, when we print a as an int, all 4 bytes are read back together, as expected.
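
To visualise this, here is a short sketch (not part of the original program, and assuming a 4-byte int) that dumps every byte of a after the assignments; it uses unsigned char to avoid sign issues in the printout:

#include <stdio.h>

int main(void)
{
    int a = 512;
    unsigned char *x = (unsigned char *) &a;

    x[0] = 1;
    x[1] = 2;

    /* Print each byte of a in storage order. */
    for (size_t i = 0; i < sizeof a; i++)
        printf("byte %zu: 0x%02X\n", i, (unsigned) x[i]);

    printf("a = %d\n", a);
    return 0;
}

On a little-endian machine this prints 0x01 0x02 0x00 0x00 followed by 513.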

I'm not able to grasp how the output is 513, or why it is even machine-dependent

The output is implementation-defined. It depends on the order of bytes in the CPU's representation of integers, commonly known as endianness.

I can sense that typecasting is playing a major role

The code reinterprets the value of a, which is an int, as an array of bytes. It modifies the first two bytes, which is guaranteed to be valid, because an int is at least two bytes in size.

Can someone help me visualise this problem?

An int consists of multiple bytes. They can be addressed as one unit that represents an integer, but they can also be addressed as a collection of bytes. The value of an int depends on the bytes that you set, and on the order of those bytes in the CPU's representation of integers.
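
One way to picture this dual view (a sketch of my own, not taken from the question) is a union that overlays an int with an array of bytes, so the same storage can be read either way:

#include <stdio.h>

int main(void)
{
    /* The same bytes, viewed either as one int or as individual bytes. */
    union {
        int whole;
        unsigned char bytes[sizeof(int)];
    } u;

    u.whole = 512;
    u.bytes[0] = 1;
    u.bytes[1] = 2;

    printf("%d\n", u.whole);   /* 513 on a little-endian machine */
    return 0;
}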

It looks like your system stores the least significant byte at the lowest address, so storing 1 and 2 at offsets zero and one produces this layout:

Byte 0 Byte 1 Byte 2 Byte 3
------ ------ ------ ------
     1      2      0      0

The integer value can then be computed as follows:

1 + 2*256 + 0*65536 + 0*16777216
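
The same arithmetic can be written out in code; here is a small sketch (my own, assuming the little-endian layout shown above) that rebuilds the integer from those byte values:

#include <stdio.h>

int main(void)
{
    unsigned char bytes[4] = { 1, 2, 0, 0 };   /* storage order on this machine */
    unsigned int value = 0;

    /* Weight each byte by 256^i, matching 1 + 2*256 + 0*65536 + 0*16777216. */
    for (int i = 0; i < 4; i++)
        value += (unsigned int) bytes[i] << (8 * i);

    printf("%u\n", value);   /* prints 513 */
    return 0;
}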

By taking x, which is a char *, and pointing it at the address of a, which is an int, you can use x to modify the individual bytes that represent a.

The output you're seeing suggests that an int is stored in little-endian format, meaning the least significant byte comes first. This could change, however, if you ran this code on a different system (e.g. a Sun SPARC machine, which is big-endian).

You first set a to 512. In hex, that's 0x200. So the memory for a, assuming a 32-bit int in little-endian format, is laid out as follows:

-----------------------------
| 0x00 | 0x02 | 0x00 | 0x00 |
-----------------------------

Next you set x[0] to 1, which updates the first byte in the representation of a :

-----------------------------
| 0x01 | 0x02 | 0x00 | 0x00 |
-----------------------------

Then you set x[1] to 2, which updates the second byte in the representation of a (in this case leaving it unchanged, since it was already 0x02):

-----------------------------
| 0x01 | 0x02 | 0x00 | 0x00 |
-----------------------------

Now a has a value of 0x201, which in decimal is 513.
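
To confirm this, you could print a in hex as well as decimal; a small sketch (assuming the little-endian, 4-byte layout above):

#include <stdio.h>

int main(void)
{
    int a = 512;
    char *x = (char *) &a;

    x[0] = 1;
    x[1] = 2;

    /* 0x201 in hex is 513 in decimal on a little-endian machine. */
    printf("hex: 0x%X  decimal: %d\n", (unsigned) a, a);
    return 0;
}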

Let me try to break this down for you in addition to the previous answers:

#include<stdio.h> 
int main() 
{ 
    int a;            //declares an integer called a
    char *x;          //declares a pointer to a character called x
    x = (char *) &a;  //points x to the first byte of a
    a = 512;          //writes 512 to the int variable
    x[0] = 1;         //writes 1 to the first byte
    x[1] = 2;         //writes 2 to the second byte
    printf("%d\n",a); //prints the integer
    return 0; 
}

Note that I wrote first byte and second byte. Depending on the byte order of your platform and the size of an integer, you might not get the same results.

Let's look at the memory for a 32-bit (4-byte) integer:

Little endian systems

first byte | second byte | third byte | fourth byte
0x00           0x02        0x00         0x00

Now assigning 1 to the first byte and 2 to the second one leaves us with this:

first byte | second byte | third byte | fourth byte
0x01           0x02        0x00         0x00

Notice that the first byte gets changed to 0x01 while the second was already 0x02. This new number in memory is equivalent to 513 on little-endian systems.

Big endian systems

Let's look at what would happen if you tried this on a big-endian platform:

first byte | second byte | third byte | fourth byte
0x00           0x00        0x02         0x00

This time assigning 1 to the first byte and 2 to the second one leaves us with this:

first byte | second byte | third byte | fourth byte
0x01           0x02        0x02         0x00

This is equivalent to 16,908,800 as an integer.
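
If you want 513 regardless of byte order, one portable approach (a sketch of my own, not from the question) is to compose the value with shifts instead of writing individual bytes through a char pointer:

#include <stdio.h>

int main(void)
{
    /* Build the value arithmetically: the "low" byte is 1 and the next
       byte is 2, whatever order the hardware stores them in. */
    unsigned int a = 0;
    a |= 1u << 0;   /* least significant byte = 1 */
    a |= 2u << 8;   /* next byte = 2 */

    printf("%u\n", a);   /* always prints 513 */
    return 0;
}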
