
What happens in background or in memory when we cast char * to int *

I am learning about type casting of pointers and came across this program:

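#include <stdio.h>

int main(void) {
  char *p = "01234567890123456789";
  int *pp = (int *)p;   /* reinterpret the string's bytes as int */
  printf("%d", pp[0]);  /* reads the first sizeof(int) bytes */
  return 0;
}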

On executing the above program, the output is 858927408. What is this seemingly random number, and where does it come from? What is happening in the background or in memory?

Edit: And if I write printf("%c", pp[0]); then the output is 0, which is correct, but when I change pp[0] to pp[1] the output is 4. How?

If you express the result in hexadecimal (%x), you can see that:

858927408 = 0x33323130
  • 0x33 is the ASCII code for '3'
  • 0x32 is the ASCII code for '2'
  • 0x31 is the ASCII code for '1'
  • 0x30 is the ASCII code for '0'

So you are simply displaying the memory that stores 0123456... But since your processor is little-endian, you see the codes in reverse order.

In memory, you have (in hexadecimal):

30 31 32 33 34 35 36 37 38   # 0 1 2 3 4 5 6 7 8
39 30 31 32 33 34 35 36 37   # 9 0 1 2 3 4 5 6 7
38 39 00                     # 8 9\0  

In printf("%d", ...), you read the first 4 bytes as a little-endian integer, so it displays the result of 0x33*0x1000000 + 0x32*0x10000 + 0x31*0x100 + 0x30.


With %c, things are different:

If you write printf("%c", pp[0]), you try to print ONE character from 0x33323130, so only 0x30 is retained (in your case; this might be UB in some cases, I'm not sure), and it displays "0", whose ASCII code is 0x30.

If you write printf("%c", pp[1]), you try to print ONE character from 0x37363534, so only 0x34 is retained, and it displays "4", whose ASCII code is 0x34.

  1. If your C implementation uses ASCII, then the first four bytes of the string "01234567890123456789" are 48, 49, 50, and 51 (hexadecimal 0x30, 0x31, 0x32, and 0x33), which are the ASCII codes for the characters “0”, “1”, “2”, and “3”.
  2. (int *)p converts p from char * to int *. Pointer conversions are not fully defined by the C standard. See the notes below. If there is no alignment problem, in most C implementations, the result of this conversion will point to the same place that p points to.
  3. Having set pp to (int *)p, pp[0] fetches the bytes at pp and interprets them as an int. In your implementation, int objects have four bytes, and bytes are ordered with the least significant byte at the lowest address. So the bytes 0x30, 0x31, 0x32, and 0x33 are read from memory and formed into the integer 0x33323130 (decimal 858927408).

Notes About Pointer Conversions and Aliasing

Three things about pointer conversions are relevant here:

  • If the alignment is incorrect, the pointer conversion is not defined by the C standard. In particular, in many C implementations, int objects should be four-byte aligned, whereas char objects may have any alignment. If the address in p is not correctly aligned for an int, then the expression (int *)p could cause the program to crash or could cause undesired results.
  • Even if the alignment is correct, the C standard does not guarantee what the result of converting a general char * to an int * is, except that converting the result back to char * will yield the original pointer (or an equivalent pointer). In many C implementations, this conversion will yield a pointer to the same address, just with a different type.
  • The expression pp[0] accesses the bytes at p as if they were an int. This violates a rule in the C standard, called the aliasing rule, that says an object shall have its value accessed only by an expression using a correct type. There are some details about what types are correct, but an int is never a correct type for a char (or for several char). When this rule is violated, the C standard does not define the behavior.

The last point is important because C implementations may or may not support aliasing. Some C implementations support aliasing (meaning they define the behavior even though the C standard does not) because it was widely used and they want to support existing code that uses it, or because it is needed in certain types of software. Some C implementations do not support aliasing because this allows them to optimize programs better. (If the compiler can assume that an int * never points to a float, then it may be able to avoid reloading float data after assignments through int pointers, since those assignments could not have changed the float data.) Some compilers have switches so you can enable or disable aliasing support.

Since aliasing can break your program, you should understand the rules for it, avoid it when not needed, and know how to enable it when needed. In this case, aliasing is not needed to examine the results of reinterpreting the bytes of a string as an int. A safe way to do this is to copy the bytes into an int, as with:

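#include <stdio.h>
#include <string.h>
int main(void) {
    const char *p = "01234567890123456789";
    int i;
    memcpy(&i, p, sizeof i);   /* copy the first sizeof(int) bytes into i */
    printf("%d\n", i);
}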

This is the result of ((51×256+50)×256+49)×256+48, where 51 is the ASCII code of '3', 50 is the ASCII code of '2', and so on. In fact, pp[0] refers to 4 bytes of memory (int is 4 bytes), and those 4 bytes are "0123". int on your machine is little-endian, so '0' (which is 48 numerically) is the least significant byte and '3' is the most significant byte.

p[1] is one byte after p[0] because p is a pointer to a byte array, but pp[1] is 4 bytes after pp[0] because pp is a pointer to an int array and int is 4 bytes.

858927408, when converted to hex, is 0x33323130.

Apparently your system uses a little-endian format. In this format, the least significant byte of the integer is stored first.

The first 4 bytes of the string, "0123", are taken for the integer. Their ASCII values are 0x30, 0x31, 0x32, and 0x33 respectively. Since this is little-endian, the least significant byte of the integer is 0x30 and the most significant byte is 0x33.

That is how you get 0x33323130 as the output.

Edit: Regarding the additional question from the OP

And if I write printf("%c", pp[0]); then the output is 0, which is correct, but when I change pp[0] to pp[1] the output is 4. How?

When you use %c in printf and pass an integer argument, the integer is converted to a character, i.e., the least significant byte, 0x30, is taken and printed as an ASCII character.

pp[1] is the next integer in the array, which starts 4 bytes later. So the least significant byte in this case is 0x34, and 4 is printed after conversion to ASCII.

The cast just sets the start address of the int object to the beginning of the string. The actual value of the int depends on endianness and sizeof(int).

Since "01234567890123456789" is {0x30, 0x31, 0x32, 0x33, 0x34, 0x35, 0x36, 0x37, 0x38, 0x39 ...} in memory, if the machine is little-endian and sizeof(int) == 4, the value will be 0x33323130. If the machine is big-endian, the value will be 0x30313233.


