Unable to understand pointers in C and typecasting

I can't understand why the 3rd and 4th printf calls print 54 and -61. I expected the program to print 0, because a character pointer should only read (sizeof(char) * 8) bits, and 54 in binary is 00000000 00110110.

#include<stdio.h>
void main()
{
      int i=54;
      float a=3.14;
      char *ii,*aa;

      ii=(char *)&i;
      aa=(char *)&a;

      printf("%u\n",ii);
      printf("%u\n",aa);
      printf("%d\n",*ii);
      printf("%d\n",*aa);

}

Edit: I typed %d in the fourth printf by mistake. If I use %f there instead, it prints 0.00000. Why?

Why is the third output 54?

Your third output displays 54 because, on your machine,

int i=54;

is stored in memory like this:

36 00 00 00

Your pointer points here:

36 00 00 00
^^

Thus, when you print that 0x36 as a char (a one-byte integral type), you see 54.

This storage format is called "little endian", and is used on x86 and amd64 processors, which are quite common.

Note that the language does not guarantee that integers are stored this way; you may very well get a different result with a different machine or compiler. Don't depend on it.

What about the float?

The float works similarly, but is much more complicated to show. Again, it's quite machine dependent. For an amd64, if you encode 3.14 as an IEEE single (this is platform dependent), and then store the four bytes backwards (at least, I believe amd64 stores them "little endian", though I'm not sure why, since it's a float.¹), the byte value in the first slot, when looked at as a signed 8-bit two's complement integer (this is also platform dependent), should work out to the value you're seeing.

Last, you say:

i didn't know about little edian. but is that not with float. it is giving 0.000000000 if i use %f in place of %d in fourth (by mistake i typed %d here)

I'm going to assume you mean:

printf("%f\n",*aa);

And that aa is still a char *. This isn't well-formed: for %f, you need to pass a double (or a float, which is promoted to double). However, let's plow on, and attempt to explain this (undefined!) behavior.

Since it's a char *, when you dereference it, on your machine, it'll likely read some one-byte value. 3.14, as a little endian float, is:

c3 f5 48 40
^^

0xc3, as a two's complement signed one-byte integer, is -61, which explains your question. Thus, for your program, *aa is -61. When you pass this to printf, it'll be promoted to an int, because printf is a "varargs" (variable number of arguments) function. You can see this when compiling in some compilers:

prog1.c:14:7: warning: format '%f' expects argument of type 'double', but argument 2 has type 'int' [-Wformat]

Thus, an int will get passed to printf in whatever manner your platform uses. Let's investigate that. For explicitness, I'm compiling the following:

#include<stdio.h>
int main()
{
    int i=54;
    float a=3.14;
    char *ii,*aa;

    ii=(char *)&i;
    aa=(char *)&a;

    printf("%u\n",ii);
    printf("%u\n",aa);
    printf("%d\n",*ii);
    printf("%f\n",*aa);

    return 0;
}

I compile it with:

% gcc -g -o prog1 prog1.c
prog1.c: In function ‘main’:
prog1.c:11:2: warning: format ‘%u’ expects argument of type ‘unsigned int’, but argument 2 has type ‘char *’ [-Wformat]
prog1.c:12:2: warning: format ‘%u’ expects argument of type ‘unsigned int’, but argument 2 has type ‘char *’ [-Wformat]
prog1.c:14:2: warning: format ‘%f’ expects argument of type ‘double’, but argument 2 has type ‘int’ [-Wformat]

(In case it isn't clear: gcc is throwing really good warnings here. It's pointing out undefined behavior in your program, which is to say bugs. You should always fix these. We're going to ignore them to investigate, but note that the compiler can really do whatever it wants at this point, so everything below is anything but guaranteed.)

Then, let's start this in a debugger, and stop on that last printf. For me, that's line 14. Thus:

% gdb prog1
GNU gdb (Gentoo 7.6.2 p1) 7.6.2
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
For bug reporting instructions, please see:
<http://bugs.gentoo.org/>...
Reading symbols from /home/me/code/random/prog1...done.
(gdb) break prog1.c:14
Breakpoint 1 at 0x4005db: file prog1.c, line 14.

Let's run it up to that breakpoint.

(gdb) r
Starting program: /home/me/code/random/prog1 
warning: Could not load shared library symbols for linux-vdso.so.1.
Do you need "set solib-search-path" or "set sysroot"?
4294959628
4294959624
54

Breakpoint 1, main () at prog1.c:14
14      printf("%f\n",*aa);

Now we're stopped on the printf, but what does that mean? Let's look at some assembler!

(gdb) disassemble
Dump of assembler code for function main:
   0x000000000040056c <+0>: push   %rbp
   0x000000000040056d <+1>: mov    %rsp,%rbp
   0x0000000000400570 <+4>: sub    $0x20,%rsp
   0x0000000000400574 <+8>: movl   $0x36,-0x14(%rbp)
   0x000000000040057b <+15>:    mov    0x12f(%rip),%eax        # 0x4006b0
   0x0000000000400581 <+21>:    mov    %eax,-0x18(%rbp)
   0x0000000000400584 <+24>:    lea    -0x14(%rbp),%rax
   0x0000000000400588 <+28>:    mov    %rax,-0x8(%rbp)
   0x000000000040058c <+32>:    lea    -0x18(%rbp),%rax
   0x0000000000400590 <+36>:    mov    %rax,-0x10(%rbp)
   0x0000000000400594 <+40>:    mov    -0x8(%rbp),%rax
   0x0000000000400598 <+44>:    mov    %rax,%rsi
   0x000000000040059b <+47>:    mov    $0x4006a4,%edi
   0x00000000004005a0 <+52>:    mov    $0x0,%eax
   0x00000000004005a5 <+57>:    callq  0x400450 <printf@plt>
   0x00000000004005aa <+62>:    mov    -0x10(%rbp),%rax
   0x00000000004005ae <+66>:    mov    %rax,%rsi
   0x00000000004005b1 <+69>:    mov    $0x4006a4,%edi
   0x00000000004005b6 <+74>:    mov    $0x0,%eax
   0x00000000004005bb <+79>:    callq  0x400450 <printf@plt>
   0x00000000004005c0 <+84>:    mov    -0x8(%rbp),%rax
   0x00000000004005c4 <+88>:    movzbl (%rax),%eax
   0x00000000004005c7 <+91>:    movsbl %al,%eax
   0x00000000004005ca <+94>:    mov    %eax,%esi
   0x00000000004005cc <+96>:    mov    $0x4006a8,%edi
   0x00000000004005d1 <+101>:   mov    $0x0,%eax
   0x00000000004005d6 <+106>:   callq  0x400450 <printf@plt>
=> 0x00000000004005db <+111>:   mov    -0x10(%rbp),%rax
   0x00000000004005df <+115>:   movzbl (%rax),%eax
   0x00000000004005e2 <+118>:   movsbl %al,%eax
   0x00000000004005e5 <+121>:   mov    %eax,%esi
   0x00000000004005e7 <+123>:   mov    $0x4006ac,%edi
   0x00000000004005ec <+128>:   mov    $0x0,%eax
   0x00000000004005f1 <+133>:   callq  0x400450 <printf@plt>
   0x00000000004005f6 <+138>:   mov    $0x0,%eax
   0x00000000004005fb <+143>:   leaveq 
   0x00000000004005fc <+144>:   retq   

That's main, and the arrow (=>) is where we are. The call instruction at 0x00000000004005f1 is the call to your fourth printf, and as you can see, there's some setup required to call it: all those mov instructions. Since they set up the call, and what we're interested in is what gets passed to printf, we need to let them run, stepping the program up to that call instruction. We can do this with another breakpoint:

(gdb) break *0x00000000004005f1
Breakpoint 2 at 0x4005f1: file prog1.c, line 14.
(gdb) continue
Continuing.

Breakpoint 2, 0x00000000004005f1 in main () at prog1.c:14
14      printf("%f\n",*aa);

Now we're at that call instruction. Because I'm on an amd64 chip (an Intel Core i7; these are also sometimes referred to as x86-64) and I'm not running Windows, a function is called by putting the arguments, from left to right, into certain registers. Counting from the right, the first argument is *aa, which, remember, we've established to be -61. We can dump our registers with:

(gdb) info all-registers
rax            0x0  0
rbx            0x0  0
rcx            0x2  2
rdx            0x7ffff7dd7820   140737351874592
rsi            0xffffffc3   4294967235
rdi            0x4006ac 4196012
rbp            0x7fffffffe220   0x7fffffffe220
rsp            0x7fffffffe1f8   0x7fffffffe1f8
r8             0x2  2
r9             0x7ffff7dd4640   140737351861824
r10            0x7fffffffe0d8   140737488347352
r11            0x246    582
r12            0x400480 4195456
r13            0x7fffffffe300   140737488347904

[ snip … ]

ymm0           {v8_float = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v4_double = {0x0, 0x0, 0x0, 0x0}, v32_int8 = {0x0, 0x0, 0x0, 0x0, 0xff, 0x0, 0x0, 0x0, 
    0xff, 0x0, 0x0, 0x0, 0xff, 0x0 <repeats 19 times>}, v16_int16 = {0x0, 0x0, 0xff, 0x0, 0xff, 0x0, 0xff, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, 
  v8_int32 = {0x0, 0xff, 0xff, 0xff, 0x0, 0x0, 0x0, 0x0}, v4_int64 = {0xff00000000, 0xff000000ff, 0x0, 0x0}, v2_int128 = {0x000000ff000000ff000000ff00000000, 
    0x00000000000000000000000000000000}}

Since -61 is an integer, it ends up in an integer register; here, we can see that it's in rsi.² (It's been sign extended, which is why it's 0xffffffc3: -61 in 4 bytes, instead of one.) However, %f, expecting a floating-point value, will most likely read a floating point register, such as ymm0 on my machine. It happens to be zero. That doesn't need to be true, since this is undefined behavior, but it is, and thus we get zero.

¹This isn't one of those things you care about often, except for morbid curiosity.
²The only part I can't explain is why our integer ended up in rsi. I feel like it should have been in rdi. Like I said, morbid curiosity. (Edit: Ugh, curse my curiosity. It ends up in rsi because rsi is used for the second argument, and it's the second argument; rdi holds the first, the format string. Wikipedia has it labelled as "right to left", but that only applies to stuff on the stack: registers are assigned left to right.)
