Why do these two pointers that should be the same point to different data?

Question

I'm writing a FAT16 driver in GNU C for a hobby operating system, and I have a structure defined as follows:

struct directory_entry {
  uint8_t name[11];
  uint8_t attrib;
  uint8_t name_case;
  uint8_t created_decimal;
  uint16_t created_time;
  uint16_t created_date;
  uint16_t accessed_date;
  uint16_t ignore;
  uint16_t modified_time;
  uint16_t modified_date;
  uint16_t first_cluster;
  uint32_t length;
} __attribute__ ((packed));

I was under the impression that name would be at the same address as the whole struct, and that attrib would be 11 bytes after that. And indeed, (void *)e.name - (void *)&e is 0 and (void *)&e.attrib - (void *)&e is 11, where e is of type struct directory_entry.

In my kernel, a void pointer to e is passed to a function which reads its contents from a disk. After this function returns, *(uint8_t *)&e is 80 and *((uint8_t *)&e + 11) is 8, as expected for what's on the disk. However, e.name[0] and e.attrib are both 0.

What gives here? Am I misunderstanding how __attribute__ ((packed)) works? Other structs with the same attribute work how I expect in other parts of my kernel. I can post a link to the full source if needed.

Edit: The full source is in this GitLab repository, on the stack-overflow branch. The relevant part is lines 34 to 52 of src/kernel/main.c. I'm sure that the data is being populated correctly, as I check *(uint8_t *)&e and *((uint8_t *)&e + 11). When I run it, that part outputs the following:

(void *)e.name - (void *)&e
  => 0
*(uint8_t *)&e
  => 80
e.name[0]
  => 0
(void *)&e.attrib - (void *)&e
  => 11
*((uint8_t *)&e + 11)
  => 8
e.attrib
  => 0

I'm very confused about why e.name[0] would be any different from *(uint8_t *)&e.

Edit 2: I disassembled this part using objdump to see what the difference was in the compiled code, but now I'm even more confused. u8_dec(*(uint8_t *)&e, nbuf); and u8_dec(e.name[0], nbuf); are both compiled to the following (comments mine):

lea   eax, [ebp - 0x30] ;loads address of e from stack into eax
movzx eax, byte [eax]   ;loads byte pointed to by eax into eax, zero-extending
movzx eax, al           ;not sure why this is here, as it's already zero-extended
sub esp, 0x8
push  0x31ce0 ;nbuf
push  eax     ;the byte we loaded
call  0x3162f ;u8_dec
add esp, 0x10

This passes in the first byte of the struct, as expected. I'm sure that u8_dec doesn't modify e, as its first argument is passed by value and not by reference. nbuf is an array declared at file scope, while e is declared at function scope, so it's not that they overlap or anything. Perhaps u8_dec isn't doing its job right? Here's the source of that:

void u8_dec(uint8_t n, uint8_t *b) {
  if (!n) {
    *(uint16_t *)b = '0';
    return;
  }
  bool zero = false;
  for (uint32_t m = 100; m; m /= 10) {
    uint8_t d = (n / m) % 10;
    if (zero)
      *(b++) = d + '0';
    else if (d) {
      zero = true;
      *(b++) = d + '0';
    }
  }
  *b = 0;
}

It's pretty clear now that packed structs do work how I think they do, but I'm still not sure what's causing the problem. I'm passing the same value to a function that should be deterministic, but I'm getting different results on different calls.

Answer 1

My kernel uses 32-bit protected mode segmentation. I had my data segment as 0x0000.0000 - 0x000f.ffff and my stack segment as 0x0003.8000 - 0x0003.ffff, to trigger a general protection fault if the stack ever overflowed, rather than allowing it to overflow into other kernel data and code.

However, when GCC compiles C code, it assumes that the stack and data segments have the same base, as this is most often the case. This caused a problem: when I took the address of the local variable, it was relative to the stack segment (as local variables are on the stack), but when I dereferenced the pointer in the function that was called, it was relative to the data segment.

I have changed my segmentation model so that the stack is in the data segment instead of its own segment, and this has fixed the problem.

Answer 1
4, accepted, 2020-05-25 13:25:32
