How does the Linux kernel handle structure padding on the TCP/IP stack?

I'm somewhat familiar with the kernel's socket buffer system, and I searched a lot but I've been unable to find how the kernel handles the problem of struct padding. How does the kernel pack the bytes of the outgoing TCP/IP packet so that code running on a different platform can make sense of it?

When sending data from one machine to another, I know you can't just send your structs as is. Yet that is what appears to be happening in the Linux kernel code. What am I missing?

Since you did not refer to a specific bit of code, I can only talk about things in general.

"I searched a lot but I've been unable to find how the kernel handles the problem of struct padding."

GCC provides mechanisms to ensure there is no padding between struct members. One such mechanism is the packed attribute. With it you can define a struct and know exactly what its memory layout will be.
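As a rough illustration of the packed attribute, the sketch below compares the same two members with and without it. The struct and function names are made up for this example, and the `__attribute__((packed))` syntax is a GCC/Clang extension rather than standard C.

```c
#include <stddef.h>  /* offsetof, size_t */
#include <stdint.h>  /* uint32_t, uint64_t */

/* Natural layout: the compiler may insert padding after foo. */
struct natural_pair {
    uint32_t foo;
    uint64_t bar;
};

/* Packed layout (GCC/Clang extension): no padding between members. */
struct packed_pair {
    uint32_t foo;
    uint64_t bar;
} __attribute__((packed));

/* Offset of bar in the natural layout; depends on the platform ABI. */
size_t natural_bar_offset(void) { return offsetof(struct natural_pair, bar); }

/* Offset of bar in the packed layout; always right after foo. */
size_t packed_bar_offset(void) { return offsetof(struct packed_pair, bar); }

/* Total size of the packed struct: 4 + 8 bytes, no tail padding. */
size_t packed_pair_size(void) { return sizeof(struct packed_pair); }
```

On x86-64 Linux the natural offset of bar is commonly 8, while the packed offset is 4 regardless of platform. The trade-off is that packed members may be unaligned, so access to them can be slower on some architectures.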

"How does the kernel pack the bytes of the outgoing TCP/IP packet so that code running on a different platform can make sense of it?"

The TCP/IP specifications define the exact byte layout of the TCP and IP headers, independent of any machine or compiler. You can find information on them here.
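To make the idea of a protocol-defined layout concrete, here is a minimal sketch of a struct that mirrors the fixed 20-byte part of an IPv4 header. The name `demo_ipv4_hdr` and its field names are hypothetical and are not the kernel's own definition; the static assertions simply check that the compiler produced the offsets the protocol requires.

```c
#include <assert.h>  /* static_assert (C11) */
#include <stddef.h>  /* offsetof */
#include <stdint.h>  /* fixed-width integer types */

/* Hypothetical struct mirroring the fixed part of an IPv4 header.
 * Multi-byte fields are expected to hold big-endian (network order) values. */
struct demo_ipv4_hdr {
    uint8_t  version_ihl;   /* version in the high 4 bits, header length in the low 4 */
    uint8_t  tos;           /* type of service */
    uint16_t total_length;  /* whole datagram length in bytes */
    uint16_t id;            /* identification */
    uint16_t frag_off;      /* flags and fragment offset */
    uint8_t  ttl;           /* time to live */
    uint8_t  protocol;      /* payload protocol, e.g. 6 for TCP */
    uint16_t checksum;      /* header checksum */
    uint32_t src_addr;      /* source address */
    uint32_t dst_addr;      /* destination address */
} __attribute__((packed));

/* Fail the build if the layout ever drifts from the wire format. */
static_assert(sizeof(struct demo_ipv4_hdr) == 20, "IPv4 fixed header is 20 bytes");
static_assert(offsetof(struct demo_ipv4_hdr, ttl) == 8, "ttl is at byte 8");
static_assert(offsetof(struct demo_ipv4_hdr, src_addr) == 12, "source address is at byte 12");
static_assert(offsetof(struct demo_ipv4_hdr, dst_addr) == 16, "destination address is at byte 16");

/* Extract the IP version from the first header byte. */
unsigned demo_ipv4_version(const struct demo_ipv4_hdr *h) {
    return h->version_ihl >> 4;
}
```

Keeping version and header length in one byte avoids C bit-fields, whose ordering within a byte is implementation-defined.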

"When sending data from one machine to another, I know you can't just send your structs as is."

Well, actually you can; you just have to be really careful about how you do it, which is basically what Linux does. Simply sending a struct through, say, a TCP socket to another program with the same struct definition is dangerous for a few reasons. Take the following struct:

#include <stdint.h>  /* uint32_t, uint64_t */

struct my_struct {
    uint32_t foo;
    uint64_t bar;
};

One reason people say you shouldn't just send a struct is that its memory layout can differ between machines or compilers. For example, on a 32 bit machine there probably won't be any padding, while on a 64 bit machine there might be 32 bits of padding between foo and bar. I use words like "probably" and "might" because the C language does not fix the amount of padding; the compiler chooses it to satisfy the alignment rules of the target platform. Even if both machines are 64 bit, a different compiler could produce a different layout, since compilers may or may not add padding.

There is also the issue of endianness: network byte order is specified to be big endian, so on a little endian machine you should convert multi-byte fields to big endian before sending.

Another issue to consider, which my example avoids, is that certain types have different sizes depending on the compiler and architecture. For example, size_t might be 32 bits on a 32 bit machine and 64 bits on a 64 bit machine, so the same code on a different machine will produce a struct of a different size. If you use types with specific bit widths, as in my example, this is not an issue.
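One common way for a user-space program to sidestep padding and endianness together is to write each field into a byte buffer explicitly, most significant byte first. The sketch below does that for the example struct; the helper names (`put_be32`, `my_struct_encode`, and so on) and the 12-byte wire size are assumptions made for this illustration.

```c
#include <stddef.h>  /* size_t */
#include <stdint.h>  /* fixed-width integer types */

struct my_struct {
    uint32_t foo;
    uint64_t bar;
};

/* Assumed wire format: 4 bytes of foo followed by 8 bytes of bar, big endian. */
enum { MY_STRUCT_WIRE_SIZE = 12 };

/* Write a 32-bit value most significant byte first. */
static void put_be32(uint8_t *out, uint32_t v) {
    for (int i = 0; i < 4; i++)
        out[i] = (uint8_t)(v >> (24 - 8 * i));
}

/* Write a 64-bit value most significant byte first. */
static void put_be64(uint8_t *out, uint64_t v) {
    for (int i = 0; i < 8; i++)
        out[i] = (uint8_t)(v >> (56 - 8 * i));
}

/* Read a big-endian 32-bit value. */
static uint32_t get_be32(const uint8_t *in) {
    uint32_t v = 0;
    for (int i = 0; i < 4; i++)
        v = (v << 8) | in[i];
    return v;
}

/* Read a big-endian 64-bit value. */
static uint64_t get_be64(const uint8_t *in) {
    uint64_t v = 0;
    for (int i = 0; i < 8; i++)
        v = (v << 8) | in[i];
    return v;
}

/* Encode field by field; struct padding never reaches the buffer. */
size_t my_struct_encode(const struct my_struct *s, uint8_t *out) {
    put_be32(out, s->foo);
    put_be64(out + 4, s->bar);
    return MY_STRUCT_WIRE_SIZE;
}

/* Decode from the wire format back into the host's native layout. */
void my_struct_decode(const uint8_t *in, struct my_struct *s) {
    s->foo = get_be32(in);
    s->bar = get_be64(in + 4);
}
```

Because the shifts operate on values rather than on memory, the same code produces identical bytes on little endian and big endian hosts.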

Now if you take care of all of these issues, which the Linux kernel does, then you can just send a struct.
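A minimal sketch of that approach, assuming a GCC/Clang toolchain and a POSIX system: the struct is packed, uses fixed-width types, and stores its fields already converted to network byte order, so its bytes can be handed to a send call unchanged. The names `wire_msg`, `host_to_be64`, and `wire_msg_make` are hypothetical. The kernel's own header structs follow a similar idea and mark such fields with types like `__be16` and `__be32`.

```c
#include <arpa/inet.h>  /* htonl, ntohl (POSIX) */
#include <stdint.h>     /* fixed-width integer types */
#include <string.h>     /* memcpy */

/* Hypothetical wire message: no padding, fields held in network byte order. */
struct wire_msg {
    uint32_t foo_be;
    uint64_t bar_be;
} __attribute__((packed));

/* Return a value whose in-memory bytes are v in big-endian order.
 * Written by hand because there is no standard 64-bit htonl equivalent. */
static uint64_t host_to_be64(uint64_t v) {
    uint8_t b[8];
    for (int i = 0; i < 8; i++)
        b[i] = (uint8_t)(v >> (56 - 8 * i));
    uint64_t out;
    memcpy(&out, b, sizeof out);
    return out;
}

/* Build a message whose raw bytes are ready to be sent as is. */
struct wire_msg wire_msg_make(uint32_t foo, uint64_t bar) {
    struct wire_msg m;
    m.foo_be = htonl(foo);
    m.bar_be = host_to_be64(bar);
    return m;
}
```

The receiver would reverse the conversions (for example with `ntohl`) before using the values. The struct is only safe to send directly because all three concerns are handled: packing, fixed widths, and byte order.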

For more information about why sending a struct over TCP is generally a bad idea, this SO question might be useful. As the top answer currently states, there are three main reasons (the same ones I outlined here), but if you take care of them it is possible. While it is probably not good practice for a user-space program, at some point something has to do this, because things such as the TCP packet have specific field requirements.
