What is the purpose of "\0hidden" in an AF_UNIX socket path?

I followed a tutorial on how to make two processes on Linux communicate using the Linux Sockets API, and this is the code it showed to make that happen:

Connecting code:

char* socket_path = "\0hidden";
int fd = socket(AF_UNIX, SOCK_STREAM, 0);
struct sockaddr_un addr;
memset(&addr, 0x0, sizeof(addr));
addr.sun_family = AF_UNIX;
*addr.sun_path = '\0';
strncpy(addr.sun_path+1, socket_path+1, sizeof(addr.sun_path)-2);
connect(fd, (struct sockaddr*)&addr, sizeof(addr));

Listening code:

char* socket_path = "\0hidden";
struct sockaddr_un addr;
int fd = socket(AF_UNIX, SOCK_STREAM, 0);
memset(&addr, 0x0, sizeof(addr));
addr.sun_family = AF_UNIX;
*addr.sun_path = '\0';
strncpy(addr.sun_path+1, socket_path+1, sizeof(addr.sun_path)-2);
bind(fd, (struct sockaddr*)&addr, sizeof(addr));
listen(fd, 5);

Basically, I have written a web server for a website in C and a database management system in C++, and I am making them communicate using this mechanism (after a user's browser sends an HTTP request to my web server, which listens for it on an AF_INET family socket, but that's not important here, just some context). The database system is listening with its socket, and the web server connects to it using its own socket. It's been working perfectly fine.

However, I never understood what the purpose of the null byte at the beginning of the socket path is. Like, what the heck does "\0hidden" mean, or what does it do? I read the manpage on sockets; it says something about virtual sockets, but it's too technical for me to get what's going on. I also don't have a clear understanding of the concept of representing sockets as files with file descriptors. I don't understand the role of the strncpy() either. I don't even understand how the web server finds the database system with this code block. Is it because their processes were both started from executables in the same directory, or because the database system is the only process on the entire system listening on an AF_UNIX socket, or what?

If someone could explain this piece of the Linux Sockets API that has been mystifying me for so long, I'd be really grateful. I've googled and looked in multiple places, and everyone simply seems to be using "\0hidden" without ever explaining it, as if it's some basic thing that everyone should know. Like, am I missing some piece of theory here or what? Massive thanks in advance to anybody explaining!

\0 just puts a NUL character into the string. Because a NUL character terminates a string, socket_path looks like an empty string to all C string functions; it is not actually empty, but they stop processing it at the first character.

So in memory, socket_path actually looks like this:

char socket_path[] = { '\0', 'h', 'i', 'd', 'd', 'e', 'n', '\0' };

The trailing NUL is there because every string literal automatically gets a terminating NUL appended.

The line

strncpy(addr.sun_path+1, socket_path+1, sizeof(addr.sun_path)-2);

copies the bytes of socket_path into the socket address structure addr, skipping the first (NUL) byte and stopping at the last one (also NUL). Thus the name of the socket is effectively just the word "hidden".

But because the first byte of addr.sun_path is skipped as well, and memset has already initialized that byte to NUL, the actual path is still \0hidden.
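To make the byte layout concrete, here is a minimal sketch that fills a sockaddr_un the same way the question's code does. The helper name fill_abstract_addr is made up for illustration; everything else is the standard sockets API.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Hypothetical helper: fill addr exactly as the question's code does. */
static void fill_abstract_addr(struct sockaddr_un *addr, const char *socket_path)
{
    memset(addr, 0, sizeof(*addr));   /* every byte of sun_path starts as NUL */
    addr->sun_family = AF_UNIX;
    addr->sun_path[0] = '\0';         /* leading NUL: marks an abstract name on Linux */
    /* socket_path + 1 skips the leading NUL of "\0hidden", so strncpy sees
     * the ordinary C string "hidden" and copies it to sun_path[1..6]. */
    strncpy(addr->sun_path + 1, socket_path + 1, sizeof(addr->sun_path) - 2);
}
```

Calling fill_abstract_addr(&addr, "\0hidden") leaves sun_path holding a NUL, then the six letters of hidden, then only NUL bytes. Note that strlen("\0hidden") is 0 even though the literal occupies 8 bytes, which is why the code has to step past the first byte by hand.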

So why would anyone do that? Probably to hide the socket. Normally, systems show UNIX sockets in the file system as actual path entries, but no file system I'm aware of can handle the \0 character. So if the name starts with a \0 character, it won't appear in the file system. This only works when \0 is the very first character; otherwise the system still treats the name as a path and tries to create that entry. With \0 as the first character, the system does not even try to create it, which means you cannot see the socket by just calling ls in a terminal, and whoever wants to connect to it needs to know its name.
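The difference is easy to observe on Linux. The sketch below is only an illustration: bind_unix and exists_in_fs are hypothetical helper names, and the socket names used in the usage note are arbitrary.

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/stat.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper: bind a new AF_UNIX stream socket.
 * abstract != 0: name goes after a leading NUL byte (Linux abstract namespace).
 * abstract == 0: name is used as an ordinary file system path.
 * Returns the socket fd, or -1 on error. */
static int bind_unix(const char *name, int abstract)
{
    struct sockaddr_un addr;
    size_t len = strlen(name);
    if (len + 1 > sizeof(addr.sun_path))   /* room for leading or trailing NUL */
        return -1;

    int fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    memcpy(addr.sun_path + (abstract ? 1 : 0), name, len);

    /* Either NUL + name (abstract) or name + NUL (path): len + 1 bytes. */
    socklen_t addrlen = (socklen_t)(offsetof(struct sockaddr_un, sun_path) + len + 1);
    if (bind(fd, (struct sockaddr *)&addr, addrlen) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Returns 1 if path exists in the file system, 0 otherwise. */
static int exists_in_fs(const char *path)
{
    struct stat st;
    return lstat(path, &st) == 0;
}
```

After bind_unix("/tmp/demo.sock", 0), exists_in_fs("/tmp/demo.sock") should report an entry that ls can show and that has to be unlinked later. After bind_unix("/tmp/demo2.sock", 1), no such entry should exist, even though the name looks like a path.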

Note that this does not conform to POSIX, as POSIX expects UNIX sockets to always appear in the file system, so only characters that are legal for the file system in use are allowed in a socket name. This will only work on Linux.

This is specific to the Linux kernel implementation of AF_UNIX local sockets. If the character array that gives the socket name is an empty string (that is, it starts with a null byte), then the name doesn't refer to anything in the filesystem namespace; the remaining bytes of the character array are treated as an internal name sitting in the kernel's memory. Note that this name is not null-terminated; all bytes in the character array that are covered by the specified address length are significant, regardless of their value. (Therefore it is a good thing that your example program does a memset of the structure to zero bytes before copying in the name.)
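Because every byte counts, the address length passed to bind matters. The following sketch (bind_abstract is a hypothetical helper, not part of any API) binds the same text either with an exact length or with sizeof(addr) as in the question's code; on Linux these should be two different names.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper: bind a socket in the abstract namespace.
 * exact_len != 0: the address length covers only the leading NUL plus name.
 * exact_len == 0: the address length is sizeof(addr), as in the question's
 *                 code, so the zero padding becomes part of the name.
 * Returns the socket fd, or -1 with errno set. */
static int bind_abstract(const char *name, int exact_len)
{
    struct sockaddr_un addr;
    size_t n = strlen(name);
    if (n + 1 > sizeof(addr.sun_path)) {
        errno = ENAMETOOLONG;
        return -1;
    }

    int fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    memcpy(addr.sun_path + 1, name, n);   /* sun_path[0] stays NUL */

    socklen_t len = exact_len
        ? (socklen_t)(offsetof(struct sockaddr_un, sun_path) + 1 + n)
        : (socklen_t)sizeof(addr);

    if (bind(fd, (struct sockaddr *)&addr, len) < 0) {
        int saved = errno;                /* keep bind's errno across close() */
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}
```

Binding "demo" once with each length convention should succeed both times, while repeating either bind should fail with EADDRINUSE. The practical consequence is that client and server must agree on the convention; the question's code works because both sides pass sizeof(addr).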

This allows applications to have named socket rendezvous points that do not occupy nodes in the filesystem and are therefore more similar to TCP or UDP port numbers (which also don't sit in the file system). These rendezvous points disappear automatically when all sockets referencing them are closed.
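The automatic cleanup can be checked with a small experiment. This is a sketch under the assumption of a Linux system; both function names are invented for the example.

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper: bind an abstract socket named by the bytes of name
 * (exact length, no padding). Returns the fd, or -1 on error. */
static int bind_abstract(const char *name)
{
    struct sockaddr_un addr;
    size_t n = strlen(name);
    if (n + 1 > sizeof(addr.sun_path))
        return -1;

    int fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    memcpy(addr.sun_path + 1, name, n);   /* sun_path[0] stays NUL */

    socklen_t len = (socklen_t)(offsetof(struct sockaddr_un, sun_path) + 1 + n);
    if (bind(fd, (struct sockaddr *)&addr, len) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Returns 1 if the name was taken while a socket held it and free again
 * right after that socket was closed, 0 if not, -1 if the first bind failed. */
static int name_is_released_on_close(const char *name)
{
    int first = bind_abstract(name);
    if (first < 0)
        return -1;

    int while_open = bind_abstract(name);    /* expected to fail: name in use */
    if (while_open >= 0)
        close(while_open);

    close(first);                            /* last reference to the name is gone */

    int after_close = bind_abstract(name);   /* expected to succeed, no unlink needed */
    if (after_close >= 0)
        close(after_close);

    return while_open < 0 && after_close >= 0;
}
```

With a path-based socket the second successful bind would require an unlink of the stale file first; here closing the descriptor is enough.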

Nodes in the file system have some disadvantages. Creating and accessing them requires a storage device. To avoid that, they can be created in a temporary filesystem that exists in RAM, like tmpfs in Linux; but tmpfs entries are almost certainly slower to access and take more RAM than a specialized entry in the AF_UNIX implementation. Sockets that are needed only temporarily (e.g. while an application is running) may stay around if the application crashes, needing external intervention to clean them up.

hidden is probably not a good name for a socket; programs should take advantage of the space and use something quasi-guaranteed not to clash with anyone else. The name allows over 100 characters, so it's probably a good idea to use some sort of UUID string.

The Linux Programmer's Manual man page calls this kind of address "abstract". It is distinct and different from "unnamed".

Any standard AF_UNIX implementation provides "unnamed" sockets, which can be created in two ways: any AF_UNIX socket that has been created with socket but not given an address with bind is unnamed; and the pair of sockets created by socketpair are unnamed.
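On Linux, unix(7) says that an unnamed socket's address, as returned by getsockname, has a length of sizeof(sa_family_t), meaning there is no sun_path content at all. A minimal sketch to observe this for both ways of creating an unnamed socket follows; the function names are made up for the example.

```c
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Returns the address length getsockname() reports for fd, or -1 on error. */
static long reported_addr_len(int fd)
{
    struct sockaddr_un addr;
    socklen_t len = sizeof(addr);
    if (getsockname(fd, (struct sockaddr *)&addr, &len) < 0)
        return -1;
    return (long)len;
}

/* Address length of a socket created with socket() and never bound. */
static long unbound_socket_addr_len(void)
{
    int fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    long len = reported_addr_len(fd);
    close(fd);
    return len;
}

/* Address length of one end of a socketpair(). */
static long socketpair_addr_len(void)
{
    int fds[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) < 0)
        return -1;
    long len = reported_addr_len(fds[0]);
    close(fds[0]);
    close(fds[1]);
    return len;
}
```

An abstract socket, by contrast, reports a longer address whose sun_path starts with a NUL byte, which is how the two kinds can be told apart.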

For more information, see

man 7 unix

on a GNU/Linux distro that has the Linux Man Pages installed.

Notice: the technical posts on this site are licensed under CC BY-SA 4.0. If you republish them, please credit this site's URL or the original address. For any questions, contact: yoyou2525@163.com.