Unix Domain Socket: Using datagram communication between one server process and several client processes

Question

I would like to establish an IPC connection between several processes on Linux. I have never used UNIX sockets before, so I don't know whether this is the correct approach to this problem.

One process receives data (unformatted, binary) and shall distribute this data via a local AF_UNIX socket using the datagram protocol (i.e. similar to UDP with AF_INET). The data sent from this process to a local Unix socket shall be received by multiple clients listening on the same socket. The number of receivers may vary.

To achieve this, the following code is used to create a socket and send data to it (the server process):

struct sockaddr_un ipcFile;
memset(&ipcFile, 0, sizeof(ipcFile));
ipcFile.sun_family = AF_UNIX;
strcpy(ipcFile.sun_path, filename.c_str());

int socket = socket(AF_UNIX, SOCK_DGRAM, 0);
bind(socket, (struct sockaddr *) &ipcFile, sizeof(ipcFile));
...
// buf contains the data, buflen contains the number of bytes
int bytes = write(socket, buf, buflen);
...
close(socket);
unlink(ipcFile.sun_path);

This write returns -1 with errno reporting ENOTCONN ("Transport endpoint is not connected"). I guess this is because no receiving process is currently listening to this local socket, correct?

Then I tried to create a client that connects to this socket.

struct sockaddr_un ipcFile;
memset(&ipcFile, 0, sizeof(ipcFile));
ipcFile.sun_family = AF_UNIX;
strcpy(ipcFile.sun_path, filename.c_str());

int socket = socket(AF_UNIX, SOCK_DGRAM, 0);
bind(socket, (struct sockaddr *) &ipcFile, sizeof(ipcFile));
...
char buf[1024];
int bytes = read(socket, buf, sizeof(buf));
...
close(socket);

Here, the bind fails ("Address already in use"). So, do I need to set some socket options, or is this generally the wrong approach?

Thanks in advance for any comments / solutions!

Answer 1

There's a trick to using Unix domain sockets in datagram configuration. Unlike stream sockets (TCP or Unix domain), datagram sockets need endpoints defined for both the server AND the client. When one establishes a connection in stream sockets, an endpoint for the client is implicitly created by the operating system. Whether this corresponds to an ephemeral TCP/UDP port, or a temporary inode for the Unix domain, the endpoint for the client is created for you. That's why you don't normally need to issue a call to bind() for stream sockets in the client.

The reason you're seeing "Address already in use" is that you're telling the client to bind to the same address as the server. bind() is about asserting external identity. Two sockets can't normally have the same name.

With datagram sockets, specifically Unix domain datagram sockets, the client has to bind() to its own endpoint, then connect() to the server's endpoint. Here is your client code, slightly modified, with some other goodies thrown in:

const char *server_filename = "/tmp/socket-server";
const char *client_filename = "/tmp/socket-client";

struct sockaddr_un server_addr;
struct sockaddr_un client_addr;

// sun_path is small (about 104 bytes, system dependent), so copy at most
// sizeof(sun_path) - 1 bytes; the memset keeps the result NUL-terminated
memset(&server_addr, 0, sizeof(server_addr));
server_addr.sun_family = AF_UNIX;
strncpy(server_addr.sun_path, server_filename, sizeof(server_addr.sun_path) - 1);

memset(&client_addr, 0, sizeof(client_addr));
client_addr.sun_family = AF_UNIX;
strncpy(client_addr.sun_path, client_filename, sizeof(client_addr.sun_path) - 1);

// get socket
int sockfd = socket(AF_UNIX, SOCK_DGRAM, 0);

// bind client to client_filename
bind(sockfd, (struct sockaddr *) &client_addr, sizeof(client_addr));

// connect client to server_filename
connect(sockfd, (struct sockaddr *) &server_addr, sizeof(server_addr));

...
char buf[1024];
int bytes = read(sockfd, buf, sizeof(buf));
...
close(sockfd);
unlink(client_filename); // remove the client's socket file so a later bind() succeeds

At this point your socket should be fully set up. I think theoretically you can use read() / write(), but usually I'd use send() / recv() for datagram sockets.
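As a small illustration of send() / recv() on a datagram socket, here is a minimal sketch. It uses socketpair() instead of named sockets purely to keep the example short, and the function name dgram_boundaries is made up for this example. It shows that each send() arrives as its own datagram, so each recv() returns exactly one message.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Sends two datagrams over a connected AF_UNIX datagram pair and stores the
 * size of each received datagram in sizes[0] and sizes[1].
 * Returns 0 on success, -1 on any failure.
 * (Hypothetical helper, not part of the original answer.) */
int dgram_boundaries(ssize_t sizes[2])
{
    int sv[2];
    char buf[64];

    assert(sizes != NULL);
    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
        return -1;

    /* Two separate send() calls produce two separate datagrams. */
    int ok = send(sv[0], "ab", 2, 0) == 2 && send(sv[0], "cde", 3, 0) == 3;
    if (ok) {
        /* Each recv() returns one whole datagram, never a merged stream. */
        sizes[0] = recv(sv[1], buf, sizeof(buf), 0);
        sizes[1] = recv(sv[1], buf, sizeof(buf), 0);
    }
    close(sv[0]);
    close(sv[1]);
    return ok ? 0 : -1;
}
```

With a stream socket the five bytes could come back in a single read; with a datagram socket the two messages stay separate.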

Normally you'll want to check for errors after each of these calls and issue a perror() when one fails. It will greatly aid you when things go wrong. In general, use a pattern like this:

if ((sockfd = socket(AF_UNIX, SOCK_DGRAM, 0)) < 0) {
    perror("socket failed");
}

This goes for pretty much any C system call.

The best reference for this is Stevens' "Unix Network Programming". In the 3rd edition, section 15.4, pages 415-419 show some examples and list many of the caveats.

By the way, in reference to:

"I guess this is because no receiving process is currently listening to this local socket, correct?"

I think you're right about the ENOTCONN error from write() in the server. A UDP socket would normally not complain because it has no facility to know if the client process is listening. However, Unix domain datagram sockets are different. In fact, the write() will actually block if the client's receive buffer is full rather than drop the packet. This makes Unix domain datagram sockets much superior to UDP for IPC, because UDP will most certainly drop packets when under load, even on localhost. On the other hand, it means you have to be careful with fast writers and slow readers.
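To make the fast-writer / slow-reader point concrete, here is a minimal sketch, assuming Linux behavior. It never reads from the receiving end and sends with MSG_DONTWAIT, so the point where a normal send() would block shows up as EAGAIN instead. The function name count_until_full and the use of socketpair() are choices made for this example only.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <unistd.h>

/* Queues datagrams of payload_size bytes on an AF_UNIX datagram pair whose
 * receiving end is never read. Returns how many datagrams were accepted
 * before send() reported that it would block, or -1 on any other error.
 * (Hypothetical helper; behavior described here is assumed for Linux.) */
int count_until_full(size_t payload_size)
{
    int sv[2];
    char payload[4096] = {0};
    int queued = 0;
    int result = -1;

    assert(payload_size > 0 && payload_size <= sizeof(payload));
    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
        return -1;

    /* The upper bound only guards against looping forever on a system
     * that behaves differently from what this sketch assumes. */
    while (queued < 1000000) {
        /* MSG_DONTWAIT turns the would-be block into an EAGAIN error. */
        if (send(sv[0], payload, payload_size, MSG_DONTWAIT) < 0) {
            if (errno == EAGAIN || errno == EWOULDBLOCK)
                result = queued;
            break;
        }
        queued++;
    }
    close(sv[0]);
    close(sv[1]);
    return result;
}
```

Nothing is dropped here: the sender is simply refused once the queue is full. A blocking sender would stall at that same point until the reader catches up.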

Answer 2

The proximate cause of your error is that write() doesn't know where you want to send the data to. bind() sets the name of your side of the socket, i.e. where the data is coming from. To set the destination side of the socket, you can either use connect(), or you can use sendto() instead of write().
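A minimal sketch of the sendto() variant, assuming Linux. The helper names (make_un_addr, bound_dgram_socket, send_to_path) are invented for this example. The destination is named on every call, so the sending socket needs no connect().

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/un.h>
#include <unistd.h>

/* Fills addr for a filesystem path; returns -1 if the path does not fit. */
static int make_un_addr(struct sockaddr_un *addr, const char *path)
{
    assert(addr != NULL && path != NULL);
    memset(addr, 0, sizeof(*addr));
    addr->sun_family = AF_UNIX;
    int n = snprintf(addr->sun_path, sizeof(addr->sun_path), "%s", path);
    if (n < 0 || (size_t)n >= sizeof(addr->sun_path)) {
        errno = ENAMETOOLONG;
        return -1;
    }
    return 0;
}

/* Creates a datagram socket bound to path; returns the fd or -1. */
int bound_dgram_socket(const char *path)
{
    struct sockaddr_un addr;
    if (make_un_addr(&addr, path) < 0)
        return -1;
    int fd = socket(AF_UNIX, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Sends one datagram to the socket bound at path, naming the
 * destination explicitly instead of relying on connect(). */
ssize_t send_to_path(int fd, const char *path, const void *buf, size_t len)
{
    struct sockaddr_un addr;
    if (make_un_addr(&addr, path) < 0)
        return -1;
    return sendto(fd, buf, len, 0, (struct sockaddr *)&addr, sizeof(addr));
}
```

A plain write() on the same unconnected socket fails with ENOTCONN, which is the error reported in the question.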

The other error ("Address already in use") is because only one process can bind() to an address.

You will need to change your approach to take this into account. Your server will need to listen on a well-known address, set with bind(). Your clients will need to send a message to the server at this address to register their interest in receiving datagrams. The server will receive the registration messages from clients using recvfrom(), and record the address used by each client. When it wants to send a message, it will have to loop over all the clients it knows about, using sendto() to send the message to each one in turn.
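The registration scheme described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the struct client_list, the MAX_CLIENTS limit, and all function names are invented for this example. Each client is assumed to have bound its own path before registering, since the server can only reply to a sender that has a name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/un.h>
#include <unistd.h>

#define MAX_CLIENTS 16   /* arbitrary limit for this sketch */

struct client_list {
    struct sockaddr_un addr[MAX_CLIENTS];
    socklen_t len[MAX_CLIENTS];
    int count;
};

/* Fills addr for a filesystem path (truncated if it does not fit). */
void fill_addr(struct sockaddr_un *addr, const char *path)
{
    assert(addr != NULL && path != NULL);
    memset(addr, 0, sizeof(*addr));
    addr->sun_family = AF_UNIX;
    snprintf(addr->sun_path, sizeof(addr->sun_path), "%s", path);
}

/* Creates a datagram socket bound to path; returns the fd or -1. */
int open_bound(const char *path)
{
    struct sockaddr_un addr;
    fill_addr(&addr, path);
    int fd = socket(AF_UNIX, SOCK_DGRAM, 0);
    if (fd >= 0 && bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(fd);
        fd = -1;
    }
    return fd;
}

/* Receives one registration datagram and records the sender's address.
 * Returns 0 on success, -1 on error or if the sender has no bound name. */
int register_client(int server_fd, struct client_list *cl)
{
    char msg[64];
    struct sockaddr_un from;
    socklen_t from_len = sizeof(from);

    if (cl->count >= MAX_CLIENTS)
        return -1;
    if (recvfrom(server_fd, msg, sizeof(msg), 0,
                 (struct sockaddr *)&from, &from_len) < 0)
        return -1;
    if (from_len <= sizeof(sa_family_t))   /* unbound sender: no reply path */
        return -1;
    cl->addr[cl->count] = from;
    cl->len[cl->count] = from_len;
    cl->count++;
    return 0;
}

/* Sends the same datagram to every registered client, one sendto() each.
 * Returns the number of clients the datagram was sent to. */
int fan_out(int server_fd, const struct client_list *cl,
            const void *buf, size_t len)
{
    int reached = 0;
    for (int i = 0; i < cl->count; i++) {
        if (sendto(server_fd, buf, len, 0,
                   (const struct sockaddr *)&cl->addr[i],
                   cl->len[i]) == (ssize_t)len)
            reached++;
    }
    return reached;
}
```

A real server would also need a way to drop clients that have gone away, for example by removing an entry when sendto() fails for it. That part is left out here.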

Alternatively, you could use local IP multicast instead of UNIX domain sockets (UNIX domain sockets don't support multicast).

Answer 3

If the question was intended to be about broadcasting (as I understand it), then according to unix(4) - UNIX-domain protocol family, broadcasting is not available with UNIX domain sockets:

"The UNIX-domain protocol family does not support broadcast addressing or any form of "wildcard" matching on incoming messages. All addresses are absolute- or relative-pathnames of other UNIX-domain sockets."

Multicast might be an option, but as far as I know it is not available with POSIX, although Linux supports UNIX domain socket multicast.

Also see: Introducing multicast Unix sockets.

Answer 4

This happens when the server or a client dies before it can unlink the file associated with bind(). The stale socket file stays behind, so the next process that tries to bind to that path fails, for example when you run the server again.

Solution: when you want to bind again, check whether the file already exists and, if it does, unlink it. First check for the file with access(2); if it is there, unlink(2) it. Put this piece of code anywhere before the bind() call:

 // access() takes a mode argument; F_OK only tests whether the file exists
 if (access(filename.c_str(), F_OK) == 0)
    unlink(filename.c_str());
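A slightly fuller sketch of the same idea, written as a variation rather than the exact code above: it calls unlink() directly and ignores ENOENT instead of testing with access() first. The function name bind_dgram and its replace_stale flag are made up for this example.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Binds a new datagram socket to path. When replace_stale is nonzero, any
 * existing file at path is unlinked first.
 * Returns the fd, or -1 with errno set. (Hypothetical helper.) */
int bind_dgram(const char *path, int replace_stale)
{
    struct sockaddr_un addr;

    assert(path != NULL);
    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    snprintf(addr.sun_path, sizeof(addr.sun_path), "%s", path);

    /* A missing file (ENOENT) is the normal case and not an error here. */
    if (replace_stale && unlink(path) < 0 && errno != ENOENT)
        return -1;

    int fd = socket(AF_UNIX, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        int saved = errno;   /* close() may overwrite errno */
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}
```

Note that unlinking the path also takes the name away from a process that is still running and bound to it, so this is only safe when you know the previous owner is gone.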

For more reference, read unix(7).

Answer 5

Wouldn't it be easier to use shared memory or named pipes? A socket is a connection between two processes (on the same or a different machine). It isn't a mass communication method.

If you want to give something to multiple clients, you create a server that waits for connections; then all the clients can connect and it gives them the information. You can accept concurrent connections by making the program multi-threaded or by forking processes. The server establishes multiple socket-based connections with multiple clients, rather than having one socket that multiple clients connect to.

Answer 6

You should look into IP multicasting instead of Unix-domain anything. At present you are just trying to write to nowhere. And if you connect to one client you will only be writing to that client.

This stuff doesn't work the way you seem to think it does.

Answer 7

You can solve the bind error with the following code:

int use = 1; // nonzero enables the option
setsockopt(sockfd, SOL_SOCKET, SO_REUSEADDR, &use, sizeof(use));

With the UDP protocol, you must invoke connect() if you want to use write() or send(); otherwise you should use sendto() instead.
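A minimal sketch of the connect() route on a UDP socket, using the loopback interface and kernel-chosen ports. The helper names udp_loopback_socket and connect_and_send are invented for this example.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Creates a UDP socket bound to 127.0.0.1 on a kernel-chosen port and
 * stores the resulting address in *bound. Returns the fd or -1. */
int udp_loopback_socket(struct sockaddr_in *bound)
{
    socklen_t len = sizeof(*bound);

    assert(bound != NULL);
    memset(bound, 0, sizeof(*bound));
    bound->sin_family = AF_INET;
    bound->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    bound->sin_port = 0;   /* 0 asks the kernel for a free port */

    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;
    /* getsockname() reports which port the kernel actually picked. */
    if (bind(fd, (struct sockaddr *)bound, sizeof(*bound)) < 0 ||
        getsockname(fd, (struct sockaddr *)bound, &len) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Fixes the default destination with connect(), then uses plain send(). */
ssize_t connect_and_send(int fd, const struct sockaddr_in *peer,
                         const void *buf, size_t len)
{
    if (connect(fd, (const struct sockaddr *)peer, sizeof(*peer)) < 0)
        return -1;
    return send(fd, buf, len, 0);
}
```

On a datagram socket connect() exchanges no packets. It only records the default peer, which is why write() and send() work afterwards.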

To achieve your requirements, the following pseudo code may be of help:

sockfd = socket(AF_INET, SOCK_DGRAM, 0)
set SO_REUSEADDR with setsockopt
bind()
while (1) {
   recvfrom()
   sendto()
}

7 answers (score and timestamp as listed on the page)

Answer 1: 55, 2012-04-21 16:32:46
Answer 2: 8, 2010-07-25 04:14:38
Answer 3: 5, 2013-03-29 17:26:16
Answer 4: 0, 2016-05-25 06:02:07
Answer 5: -1, 2010-07-24 10:14:35
Answer 6: -2, 2010-07-24 10:05:58
Answer 7: -6, 2010-09-02 06:50:18
