After a while, write always returns EAGAIN on a connected non-blocking socket

Question

I'm writing a client/server socket program in C on an Ubuntu Linux box. The server side needs to handle many connections, and both the server and the client have a local socket for sending received data to a local process after some processing. The number of messages sent and received is huge (the messages themselves are not very large, at most 1500 bytes). Here is the diagram:
[local process of client] <-> data <-> client <---------> server <-> data <-> [local process of server]

So all the sockets (client_local_socket, client_remote_socket, server_remote_socket, server_local_socket) need to be non-blocking.

When I run the client and server on two computers in a LAN, it works great. But when I move the server program to a Linux server on the Internet (the client connects to the server through a NAT), the client starts out communicating successfully with the server. Both client and server get some EAGAIN errors but recover on the next tries, which as far as I know is perfectly normal for non-blocking sockets. After a while, though (more than 1000 packets sent and received), writing to client_remote_socket fails with error code EAGAIN, does not recover on the next tries, and from then on every write gets this damn EAGAIN. BTW, client_remote_socket has no problem reading and always gets packets from the server. The server has no problem at all, and client_local_socket works great for both writing and reading.

I've used this code to make the sockets non-blocking:

int flags;
if ((flags = fcntl(client_remote_socket, F_GETFL, 0)) < 0)
    flags = 0;
flags = flags | O_NONBLOCK;
fcntl(client_remote_socket, F_SETFL, flags);

I have also tried it with:

fcntl(client_remote_socket, F_SETFL, O_NONBLOCK);

but the result is the same.
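For comparison, a slightly more defensive version of the flag-setting code could look like the sketch below. The helper name set_nonblocking is hypothetical (it is not from the original program). Unlike the first snippet it reports a failing F_GETFL instead of silently falling back to flags = 0, and unlike the one-line variant it preserves the other file status flags. It is not expected to change the EAGAIN behaviour described here; it only rules out the flag handling as a source of surprises.

```c
#include <fcntl.h>

/* Hypothetical helper (not part of the original program):
 * switch fd to non-blocking mode while keeping its other status flags.
 * Returns 0 on success, -1 on failure with errno set by fcntl(). */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;              /* report the error instead of assuming 0 */
    if (flags & O_NONBLOCK)
        return 0;               /* already non-blocking, nothing to do */
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}
```

It would be called once per socket, for example set_nonblocking(client_remote_socket), with the return value checked.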

The only setsockopt option I've used is SO_REUSEADDR on the server side; the client makes no setsockopt calls.

It's worth mentioning that I always check the value write returns, and when it is < 0 I check errno and see that it is EAGAIN. As far as I know, write returns EAGAIN when the kernel has no space available in the write buffer, and it makes no sense to me that the kernel would have no memory left on a laptop with 4 GB of RAM. And again, everything works great when I run both client and server on a LAN. When this happens on the client, the server shows no sign of a broken client socket, which is consistent, because in the meantime the client can still receive data from the server. I have double-checked the code again and again and debugged it many times, and could not see anything wrong. I also used the select system call to check whether the socket is available for writing, and it always returns 0 once the problem starts. Now I have no clue how to solve this, and any ideas would be much appreciated. Thanks.

Answer 1

I ran into the same problem last week, and after some research I found that it happens because the peer's buffer is full. I tested this case.

When the remote buffer is full, it tells your local stack to stop sending. When data is cleared from the remote buffer (by being read by the remote application), the remote system informs the local system that it can send more data.

This is from Brian White's answer: https://stackoverflow.com/a/14244450/3728361

Answer 1 (2014-07-04 04:11:17)
