简体   繁体   English

堵头返回EAGAIN

[英]Blocking socket returns EAGAIN

One of my projects on Linux uses blocking sockets. 我在Linux上的一个项目使用阻塞套接字。 Things happen very serially so non-blocking would just make things more complicated. 事情是非常连续地发生的,因此非阻塞只会使事情变得更加复杂。 Anyway, I am finding that often a recv() call is returning -1 with errno set to EAGAIN . 无论如何,我发现通常recv()调用会返回-1并将errno设置为EAGAIN

The man page only really mentions this happening for non-blocking sockets, which makes sense. man页只真正提到了非阻塞套接字的这种情况,这是有道理的。 With non-blocking, the socket may or may not be available so you might need to try again. 使用非阻塞,套接字可能会或可能不会,因此您可能需要重试。

What would cause it to happen for a blocking socket? 是什么原因导致套接字阻塞? Can I do anything to avoid it? 我可以做些什么来避免它?

At the moment, my code to deal with it looks something like this (I have it throw an exception on error, but beyond that it is a very simple wrapper around recv() ): 此刻,我处理它的代码看起来像这样(我让它在错误时引发异常,但除此之外,它是一个非常简单的环绕recv()包装器):

int ret;
do {
    ret = ::recv(socket, buf, len, flags | MSG_NOSIGNAL);
} while(ret == -1 && errno == EAGAIN);


if(ret == -1) {
    throw socket_error(strerror(errno));
}
return ret;

Is this even correct? 这是正确的吗? The EAGAIN condition gets hit pretty often. EAGAIN条件经常受到打击。

EDIT: some things which I've noticed which may be relevant. 编辑:我注意到的一些事情可能是相关的。

  1. I do set a read timeout on the socket using setsockopts() , but it is set to 30 seconds. 我确实使用setsockopts()在套接字上设置了读取超时,但是将其设置为30秒。 the EAGAIN 's happen way more often than once every 30 secs. EAGAIN发生的频率比每30秒发生一次的频率高。 CORRECTION my debugging was flawed, EAGAIN 's don't happen as often as I thought they did. 更正我的调试有缺陷, EAGAIN的发生频率不像我想象的那么频繁。 Perhaps it is the timeout triggering. 也许是超时触发。

  2. For connecting, I want to be able to have connect timeout, so I temporarily set the socket to non-blocking. 对于连接,我希望能够具有连接超时,因此我将套接字临时设置为非阻塞。 That code looks like this: 该代码如下所示:

     int error = 0; fd_set rset; fd_set wset; int n; const SOCKET sock = m_Socket; // set the socket as nonblocking IO const int flags = fcntl (sock, F_GETFL, 0); fcntl(sock, F_SETFL, flags | O_NONBLOCK); errno = 0; // we connect, but it will return soon n = ::connect(sock, addr, size_addr); if(n < 0) { if (errno != EINPROGRESS) { return -1; } } else if (n == 0) { goto done; } FD_ZERO(&rset); FD_ZERO(&wset); FD_SET(sock, &rset); FD_SET(sock, &wset); struct timeval tval; tval.tv_sec = timeout; tval.tv_usec = 0; // We "select()" until connect() returns its result or timeout n = select(sock + 1, &rset, &wset, 0, timeout ? &tval : 0); if(n == 0) { errno = ETIMEDOUT; return -1; } if (FD_ISSET(sock, &rset) || FD_ISSET(sock, &wset)) { socklen_t len = sizeof(error); if (getsockopt(SOL_SOCKET, SO_ERROR, &error, &len) < 0) { return -1; } } else { return -1; } done: // We change the socket options back to blocking IO if (fcntl(sock, F_SETFL, flags) == -1) { return -1; } return 0; 

The idea is that I set it to non-blocking, attempt a connect and select on the socket so I can enforce a timeout. 我的想法是将其设置为非阻塞,尝试连接并在套接字上进行选择,以便强制超时。 Both the set and restore fcntl() calls return successfully, so the socket should end up in blocking mode again when this function completes. set和restore fcntl()调用均成功返回,因此该函数完成后,套接字应再次以阻塞模式结束。

您可能在套接字上设置了非零的接收超时(通过setsockopt(sock, SOL_SOCKET, SO_RCVTIMEO,...) ),因为这也会导致recv返回EAGAIN

Is it possible that you're using MSG_DONTWAIT is being specified as part of your flags? 您使用的MSG_DONTWAIT是否有可能被指定为标志的一部分? The man page says EAGAIN will occur if no data is available and this flag is specified. man页上说,如果没有可用数据并且指定了此标志,则将发生EAGAIN

If you really want to force a block until the recv() is somewhat successful, you may wish to use the MSG_WAITALL flag. 如果您确实想强制执行一个块,直到recv()取得一定的成功,则可能希望使用MSG_WAITALL标志。

EAGAIN is generated by the OS almost like an "Oops! I'm sorry for disturbing you.". EAGAIN由操作系统生成,就像“糟糕!很抱歉打扰您。”。 In case of this error, you may try reading again, This is not a serious or fatal error. 如果发生此错误,您可以尝试再次阅读,这不是严重或致命的错误。 I have seen these interrupts occur in Linux and LynxOS anywhere from one a day to 100 times a day. 我已经看到这些中断在Linux和LynxOS中发生,每天发生一次,每天发生100次。

我不建议您先尝试此修复程序,但如果您没有所有选择,则始终可以在套接字上select()并保持相当长的超时时间,以强制其等待数据。

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM