Java NIO nonblocking read returns timeout (linux kernel issue?)
I am running into an issue similar to the one described here: Java Linux Nonblocking Socket Timeout Behavior
I have an application implemented with Java NIO. It keeps track of a bunch of sockets, and when they are ready for reading, my application reads in a loop (code and some logic removed for brevity):
    if (selkey.isReadable()) {
        int nread;
        while (true) {
            // read the header
            nread = mSocketChannel.read(mHeaderBuffer);
            if (nread == -1)
                return;
            handle_message_header();
            // read the body
            nread = mSocketChannel.read(mPayloadBuffer);
            if (nread == -1)
                return;
            handle_message_body();
        }
    }
But very, very rarely I receive a timeout exception in the first read():
java.io.IOException: Connection timed out
at sun.nio.ch.FileDispatcher.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:21)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:202)
at sun.nio.ch.IOUtil.read(IOUtil.java:175)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:243)
I dug into the JDK sources, and the read0 function simply calls read() on the socket handle. The "Connection timed out" exception is thrown if read() returns -1 and errno == ETIMEDOUT.
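As a minimal sketch of how a read path could treat such an exception: the helper below wraps a single nonblocking read and reports a thrown IOException (for example "Connection timed out") as a failed connection, closing the channel rather than letting the exception escape the select loop. The names ReadFailureSketch, ReadResult, readOnce and failingChannel are hypothetical and not part of the application above; the stub channel only simulates the failure.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ReadableByteChannel;

public class ReadFailureSketch {

    /** Outcome of one read attempt (hypothetical type, for illustration only). */
    enum ReadResult { DATA, NOTHING_YET, END_OF_STREAM, FAILED }

    /** Performs one read and maps every outcome, including an IOException, to a result. */
    static ReadResult readOnce(ReadableByteChannel channel, ByteBuffer buffer) {
        try {
            int n = channel.read(buffer);
            if (n > 0) {
                return ReadResult.DATA;
            }
            if (n == 0) {
                // Nothing available right now (or the buffer has no space left).
                return ReadResult.NOTHING_YET;
            }
            // n == -1: the peer closed the stream in an orderly way.
            closeQuietly(channel);
            return ReadResult.END_OF_STREAM;
        } catch (IOException e) {
            // An error reported by the underlying read(), e.g. "Connection timed out".
            // The connection is treated as dead: close it and report the failure.
            closeQuietly(channel);
            return ReadResult.FAILED;
        }
    }

    static void closeQuietly(ReadableByteChannel channel) {
        try {
            channel.close();
        } catch (IOException ignored) {
            // Nothing useful can be done if close() itself fails.
        }
    }

    /** Stub channel that always fails on read; it stands in for a broken socket. */
    static ReadableByteChannel failingChannel(final String message) {
        return new ReadableByteChannel() {
            private boolean open = true;

            @Override
            public int read(ByteBuffer dst) throws IOException {
                throw new IOException(message);
            }

            @Override
            public boolean isOpen() {
                return open;
            }

            @Override
            public void close() {
                open = false;
            }
        };
    }

    public static void main(String[] args) {
        ReadableByteChannel channel = failingChannel("Connection timed out");
        ReadResult result = readOnce(channel, ByteBuffer.allocate(16));
        System.out.println(result + ", open=" + channel.isOpen());
    }
}
```

In a real selector loop the caller would presumably also cancel the SelectionKey and log the exception message when the result is FAILED.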
We do not use setSoTimeout() or the TCP keepalive option. And since I have seen this only on a client's cluster, I am not able to reproduce it (nor do I have the output of netstat or other tools).
In which cases does the Linux kernel return ETIMEDOUT from a nonblocking read()? Is this a bug or a feature?
More information about the machine on which this appeared:
Linux slave1 2.6.18-164.e15 #1 SMP Thu Sep 3 03:28:30 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux
CentOS 5.4
Thanks, Chris
Edit: According to my log file (and the program flow), the socket was created when the server accepted an incoming connection. Then there was at least one successful recv from that socket, but after that the server twice failed to write. And then I caught the exception when reading. The log file does not have much information, so I am not 100% sure about my analysis so far. I have added lots of debug output to the socket routines, so I am better prepared for the next time.
Thanks for all the helpful comments!
You are reading from a connection that you haven't completed properly. Probably you did the connect in non-blocking mode and either you haven't received the OP_CONNECT event, you haven't called finishConnect(), or it didn't return true.
Your client attempted to connect but received no response and eventually timed out.
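For reference, a minimal sketch of completing a non-blocking connect before reading: register for OP_CONNECT, call finishConnect() when the key becomes connectable, and switch to OP_READ only once it has returned true. The class NonBlockingConnectSketch and the helpers connectAndRead and demo are hypothetical names for illustration; the loopback server in demo exists only to make the example self-contained.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;

public class NonBlockingConnectSketch {

    /** Connects in non-blocking mode, completes the connect, then reads until end of stream. */
    static String connectAndRead(InetSocketAddress address, long timeoutMillis) throws IOException {
        try (Selector selector = Selector.open();
             SocketChannel channel = SocketChannel.open()) {
            channel.configureBlocking(false);
            // connect() usually returns false here: the handshake is still in progress.
            boolean connected = channel.connect(address);
            SelectionKey key = channel.register(selector,
                    connected ? SelectionKey.OP_READ : SelectionKey.OP_CONNECT);

            ByteBuffer buffer = ByteBuffer.allocate(64);
            StringBuilder received = new StringBuilder();
            long deadline = System.currentTimeMillis() + timeoutMillis;

            while (System.currentTimeMillis() < deadline) {
                selector.select(100);
                if (!selector.selectedKeys().remove(key)) {
                    continue; // our key was not selected in this round
                }
                if (key.isConnectable()) {
                    // finishConnect() throws if the connection attempt failed.
                    // Reading is enabled only after it has returned true.
                    if (channel.finishConnect()) {
                        key.interestOps(SelectionKey.OP_READ);
                    }
                    continue;
                }
                if (key.isReadable()) {
                    int n = channel.read(buffer);
                    if (n == -1) {
                        return received.toString(); // peer closed the stream
                    }
                    buffer.flip();
                    received.append(StandardCharsets.US_ASCII.decode(buffer));
                    buffer.clear();
                }
            }
            throw new IOException("no complete response within " + timeoutMillis + " ms");
        }
    }

    /** Starts a throwaway loopback server that sends "hello", then runs the client against it. */
    static String demo() throws Exception {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            InetSocketAddress address = (InetSocketAddress) server.getLocalAddress();
            Thread peer = new Thread(() -> {
                try (SocketChannel accepted = server.accept()) {
                    accepted.write(ByteBuffer.wrap("hello".getBytes(StandardCharsets.US_ASCII)));
                } catch (IOException e) {
                    e.printStackTrace();
                }
            });
            peer.start();
            String result = connectAndRead(address, 5000);
            peer.join();
            return result;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo());
    }
}
```

The point of the sketch is the ordering: the channel is never read while its key is still registered for OP_CONNECT, so a connect that fails or times out surfaces from finishConnect() instead of from a later read().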
EJP, thank you for the correction.