
How do I handle ServerSocketChannel.accept() IOException: too many open files in NIO?

I'm having a problem with one of my servers. On Friday morning I got the following IOException:

11/Sep/2015 01:51:39,524 [ERROR] [Thread-1] - ServerRunnable: IOException: 
java.io.IOException: Too many open files
    at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method) ~[?:1.7.0_75]
    at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:241) ~[?:1.7.0_75]
    at com.watersprint.deviceapi.server.ServerRunnable.acceptConnection(ServerRunnable.java:162) [rsrc:./:?]
    at com.watersprint.deviceapi.server.ServerRunnable.run(ServerRunnable.java:121) [rsrc:./:?]
    at java.lang.Thread.run(Thread.java:745) [?:1.7.0_75]

Line 162 of the ServerRunnable class is in the method below; it is the ssc.accept() call.

private void acceptConnection(Selector selector, SelectionKey key) {

    try {
        ServerSocketChannel ssc = (ServerSocketChannel) key.channel();
        SocketChannel sc = ssc.accept();
        socketConnectionCount++;

        /*
         * Test to force device error, for debugging purposes
         */
        if (brokenSocket
                && (socketConnectionCount % brokenSocketNumber == 0)) {

            sc.close();

        } else {

            sc.configureBlocking(false);
            log.debug("*************************************************");
            log.debug("Selector Thread: Client accepted from "
                    + sc.getRemoteAddress());

            SelectionKey newKey = sc.register(selector,
                    SelectionKey.OP_READ);
            ClientStateMachine clientState = new ClientStateMachine();
            clientState.setIpAddress(sc.getRemoteAddress().toString());
            clientState.attachSelector(selector);
            clientState.attachSocketChannel(sc);
            newKey.attach(clientState);

        }

    } catch (ClosedChannelException e) {

        log.error("ClosedChannelException: ", e);
        ClientStateMachine clientState = (ClientStateMachine)key.attachment();
        database.insertFailedCommunication(clientState.getDeviceId(),
                clientState.getIpAddress(),
                clientState.getReceivedString(), e.toString());
        key.cancel();

    } catch (IOException e) {
        log.error("IOException: ", e);

    }

}

How should I handle this? Reading up on the error, it appears to come from a Linux setting that limits the number of open files a process can have. Judging from that, and from this question here, it appears that I am not closing sockets correctly (the server currently serves around 50 clients). Is this a situation where I need a timer to monitor open sockets and time them out after an extended period?

I have some cases where a client can connect and then not send any data once the connection is established. I thought I had handled those cases properly.

It's my understanding that a non-blocking NIO server has very long timeouts. Is it possible that, if I've missed cases like this, they might accumulate and result in this error?

This server has been running for three months without any issues. After I go through my code and check for badly handled or missing cases, what's the best way to handle this particular error? Are there other things I should consider that might contribute to this?

Also (maybe this should be another question), I have log4j2 configured to send emails for log levels of error and higher, yet I didn't get an email for this error. Are there any reasons why that might be? It usually works. The error was logged to the log file as expected, but I never got an email about it. I should have gotten plenty, as the error occurred every time a connection was established.

You fix your socket leaks. When you get EOS, or any IOException other than SocketTimeoutException, on a socket, you must close it. In the case of SocketChannels, that means closing the channel. Merely cancelling the key, or ignoring the issue and hoping it will go away, isn't sufficient. The connection has already gone away.
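As a rough illustration of that rule, here is a minimal sketch of a read handler that closes the channel on end of stream or on any IOException. The class and method names (ReadHandler, readOrClose, closeQuietly) are hypothetical and not taken from the question's code; it assumes the key belongs to a non-blocking SocketChannel registered for OP_READ.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.SocketChannel;

// Hypothetical helper, not part of the original ServerRunnable.
final class ReadHandler {

    /**
     * Reads once from the channel behind the key.
     * Returns the number of bytes read, or -1 if the channel was closed
     * because of end of stream or an I/O error.
     */
    static int readOrClose(SelectionKey key, ByteBuffer buffer) {
        SocketChannel sc = (SocketChannel) key.channel();
        try {
            int n = sc.read(buffer);
            if (n == -1) {
                // End of stream: the peer has closed the connection.
                closeQuietly(sc);
            }
            return n;
        } catch (IOException e) {
            // Any I/O error here means the connection is no longer usable.
            closeQuietly(sc);
            return -1;
        }
    }

    /** Closes the channel; closing it also invalidates its selection keys. */
    static void closeQuietly(SocketChannel sc) {
        try {
            sc.close();
        } catch (IOException ignored) {
            // Nothing more can be done for this connection.
        }
    }
}
```

Because closing the channel invalidates its keys, no separate key.cancel() call is needed in this sketch.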

The fact that you find it necessary to count broken socket connections, and to catch ClosedChannelException, already indicates major logic problems in your application. You shouldn't need this. And cancelling the key of a closed channel doesn't provide any kind of solution.
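For comparison, a minimal sketch of an accept path that needs no ClosedChannelException branch: if anything fails after the connection has been accepted, the accepted channel is closed so that its file descriptor is released. AcceptHandler and acceptAndRegister are hypothetical names, and the attachment parameter merely stands in for whatever per-client state the application keeps.

```java
import java.io.IOException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

// Hypothetical helper, not part of the original ServerRunnable.
final class AcceptHandler {

    /**
     * Accepts one pending connection and registers it for reads.
     * Returns the new key, or null if nothing was accepted or setup failed.
     */
    static SelectionKey acceptAndRegister(Selector selector,
                                          SelectionKey serverKey,
                                          Object attachment) {
        ServerSocketChannel ssc = (ServerSocketChannel) serverKey.channel();
        SocketChannel sc = null;
        try {
            sc = ssc.accept();
            if (sc == null) {
                return null; // non-blocking accept with no pending connection
            }
            sc.configureBlocking(false);
            return sc.register(selector, SelectionKey.OP_READ, attachment);
        } catch (IOException e) {
            // Setup failed part-way: close the accepted channel so it does
            // not leak. The listening channel itself stays open.
            if (sc != null) {
                try {
                    sc.close();
                } catch (IOException ignored) {
                    // Nothing more can be done for this connection.
                }
            }
            return null;
        }
    }
}
```

Note that in the stack trace from the question the exception is thrown by accept() itself, so at that point there is no accepted channel to close; the leaked descriptors belong to earlier connections that were never closed.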

"It's my understanding that a non-blocking NIO server has very long timeouts"

The only timeout a non-blocking NIO server has is the timeout you specify to select(). All the timeouts built in to the TCP stack are unaffected by whether you are using NIO or non-blocking mode.
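It follows that if idle clients are to be dropped, the application has to do it itself. One possible approach, sketched below under stated assumptions, is to record the time of the last activity in the key attachment and sweep the selector's keys after each select(timeout) call. IdleReaper, ConnectionState, and the idle limit are hypothetical and are not part of the question's code or of the NIO API.

```java
import java.io.IOException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

// Hypothetical helper illustrating an application-level idle timeout.
final class IdleReaper {

    /** Hypothetical per-connection state stored as the key attachment. */
    static final class ConnectionState {
        long lastActivityMillis; // update this on every successful read

        ConnectionState(long nowMillis) {
            this.lastActivityMillis = nowMillis;
        }
    }

    /**
     * Closes every registered channel whose last activity is older than
     * idleLimitMillis. Returns the number of channels closed.
     * Must be called from the selector thread.
     */
    static int closeIdle(Selector selector, long nowMillis, long idleLimitMillis) {
        int closed = 0;
        for (SelectionKey key : selector.keys()) {
            // Skip cancelled keys and keys without our state, such as
            // the listening channel's key.
            if (!key.isValid() || !(key.attachment() instanceof ConnectionState)) {
                continue;
            }
            ConnectionState state = (ConnectionState) key.attachment();
            if (nowMillis - state.lastActivityMillis > idleLimitMillis) {
                try {
                    key.channel().close(); // also invalidates the key
                    closed++;
                } catch (IOException ignored) {
                    // Nothing more can be done for this connection.
                }
            }
        }
        return closed;
    }
}
```

In a selector loop this might be called as IdleReaper.closeIdle(selector, System.currentTimeMillis(), 60_000) after each selector.select(1000) pass; both numbers are arbitrary illustrative values.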

