Java process on Mac OSX does not release socket

I am experiencing an odd problem every now and then (too often, actually).

I am running a server application, which binds a socket for itself.

But once in a while, the socket is not released. The process appears to die: Eclipse reports that Terminate failed, yet the process disappears from 'ps' and JConsole/JVisualVM. 'lsof' also shows nothing for the port anymore. Still, I get this error when I try to start the server again on the same port:

Caused by: java.net.BindException: Address already in use
    at sun.nio.ch.Net.bind(Native Method)
    at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:126)
    at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)

The problem is worst in my unit tests, which never run to completion, because this is sure to occur after one of the tests (all of which recreate the server).

I am running MacOSX 10.7.3

Java(TM) SE Runtime Environment (build 1.6.0_31-b04-415-11M3635)
Java HotSpot(TM) 64-Bit Server VM (build 20.6-b01-415, mixed mode)

I also have Parallels, and often the problem looks like it's caused by the Parallels network adapter, but I am not sure whether it has anything to do with this problem after all (I have contacted their support, without any help so far).

The only thing that resolves the situation is rebooting OSX.

Any ideas?

-- -

This is the relevant code to open the socket:

channel = (ServerSocketChannel) ServerSocketChannel.open().configureBlocking(false);
// A backlog of 0 makes the implementation use its default value.
channel.socket().bind(addr, 0);

and it is closed by

  channel.close();

But I assume that the process gets stuck here and then Eclipse kills it.

-- -

netstat -an (for port 6007):

tcp4      73      0  127.0.0.1.6007         127.0.0.1.51549        ESTABLISHED
tcp4       0      0  127.0.0.1.51549        127.0.0.1.6007         ESTABLISHED
tcp4      73      0  127.0.0.1.6007         127.0.0.1.51544        CLOSE_WAIT 
tcp4       0      0  127.0.0.1.6007         127.0.0.1.51543        CLOSE_WAIT 
tcp4       0      0  10.37.129.2.6007       *.*                    LISTEN     
tcp4       0      0  10.211.55.2.6007       *.*                    LISTEN     
tcp4       0      0  127.0.0.1.6007         *.*                    LISTEN     
tcp4       0      0  10.50.100.236.6007     *.*                    LISTEN     

-- -

And now, once the socket is opened, I get this exception for every test (the netstat output above is from this situation):

Caused by: java.net.SocketTimeoutException: Read timed out
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.read(SocketInputStream.java:129)
    at java.net.SocketInputStream.read(SocketInputStream.java:182)

-- -

When I stopped the process from Eclipse I got "Terminate failed", but lsof -i TCP:6007 displays nothing and the process is no longer found by 'ps'. The netstat output did not change...

Can I somehow kill the socket without rebooting (that would already help a little bit)?

-- -

UPDATE 5.5.12:

I have now run the tests in the Eclipse debugger. This time the tests got stuck after 18 methods. I suspended the main thread after it had been stuck for around 15 minutes. This is the stack:

Thread [main] (Suspended)   
    FileDispatcher.preClose0(FileDescriptor) line: not available [native method]    
    SocketDispatcher.preClose(FileDescriptor) line: 41  
    ServerSocketChannelImpl.implCloseSelectableChannel() line: 208 [local variables unavailable]    
    ServerSocketChannelImpl(AbstractSelectableChannel).implCloseChannel() line: 201 
    ServerSocketChannelImpl(AbstractInterruptibleChannel).close() line: 97  
...

-- -

Hmm, it looks like the process is not killed after all, and it does not die from kill -9 either (I noticed that process 712 and probably also 710 are the TestNG processes):

$ kill -9 712
$ ps xa | grep java
  700   ??  ?E     0:00.00 (java)
  712   ??  ?E     0:00.00 (java)
  797 s005  S+     0:00.00 grep java

Edit 10.5.12:

?E in the ps output above means that the process is exiting. I could not find any way to kill such a process fully without rebooting. The same issue has been noticed with some other applications. No solutions found:

http://www.google.com/search?q=ps+process+is+exiting+osx

Try closing the socket with http://docs.oracle.com/javase/1.4.2/docs/api/java/net/ServerSocket.html#close() after each test, if you aren't doing so already.
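A minimal sketch of what closing after every test could look like. The class name ServerLifecycle and its methods are made up for illustration and are not from the question; only the ServerSocketChannel calls are JDK API. The loopback address and port 0 (let the OS pick a free port) are assumptions to keep the example self-contained.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.ServerSocketChannel;

// Hypothetical helper: owns one listening channel per test.
public class ServerLifecycle {
    private ServerSocketChannel channel;

    /** Opens a non-blocking listening channel and returns the bound port. */
    public int start(int port) throws IOException {
        channel = ServerSocketChannel.open();
        channel.configureBlocking(false);
        // Port 0 asks the OS for any free port.
        channel.socket().bind(new InetSocketAddress("127.0.0.1", port));
        return channel.socket().getLocalPort();
    }

    /** Closes the channel if it is open; safe to call more than once. */
    public void stop() throws IOException {
        if (channel != null) {
            channel.close();
            channel = null;
        }
    }

    public boolean isOpen() {
        return channel != null && channel.isOpen();
    }
}
```

In a test you would call start() in the setup and stop() in a finally block or in the teardown method (with TestNG, a method annotated @AfterMethod), so the port is released even when the test fails. Note that this does not help if close() itself hangs, as in the stack trace from the question.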

Just a shot in the dark here, but make sure that any threads waiting in Selector.select() have been woken up and have exited.
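One way to do that, as a rough sketch: stop the loop with a flag, call Selector.wakeup(), wait for the thread to exit, and only then close the selector and the channel. The class SelectorShutdown, the running flag, and the timeout parameter are assumptions for illustration; the question does not show how its selector loop is written.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.ClosedSelectorException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;

// Hypothetical server skeleton with an orderly shutdown sequence.
public class SelectorShutdown {
    private final Selector selector;
    private final ServerSocketChannel channel;
    private final Thread loop;
    private volatile boolean running = true;

    public SelectorShutdown() throws IOException {
        selector = Selector.open();
        channel = ServerSocketChannel.open();
        channel.configureBlocking(false);
        channel.socket().bind(new InetSocketAddress("127.0.0.1", 0));
        channel.register(selector, SelectionKey.OP_ACCEPT);
        loop = new Thread(new Runnable() {
            public void run() {
                try {
                    while (running) {
                        selector.select(); // blocks until an event or wakeup()
                        // A real server would accept and handle connections here.
                        selector.selectedKeys().clear();
                    }
                } catch (IOException e) {
                    // Selector failed; let the thread end.
                } catch (ClosedSelectorException e) {
                    // Selector was closed underneath us; let the thread end.
                }
            }
        }, "select-loop");
        loop.start();
    }

    /** Returns true if the select thread exited within the timeout. */
    public boolean shutdown(long timeoutMillis) throws IOException, InterruptedException {
        running = false;        // 1. tell the loop to stop
        selector.wakeup();      // 2. unblock a pending (or the next) select()
        loop.join(timeoutMillis); // 3. wait for the thread to leave select()
        boolean exited = !loop.isAlive();
        selector.close();       // 4. release the selector
        channel.close();        // 5. release the listening socket
        return exited;
    }
}
```

The point of the ordering is that no thread is still blocked in select() on the channel when close() is called.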

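So the problem seems to lie in the Selector implementation in the Mac version of JDK 6. Installing the new Oracle JDK 7u4 fixed the problem, regardless of how the Selector is used.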

I also have Parallels, and often the problem looks like it's caused by the Parallels network adapter....

I'd say that's a fair bet if this problem is not cropping up on other platforms. What have you done to exclude Parallels as the culprit?

If you think the resources are not being released properly, you can try releasing them in a shutdown hook. That way they are released at least when the JVM shuts down (though not if you hard-kill it).

An example of a very basic shutdown hook:

public void shutDownProcedure() {
    // The JVM runs this thread on normal exit, but not after kill -9.
    Runtime.getRuntime().addShutdownHook(new Thread() {
        @Override
        public void run() {
            /* my shutdown code here */
        }
    });
}

This helped me release resources that somehow weren't fully released before. I don't know whether this works for sockets as well, but I think it should.

It also let me see log output I hadn't seen before.
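For a socket, the hook body could simply close the channel. A small sketch under assumptions: the class CloseOnExit and the method closingHook are hypothetical names, and the channel is bound to loopback on an OS-chosen port just to keep the example runnable.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.ServerSocketChannel;

// Hypothetical example: close a listening channel when the JVM shuts down.
public class CloseOnExit {
    /** Builds (but does not register) a hook thread that closes the channel. */
    public static Thread closingHook(final ServerSocketChannel channel) {
        return new Thread(new Runnable() {
            public void run() {
                try {
                    channel.close();
                } catch (IOException e) {
                    // Nothing useful can be done this late in shutdown.
                }
            }
        }, "close-channel-hook");
    }

    public static void main(String[] args) throws IOException {
        ServerSocketChannel channel = ServerSocketChannel.open();
        channel.socket().bind(new InetSocketAddress("127.0.0.1", 0));
        // Runs on normal exit or SIGTERM; does not run after kill -9.
        Runtime.getRuntime().addShutdownHook(closingHook(channel));
        System.out.println("Listening on port " + channel.socket().getLocalPort());
    }
}
```

Bear in mind that in the question the thread is stuck inside close() itself (FileDispatcher.preClose0), so a hook that calls close() would probably hang in the same place on the affected JDK.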
