
Java process on Mac OSX does not release socket

I am experiencing an odd problem every now and then (too often actually).

I am running a server application, which is binding a socket for itself.

But once in a while, the socket is not released. Eclipse reports "Terminate failed", yet the process appears to die: it disappears from 'ps' and JConsole/JVisualVM, and 'lsof' no longer shows anything for the port. Still, I get this error when I try to start the server again on the same port:

Caused by: java.net.BindException: Address already in use
    at sun.nio.ch.Net.bind(Native Method)
    at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:126)
    at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)

The problem is worst in my unit tests, which never run to completion: every test recreates the server, so this is certain to occur after one of them.

I am running MacOSX 10.7.3

Java(TM) SE Runtime Environment (build 1.6.0_31-b04-415-11M3635) Java HotSpot(TM) 64-Bit Server VM (build 20.6-b01-415, mixed mode)

I have also Parallels, and often the problem looks like it's caused by the Parallels network adapter, but I am not sure if it has anything to do with this problem after all (I have contacted their support without any help so far).

The only thing that helps to resolve the situation is to reboot OSX.

Any ideas?

--

This is the relevant code to open the socket:

channel = (ServerSocketChannel) ServerSocketChannel.open().configureBlocking(false);
channel.socket().bind(addr, 0);

and it is closed by

channel.close();

But I assume that the process gets stuck here and then Eclipse kills it.
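For comparison, here is a minimal, self-contained sketch of the same open/bind/close cycle that each test goes through. The class and method names (`BindCloseSketch`, `open`, `bindTwice`) are made up for illustration, and the loopback address with an ephemeral port is an assumption standing in for `addr`. On a healthy system the second bind to the same port should succeed as soon as the first channel has been closed, which is the step that fails in the situation described here.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.SocketAddress;
import java.nio.channels.ServerSocketChannel;

final class BindCloseSketch {

    /** Opens a non-blocking server channel bound to addr (backlog 0 = default). */
    static ServerSocketChannel open(SocketAddress addr) throws IOException {
        ServerSocketChannel channel = ServerSocketChannel.open();
        boolean bound = false;
        try {
            channel.configureBlocking(false);
            channel.socket().bind(addr, 0);
            bound = true;
            return channel;
        } finally {
            if (!bound) {
                channel.close(); // do not leak the descriptor if bind fails
            }
        }
    }

    /** Binds, closes, then binds again on the same port; returns that port. */
    static int bindTwice() throws IOException {
        // Port 0 lets the OS pick a free port (an assumption for this sketch).
        ServerSocketChannel first = open(new InetSocketAddress("127.0.0.1", 0));
        int port = first.socket().getLocalPort();
        first.close();

        // With the listener closed, the port should be free again.
        ServerSocketChannel second = open(new InetSocketAddress("127.0.0.1", port));
        try {
            return second.socket().getLocalPort();
        } finally {
            second.close();
        }
    }
}
```

If `bindTwice()` throws `java.net.BindException` on an otherwise idle machine, that points at the platform rather than at the application code.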

--

netstat -an (for port 6007):

tcp4      73      0  127.0.0.1.6007         127.0.0.1.51549        ESTABLISHED
tcp4       0      0  127.0.0.1.51549        127.0.0.1.6007         ESTABLISHED
tcp4      73      0  127.0.0.1.6007         127.0.0.1.51544        CLOSE_WAIT 
tcp4       0      0  127.0.0.1.6007         127.0.0.1.51543        CLOSE_WAIT 
tcp4       0      0  10.37.129.2.6007       *.*                    LISTEN     
tcp4       0      0  10.211.55.2.6007       *.*                    LISTEN     
tcp4       0      0  127.0.0.1.6007         *.*                    LISTEN     
tcp4       0      0  10.50.100.236.6007     *.*                    LISTEN     

--

And now I get this exception in every test after the socket is opened (the netstat output above is from this situation):

Caused by: java.net.SocketTimeoutException: Read timed out
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.read(SocketInputStream.java:129)
    at java.net.SocketInputStream.read(SocketInputStream.java:182)

--

When I stopped the process from Eclipse I got "Terminate failed", but lsof -i TCP:6007 displays nothing and the process is no longer found by 'ps'. The netstat output did not change...

Can I somehow kill the socket without rebooting (that would help a little bit already)?

--

UPDATE 5.5.12:

I ran the tests now in Eclipse debugger. This time the tests got stuck after 18 methods. I stopped the main thread after it was stuck around 15 minutes. This is the stack:

Thread [main] (Suspended)   
    FileDispatcher.preClose0(FileDescriptor) line: not available [native method]    
    SocketDispatcher.preClose(FileDescriptor) line: 41  
    ServerSocketChannelImpl.implCloseSelectableChannel() line: 208 [local variables unavailable]    
    ServerSocketChannelImpl(AbstractSelectableChannel).implCloseChannel() line: 201 
    ServerSocketChannelImpl(AbstractInterruptibleChannel).close() line: 97  
...

--

Hmm, it looks like the process is not killed after all, and it does not die from kill -9 either (I noticed that process 712 and probably also 710 are the TestNG processes):

$ kill -9 712
$ ps xa | grep java
  700   ??  ?E     0:00.00 (java)
  712   ??  ?E     0:00.00 (java)
  797 s005  S+     0:00.00 grep java

--

EDIT 10.5.12:

?E in the ps output above means that the process is exiting. I could not find any means to kill such a process fully without rebooting. The same issue has been noticed with some other applications. No solutions found:

http://www.google.com/search?q=ps+process+is+exiting+osx

Try closing the socket after each test with http://docs.oracle.com/javase/1.4.2/docs/api/java/net/ServerSocket.html#close(), if you are not doing so already.

Just a shot in the dark here, but make sure that any threads waiting on Selector.select() have been woken up and have exited.
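The shutdown order suggested here (wake any thread blocked in Selector.select(), wait for it to exit, and only then close the channel) might look roughly like the sketch below. It is not the asker's server code: the class name `SelectorShutdownSketch`, the `startAndStop` method, the stop flag, and the 5-second join timeout are all assumptions made for illustration.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.util.concurrent.atomic.AtomicBoolean;

final class SelectorShutdownSketch {

    /**
     * Starts a selector thread on an ephemeral loopback port, shuts it down
     * in order, and returns true if the thread had exited before the close.
     */
    static boolean startAndStop() throws IOException, InterruptedException {
        final Selector selector = Selector.open();
        final ServerSocketChannel channel = ServerSocketChannel.open();
        channel.configureBlocking(false);
        channel.socket().bind(new InetSocketAddress("127.0.0.1", 0));
        channel.register(selector, SelectionKey.OP_ACCEPT);

        final AtomicBoolean stop = new AtomicBoolean(false);
        Thread loop = new Thread(new Runnable() {
            public void run() {
                try {
                    while (!stop.get()) {
                        selector.select(); // blocks until a key is ready or wakeup()
                        selector.selectedKeys().clear();
                    }
                } catch (IOException e) {
                    // Selector failed; let the thread exit.
                }
            }
        }, "selector-loop");
        loop.start();

        // Shutdown order: set the flag, wake the selector, wait, then close.
        stop.set(true);
        selector.wakeup(); // also makes the next select() return immediately
        loop.join(5000);
        boolean exited = !loop.isAlive();

        channel.close();
        selector.close();
        return exited;
    }
}
```

Setting the flag before calling `wakeup()` matters: if the loop thread has not reached `select()` yet, the pending wakeup makes its next `select()` return at once, and the loop then sees the flag and exits.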

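So the problem appears to lie in the Selector implementation in the Mac version of JDK 6. Installing the new Oracle JDK 7u4 fixed the problem, regardless of how the Selector is used.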

I have also Parallels, and often the problem looks like it's caused by the Parallels network adapter....

I'd say that's a fair bet if this problem is not cropping up on other platforms. What have you done to exclude Parallels as the culprit?

If you think that the resources are not properly released, you can try to release them in a shutdown hook. That way the resources will at least be released when the JVM shuts down (though not if you hard-kill it).

An example of a very basic shutdown hook:

public void shutDownProcedure() {
    // Register a thread that the JVM runs when it shuts down normally.
    Runtime.getRuntime().addShutdownHook(new Thread() {
        @Override
        public void run() {
            /* my shutdown code here */
        }
    });
}

This helped me release resources that somehow weren't entirely released before. I don't know if this works for sockets as well, but I think it should.

It also allowed me to see log output I hadn't seen before.
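Applied to a server socket, the hook could look something like the following. This is only a sketch under assumptions: `ShutdownHookSketch` and `closeHook` are hypothetical names, and the loopback address with an ephemeral port stands in for whatever the real server binds to.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.ServerSocketChannel;

final class ShutdownHookSketch {

    /** Builds (but does not register) a hook thread that closes the channel. */
    static Thread closeHook(final ServerSocketChannel channel) {
        return new Thread(new Runnable() {
            public void run() {
                try {
                    channel.close();
                } catch (IOException e) {
                    // The JVM is exiting; report the failure and move on.
                    System.err.println("close failed: " + e);
                }
            }
        }, "close-server-channel");
    }

    public static void main(String[] args) throws IOException {
        ServerSocketChannel channel = ServerSocketChannel.open();
        channel.socket().bind(new InetSocketAddress("127.0.0.1", 0));
        Runtime.getRuntime().addShutdownHook(closeHook(channel));
        System.out.println("listening on port " + channel.socket().getLocalPort());
        // A normal exit or Ctrl-C runs the hook; kill -9 does not.
    }
}
```

Note that a hook only helps if close() itself returns. If close() blocks, as in the stack trace in the question, the hook thread would block as well.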
