Java RMI socket connection timeout without a real attempt to connect

I have an RMI client that connects to several RMI servers on different machines. Occasionally the client gets a connection timeout. It happens randomly, and when I notice it and try to connect to the server again, the connection always succeeds. There is no firewall between the machines.

I captured the network traffic with Wireshark and there is no outgoing SYN packet from the client, so it seems that no attempt is made to establish the connection.

What could be the cause? How can I find out what is going on?

Edit: logs from RMI:

4:02:44 AM sun.rmi.transport.tcp.TCPEndpoint newSocket
FINER: task-6786: opening socket to [c1w7IE10-0059:1099]
4:02:44 AM sun.rmi.transport.proxy.RMIMasterSocketFactory createSocket
FINE: task-6786: host: c1w7IE10-0059, port: 1099

The port and host are correct, but no SYN packet is sent to the host.

Edit: stacktrace

ERROR: Error: com.kerio.at.lib.exception.KRemoteOperationFailedException: java.rmi.ConnectException: Connection refused to host: c1w7IE10-0059; nested exception is: 
    java.net.ConnectException: Connection timed out: connect
com.kerio.at.lib.exception.KRemoteOperationFailedException: java.rmi.ConnectException: Connection refused to host: c1w7IE10-0059; nested exception is: 
    java.net.ConnectException: Connection timed out: connect
    at com.kerio.at.lib.testmanager.rmi.client.TestManagerClient.connect(TestManagerClient.java:88)
    at com.kerio.at.lib.cml.scheduling.RunTestTask.connectToNode(RunTestTask.java:485)
    at com.kerio.at.lib.cml.scheduling.RunTestTask.call(RunTestTask.java:68)
    at java.util.concurrent.FutureTask.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
Caused by: java.rmi.ConnectException: Connection refused to host: c1w7IE10-0059; nested exception is: 
    java.net.ConnectException: Connection timed out: connect
    at sun.rmi.transport.tcp.TCPEndpoint.newSocket(Unknown Source)
    at sun.rmi.transport.tcp.TCPChannel.createConnection(Unknown Source)
    at sun.rmi.transport.tcp.TCPChannel.newConnection(Unknown Source)
    at sun.rmi.server.UnicastRef.newCall(Unknown Source)
    at sun.rmi.registry.RegistryImpl_Stub.lookup(Unknown Source)
    at java.rmi.Naming.lookup(Unknown Source)
    at com.kerio.at.lib.testmanager.rmi.client.TestManagerClient.connect(TestManagerClient.java:59)
    ... 9 more
Caused by: java.net.ConnectException: Connection timed out: connect
    at java.net.DualStackPlainSocketImpl.connect0(Native Method)
    at java.net.DualStackPlainSocketImpl.socketConnect(Unknown Source)
    at java.net.AbstractPlainSocketImpl.doConnect(Unknown Source)
    at java.net.AbstractPlainSocketImpl.connectToAddress(Unknown Source)
    at java.net.AbstractPlainSocketImpl.connect(Unknown Source)
    at java.net.PlainSocketImpl.connect(Unknown Source)
    at java.net.SocksSocketImpl.connect(Unknown Source)
    at java.net.Socket.connect(Unknown Source)
    at java.net.Socket.connect(Unknown Source)
    at java.net.Socket.<init>(Unknown Source)
    at java.net.Socket.<init>(Unknown Source)
    at sun.rmi.transport.proxy.RMIDirectSocketFactory.createSocket(Unknown Source)
    at sun.rmi.transport.proxy.RMIMasterSocketFactory.createSocket(Unknown Source)

There are a few possible causes:

  • an external firewall is blocking the requests
  • an internal firewall is blocking the requests (think "iptables")
  • if you are on a virtual machine, the requests are being lost due to an intermittent or load-related problem with the virtual networks
  • it could be an intermittent routing problem; e.g. if your machine has two NICs, traffic could be misrouted due to misconfigured DHCP
  • your application could occasionally be using the wrong IP address; e.g. if it occasionally used a loopback address, the traffic might not show up in your Wireshark captures.

Some of these we can tentatively or positively eliminate. But be careful, because it is possible that your knowledge of your local networking environment is incomplete or out of date.
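
To check the "wrong IP address" possibility, it may help to log what the host name resolves to at the moment a connection fails, and to try each resolved address with an explicit timeout. The following is only a minimal sketch: the class name `ConnectProbe`, its methods, and the 3-second timeout are hypothetical and not part of your application or of RMI.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.net.UnknownHostException;
import java.util.Arrays;
import java.util.List;

// Hypothetical diagnostic helper, not part of RMI or of the original application.
final class ConnectProbe {

    /** Returns every address the JVM currently resolves for the given host name. */
    static List<InetAddress> resolve(String host) throws UnknownHostException {
        return Arrays.asList(InetAddress.getAllByName(host));
    }

    /** Tries a plain TCP connect with an explicit timeout and reports the outcome. */
    static boolean canConnect(InetAddress address, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(address, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers both "connection refused" and connect timeouts.
            System.err.println("connect to " + address + ":" + port + " failed: " + e);
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // Defaults are placeholders; pass the real host and registry port as arguments.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 1099;
        for (InetAddress address : resolve(host)) {
            boolean ok = canConnect(address, port, 3000);
            System.out.println(host + " -> " + address.getHostAddress() + " connect=" + ok);
        }
    }
}
```

If the address printed during a failure differs from the one you filter on in Wireshark, that would explain why no SYN packet appears in the capture.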

It turned out to be a DNS problem. I switched off the DNS cache (the Windows 'DNS Client' service) and the problem disappeared.
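
For anyone hitting something similar: the JVM also keeps its own cache of name lookups, separate from the Windows service. I did not need to change it, so treat the following as an untested sketch rather than the fix. It uses the standard `networkaddress.cache.ttl` and `networkaddress.cache.negative.ttl` security properties; the class and method names are made up for illustration.

```java
import java.security.Security;

// Hypothetical helper; call it at startup, before the first host name lookup.
final class DnsCacheSettings {

    /** Asks the JVM not to cache successful or failed name lookups. */
    static void disableJvmDnsCache() {
        // "0" means "do not cache"; this only affects the JVM's own cache,
        // not the operating system's resolver cache.
        Security.setProperty("networkaddress.cache.ttl", "0");
        Security.setProperty("networkaddress.cache.negative.ttl", "0");
    }

    public static void main(String[] args) {
        disableJvmDnsCache();
        System.out.println("networkaddress.cache.ttl = "
                + Security.getProperty("networkaddress.cache.ttl"));
    }
}
```

The properties have to be set before the JVM performs its first lookup, otherwise the caching policy may already have been fixed for the lifetime of the process.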

Thanks to bili for his advice.
