How to detect data loss with Java sockets?

I have the following situation: using a "classical" Java server (using ServerSocket), I would like to detect, as rapidly as possible, when the connection with the client has failed unexpectedly (i.e. non-gracefully, without a FIN packet).

The way I'm simulating this is as follows:

  • I'm running the server on a Linux box
  • I connect with telnet to the box
  • After the connection has succeeded, I add a "DROP" rule to the box's firewall

What happens is that sending blocks after ~10 KB of data. I don't know for how long, but I've waited more than 10 minutes on several occasions. What I've researched so far:

  • Socket.setSoTimeout - however, this affects only reads. If there are only writes, it has no effect
  • Checking for errors with PrintWriter.checkError(), since PrintWriter swallows the exceptions - however, it never returns true

How can I detect this error condition, or at least configure the timeout value (either at the JVM or at the OS level)?

Update: after ~20 min, checkError() returned true on the PrintWriter (the server runs JVM 1.5 on a CentOS machine). Where is this timeout value configured?

The ~20 min timeout comes from the standard TCP settings in Linux. It's really not a good idea to change them unless you know what you're doing. I had a similar project at work, where we tested connection loss by disconnecting the network cable, and things would just hang for a long time, exactly like you're seeing. We tried changing the following TCP settings, which shortened the timeout, but it caused side effects in other applications: connections were broken when they shouldn't have been, due to small network delays when things got busy.

net.ipv4.tcp_retries2
net.ipv4.tcp_syn_retries

If you check the man page for tcp (man tcp), you can read about what these settings mean and maybe find other settings that apply. In short, tcp_retries2 limits retransmissions on an established connection, while tcp_syn_retries only limits SYN retransmissions during the initial connection attempt. You can set them either directly under /proc/sys/net/ipv4 or in sysctl.conf. These two were the ones we found made send/recv fail more quickly. Try setting them both to 1 and you'll see the send call fail a lot faster. Make sure to take note of the current settings before changing them.
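To take note of the current values from the Java side before touching anything, something like the following might do. It is only a sketch: the class name TcpSysctl and its methods are made up for illustration, and it assumes a Linux box where sysctl values are exposed as files under /proc/sys. It only reads the values and changes nothing.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

// Hypothetical helper (not part of any library): reads Linux sysctl values via /proc/sys.
final class TcpSysctl {

    /** Maps a dotted sysctl name such as net.ipv4.tcp_retries2 to its file under /proc/sys. */
    static Path procPath(String name) {
        return Paths.get("/proc/sys", name.split("\\."));
    }

    /** Returns the current value, or empty if the file is missing (e.g. non-Linux) or unreadable. */
    static Optional<String> read(String name) {
        try {
            byte[] raw = Files.readAllBytes(procPath(name));
            return Optional.of(new String(raw, StandardCharsets.UTF_8).trim());
        } catch (IOException e) {
            return Optional.empty();
        }
    }
}
```

For example, logging TcpSysctl.read("net.ipv4.tcp_retries2") and TcpSysctl.read("net.ipv4.tcp_syn_retries") at startup gives you a record of what to restore later.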

I will reiterate that you really shouldn't change these settings. They can have side effects on the OS and on other applications. The best solution is, as Kitson says, to use a heartbeat and/or an application-level timeout.

Also look into how to create a non-blocking socket, so that the send call won't block like that. Keep in mind, though, that a send on a non-blocking socket usually succeeds as long as there's room in the send buffer. That's why it takes around 10 KB of data before it blocks, even though you broke the connection before that.
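As a rough sketch of the non-blocking approach, the helper below writes to a SocketChannel and gives up if no bytes can be queued for a given time. The class name TimedWriter and the method writeFully are hypothetical, and the channel is assumed to have been switched to non-blocking mode with configureBlocking(false) beforehand. Note that a false result only tells you the send buffer stayed full for that long, not why.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.SocketChannel;

// Hypothetical helper: a write with a "stall" timeout on a non-blocking channel.
final class TimedWriter {

    /**
     * Writes all of buf to ch. Returns false if the send buffer stays full
     * (no bytes accepted) for stallTimeoutMillis. Assumes ch is non-blocking.
     */
    static boolean writeFully(SocketChannel ch, ByteBuffer buf, long stallTimeoutMillis)
            throws IOException {
        long timeoutNanos = stallTimeoutMillis * 1_000_000L;
        long stallDeadline = System.nanoTime() + timeoutNanos;
        try (Selector selector = Selector.open()) {
            ch.register(selector, SelectionKey.OP_WRITE);
            while (buf.hasRemaining()) {
                if (ch.write(buf) > 0) {
                    // Progress was made, so restart the stall clock.
                    stallDeadline = System.nanoTime() + timeoutNanos;
                    continue;
                }
                long remainingMillis = (stallDeadline - System.nanoTime()) / 1_000_000L;
                if (remainingMillis <= 0) {
                    return false; // send buffer has been full for too long
                }
                // Wait until the channel is writable again or the time runs out.
                selector.select(remainingMillis);
                selector.selectedKeys().clear();
            }
            return true;
        }
    }
}
```

When writeFully returns false, the caller would typically close the channel and treat the client as gone. The first few kilobytes will still appear to succeed, for the buffering reason described above.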

The only surefire way is to generate application-level "checks" instead of relying on the transport level. For example, use a bi-directional heartbeat message: if either end does not get the expected message in time, it closes and resets the connection.
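The bookkeeping for such a heartbeat can be quite small. Below is a minimal sketch, not a full protocol: the class HeartbeatMonitor and its method names are invented for illustration, and timestamps are passed in as plain milliseconds (for instance from System.currentTimeMillis()) so the logic is easy to check.

```java
// Hypothetical helper: remembers when the peer was last heard from.
final class HeartbeatMonitor {
    private final long timeoutMillis;
    private volatile long lastSeenMillis;

    HeartbeatMonitor(long timeoutMillis, long nowMillis) {
        this.timeoutMillis = timeoutMillis;
        this.lastSeenMillis = nowMillis;
    }

    /** Call whenever a heartbeat (or any other message) arrives from the peer. */
    void onMessage(long nowMillis) {
        lastSeenMillis = nowMillis;
    }

    /** True once the peer has been silent for longer than the timeout. */
    boolean isExpired(long nowMillis) {
        return nowMillis - lastSeenMillis > timeoutMillis;
    }
}
```

One way to use it: the thread reading from the socket calls onMessage for every message received, and a separate timer thread periodically checks isExpired and calls close() on the Socket when it returns true. Closing the socket from another thread makes a write that is blocked on it fail with a SocketException, so the sender gets unstuck without waiting for the TCP timeout.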
