HttpServer very slow with keepalive

The following HttpServer program easily handles 8000 requests/s without HTTP keepalive, but a measly 22 requests/s with keepalive.

import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;

import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpHandler;
import com.sun.net.httpserver.HttpServer;

public class HSTest {
    public static void main(String[] args) throws IOException {
        HttpServer hs = HttpServer.create(new InetSocketAddress(30006), 1000);
        hs.createContext("/", new HttpHandler() {
            public void handle(HttpExchange he) throws IOException {
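                byte[] FILE = "xxxxx".getBytes();
                // Fixed-length response: status line and headers go out first
                he.sendResponseHeaders(200, FILE.length);
                OutputStream os = he.getResponseBody();
                // The body is written separately, then the exchange is closed
                os.write(FILE);
                os.close();
            }
        });
        hs.start();
    }
}
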
Here's how it looks with keepalive:

Wireshark screenshot with HTTP keepalive

Note the huge delays at packets 6, 12 and 17. After the first one, they're always just a little bit over 40ms. In contrast, without keepalive everything's fine:

Wireshark screenshot without HTTP keepalive

That's 3 whole requests before the first ms is over!

I'm using OpenJDK 8 on Debian sid Linux amd64, with both client and server on the same machine and communicating over the loopback interface. To test, I'm using ab -n 100000 -c 1 http://localhost:30006/ (no keepalive) and ab -n 100000 -c 1 -k http://localhost:30006/ (keepalive), as well as curl and Chromium (both of which use keepalive by default).

So what is causing the 40ms delay with HTTP keepalive requests, and how do I make my server fast?

As hinted in the comments, I think the main point here is that it is not "normal" to need extremely high HTTP throughput over a single connection (without tweaking the default settings). If you got similarly disastrous numbers with many concurrent clients (e.g. the -c 100 flag to ab), that would be a different issue. Keepalive in general tends to hog threads on one-thread-per-connection servers.

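I think what you are observing is caused by Nagle's algorithm (the thing the TCP_NODELAY socket option turns off), possibly combined with delayed ACKs: the sender holds back a small segment until earlier data has been acknowledged, while the receiver postpones that acknowledgement. The no-keepalive case involves so few packets per connection that you are never hit by it.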
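To make that interaction concrete, here is a minimal sketch that reproduces the pattern with plain sockets on the loopback interface. It assumes the server answers each request with two separate small writes (standing in for headers and body); the class name NagleDemo and its method are made up for this illustration and have nothing to do with HttpServer internals. On Linux I would expect the run with TCP_NODELAY off to be much slower per round trip, but the exact numbers depend on the kernel and OS.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class NagleDemo {
    // Runs `rounds` request/response exchanges over ONE connection and
    // returns the average time per exchange in milliseconds.
    static double averageRoundTripMillis(boolean serverNoDelay, int rounds) throws Exception {
        try (ServerSocket listener = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Thread server = new Thread(() -> {
                try (Socket s = listener.accept()) {
                    s.setTcpNoDelay(serverNoDelay); // true disables Nagle's algorithm
                    InputStream in = s.getInputStream();
                    OutputStream out = s.getOutputStream();
                    while (in.read() != -1) {   // one byte stands in for a request
                        out.write('H');         // first small write ("headers")
                        out.write('B');         // second small write ("body")
                    }
                } catch (IOException e) {
                    // Connection torn down; nothing to clean up in this sketch.
                }
            });
            server.start();

            long elapsedNanos;
            try (Socket c = new Socket(InetAddress.getLoopbackAddress(), listener.getLocalPort())) {
                InputStream in = c.getInputStream();
                OutputStream out = c.getOutputStream();
                long start = System.nanoTime();
                for (int i = 0; i < rounds; i++) {
                    out.write('R');
                    // Wait for both response bytes before sending the next request.
                    if (in.read() == -1 || in.read() == -1) {
                        throw new IOException("server closed the connection early");
                    }
                }
                elapsedNanos = System.nanoTime() - start;
            }
            server.join();
            return elapsedNanos / 1e6 / rounds;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.printf("TCP_NODELAY off: %.3f ms per round trip%n",
                averageRoundTripMillis(false, 50));
        System.out.printf("TCP_NODELAY on:  %.3f ms per round trip%n",
                averageRoundTripMillis(true, 50));
    }
}
```

The client here waits for the full response before sending anything else, just like ab -c 1 -k does, which is what leaves the second small write stuck waiting for an ACK.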

https://eklitzke.org/the-caveats-of-tcp-nodelay specifically mentions delays of "up to 40 ms" on Linux, and http://bugs.java.com/bugdatabase/view_bug.do?bug_id=7068416 mentions a Java property for enabling TCP_NODELAY within the basic Java HTTP server. I am quite confident that you'll see different behavior once you enable this flag.
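As a sketch of what enabling it could look like: to my knowledge the property in question is sun.net.httpserver.nodelay, an internal and unsupported setting that the server reads once when its classes are loaded, so treat the name and its behavior as an assumption to verify against your JDK. The class name HSTestNoDelay and the start helper are made up for this example; the handler is the one from the question.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

import com.sun.net.httpserver.HttpServer;

public class HSTestNoDelay {
    // Starts the server from the question with TCP_NODELAY requested.
    static HttpServer start(int port) throws IOException {
        // Assumption: internal JDK property; it has to be set before the
        // first HttpServer is created, because it is only read once.
        System.setProperty("sun.net.httpserver.nodelay", "true");

        HttpServer hs = HttpServer.create(new InetSocketAddress(port), 1000);
        hs.createContext("/", he -> {
            byte[] body = "xxxxx".getBytes(StandardCharsets.US_ASCII);
            he.sendResponseHeaders(200, body.length);
            try (OutputStream os = he.getResponseBody()) {
                os.write(body);
            }
        });
        hs.start();
        return hs;
    }

    public static void main(String[] args) throws IOException {
        start(30006);
    }
}
```

Passing -Dsun.net.httpserver.nodelay=true on the java command line should have the same effect without touching the code. Re-running ab -n 100000 -c 1 -k http://localhost:30006/ afterwards would show whether the 40 ms gaps are gone.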

Another avenue would be to change the delayed ACK timeout to something other than 40 ms. See e.g. https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_MRG/1.3/html/Realtime_Tuning_Guide/sect-Realtime_Tuning_Guide-General_System_Tuning-Reducing_the_TCP_delayed_ack_timeout.html

