I use reactor-netty to request a set of URLs. The majority of the URLs belong to the same hosts. reactor-netty seems to open a brand-new TCP connection for every URL, even when a connection to that host was already established for a previous URL. Some servers drop new connections or start responding slowly when hundreds of simultaneous connections are established.
Sample of the code:
Flux.just(...)
    .groupBy(link -> {
        String host = "";
        try {
            host = new URL(link).getHost();
        } catch (MalformedURLException e) {
            LOGGER.warn("Cannot determine host {}", link, e);
        }
        return host;
    })
    .flatMap(group -> {
        HttpClient client = HttpClient.create()
            .keepAlive(true)
            .tcpConfiguration(tcp -> tcp.host(group.key()));
        return group.flatMap(link -> client.get()
            .uri(link)
            .response((resp, cont) -> resp.status().code() == 200 ? cont.aggregate().asString() : Mono.empty())
            .doOnSubscribe(s -> LOGGER.debug("Requesting {}", link))
            .timeout(Duration.ofMinutes(1))
            .doOnError(e -> LOGGER.warn("Cannot get response from {}", link, e))
            .onErrorResume(e -> Flux.empty())
            .collect(Collectors.joining())
            .filter(s -> StringUtils.isNotBlank(s)));
    })
    .blockLast();
In the log I see that the local ports differ for the same remote host, and that the sum of active and inactive connections is far higher than the number of distinct hosts. That is why I think reactor-netty is not reusing already established connections.
DEBUG [2019-04-29 08:15:18,711] reactor-http-nio-10 r.n.r.PooledConnectionProvider: [id: 0xaed18e87, L:/192.168.1.183:56832 - R:capcp2.naad-adna.pelmorex.com/52.242.33.4:80] Releasing channel
DEBUG [2019-04-29 08:15:18,711] reactor-http-nio-10 r.n.r.PooledConnectionProvider: [id: 0xaed18e87, L:/192.168.1.183:56832 - R:capcp2.naad-adna.pelmorex.com/52.242.33.4:80] Channel cleaned, now 1 active connections and 239 inactive connections
...
DEBUG [2019-04-29 08:15:20,158] reactor-http-nio-10 r.n.r.PooledConnectionProvider: [id: 0xd6c6c5db, L:/192.168.1.183:56965 - R:capcp2.naad-adna.pelmorex.com/52.242.33.4:80] Releasing channel
DEBUG [2019-04-29 08:15:20,158] reactor-http-nio-10 r.n.r.PooledConnectionProvider: [id: 0xd6c6c5db, L:/192.168.1.183:56965 - R:capcp2.naad-adna.pelmorex.com/52.242.33.4:80] Channel cleaned, now 0 active connections and 240 inactive connections
Is it possible to request several URLs on the same host through the same TCP connection using a keep-alive HTTP client? If not, how do I restrict the number of simultaneous connections to the same host, or perform requests to the same host sequentially (sending the next request only after receiving the response to the previous one)?
I use the Californium-SR6 release train.
Yes, reactor-netty supports keep-alive, connection reuse, and connection pooling.
Note that .flatMap is an asynchronous operator that subscribes to its inner streams eagerly, so they are processed concurrently. Therefore, when you call group.flatMap(...), the inner requests are executed in parallel. A pooled connection cannot be reused while a request is still in flight on it, so each concurrent request needs its own connection.
If you want to execute requests to the same host sequentially, change your example to use group.concatMap instead of .flatMap.
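As a minimal sketch of the concatMap variant (not the asker's exact code): the class name, the hostOf helper, and the fetch function parameter below are hypothetical names introduced for illustration. The HTTP call is passed in as a function so the grouping logic can be tried without a network.

```java
import java.net.URI;
import java.util.List;
import java.util.function.Function;
import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;

public class SequentialPerHost {

    // Hypothetical helper: returns the host of a link, or "" when it cannot
    // be determined (groupBy does not accept null keys).
    static String hostOf(String link) {
        try {
            String host = URI.create(link).getHost();
            return host != null ? host : "";
        } catch (IllegalArgumentException e) {
            return "";
        }
    }

    // 'fetch' stands in for the actual HTTP request for a single link.
    static Flux<String> fetchAll(List<String> links,
                                 Function<String, Mono<String>> fetch) {
        return Flux.fromIterable(links)
                .groupBy(SequentialPerHost::hostOf)
                // Different hosts are still processed concurrently (outer flatMap),
                // but within one host concatMap subscribes to the next request
                // only after the previous one has completed.
                .flatMap(group -> group.concatMap(fetch));
    }
}
```

With reactor-netty, fetch could be something like link -> client.get().uri(link).responseContent().aggregate().asString(). Because only one request per host is in flight at a time, the pooled connection released by the previous request is available for the next one.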
If you still want to execute them in parallel but limit the number of active requests to an individual host, change your example to use one of the overloaded versions of .flatMap that take a concurrency parameter.
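A sketch of the bounded variant, under the same assumptions: the class name, the fetch function, and the maxPerHost parameter are hypothetical and only illustrate the flatMap(mapper, concurrency) overload.

```java
import java.net.URI;
import java.util.List;
import java.util.function.Function;
import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;

public class BoundedPerHost {

    // 'fetch' stands in for the actual HTTP request for a single link;
    // 'maxPerHost' is the number of requests allowed in flight per host.
    static Flux<String> fetchAll(List<String> links,
                                 Function<String, Mono<String>> fetch,
                                 int maxPerHost) {
        return Flux.fromIterable(links)
                .groupBy(link -> {
                    // Assumes syntactically valid links; falls back to ""
                    // because groupBy does not accept null keys.
                    String host = URI.create(link).getHost();
                    return host != null ? host : "";
                })
                // The second argument caps how many inner publishers flatMap
                // subscribes to at once, so at most maxPerHost requests run
                // concurrently for each host.
                .flatMap(group -> group.flatMap(fetch, maxPerHost));
    }
}
```

Unlike concatMap, this does not preserve the order of responses within a host; flatMapSequential with a concurrency argument is an option if ordering matters.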
Also, since you are using HttpClient.create(), your example uses the default global HTTP connection pool. If you want more control over connection pooling, you can specify a different ConnectionProvider via HttpClient.create(ConnectionProvider).
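A hedged sketch of a custom pool, assuming the reactor-netty 0.8.x API that ships with the Californium release train, where ConnectionProvider.fixed(name, maxConnections) is available. Later releases deprecate fixed in favor of a builder-style API, so check the Javadoc of your version. The class name, the pool name "per-host-pool", and the maxConnections parameter are illustrative choices.

```java
import reactor.netty.http.client.HttpClient;
import reactor.netty.resources.ConnectionProvider;

public class PooledClientSketch {

    // Builds a client backed by a bounded pool instead of the default global one.
    // Once the limit is reached, further acquisitions wait for a connection
    // to be released rather than opening a new one.
    static HttpClient newClient(int maxConnections) {
        ConnectionProvider provider =
                ConnectionProvider.fixed("per-host-pool", maxConnections);
        return HttpClient.create(provider).keepAlive(true);
    }
}
```

Creating the client once and sharing it across all groups, rather than calling HttpClient.create() inside the flatMap, keeps every request on the same pool.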