
Connection Pool and thread pool setting in Java

Spring application using a Hikari pool.

Now, for a single request from the client I have to query 10 tables (a business requirement) and then combine the results. Querying each table may cost 50 ms to 200 ms. To speed up the response time, I create a FixedThreadPool in my service and query each table on a different thread (pseudocode):

class MyService {
    static final int THREAD_POOL_SIZE = 20;
    static final int CONNECTION_POOL_SIZE = 10;

    final ExecutorService pool = Executors.newFixedThreadPool(THREAD_POOL_SIZE);
    protected DataSource ds;

    MyService() throws ClassNotFoundException {
        Class.forName(getJdbcDriverName());
        HikariConfig config = new HikariConfig();
        config.setMaximumPoolSize(CONNECTION_POOL_SIZE);
        ds = new HikariDataSource(config);
    }

    public Items doQuery() throws InterruptedException, ExecutionException {
        String[] tables = {"a", "b", /* ... 10+ tables */};
        Items result = new Items();
        CompletionService<Items> completionService = new ExecutorCompletionService<>(pool);

        // fan out: submit one query per table so they run on separate threads/connections
        for (String tb : tables) {
            Callable<Items> c = () -> {
                try (Connection conn = ds.getConnection()) {
                    Items items = query(conn, tb); // ... run the per-table query
                    return items;
                }
            };
            completionService.submit(c);
        }

        // fan in: collect each result as it completes
        for (int i = 0; i < tables.length; i++) {
            Future<Items> future = completionService.take();
            result.addAll(future.get());
        }
        return result;
    }
}

Now, for a single request, the average response time is maybe 500 ms.


But for concurrent requests, the average response time increases rapidly: the more requests there are, the longer the response time becomes.


I wonder how to set the proper connection pool size and thread pool size to make the app work effectively?

BTW, the database is RDS in the cloud with 4 CPUs, 16 GB memory, 2000 max connections and 8000 max IOPS.

You might want to think about a few more parameters:

1. The database's maximum concurrent request limit. Cloud providers set different concurrent-request limits for different tiers, so you might want to check yours.

2. When you say 50-200 ms, although it is difficult to say, are there on average 8 requests of 50 ms and 2 requests of 200 ms, or are they all pretty much the same? Why does it matter? Your doQuery might be limited by the query that takes the longest (200 ms), but the threads taking 50 ms will be released as soon as their task is done, making them available for the next set of requests.

3. What QPS are you expecting to receive?

Some calculations: if a single request takes 10 threads, and you have provisioned 100 connections with a 100 concurrent query limit, then assuming 200 ms for each query you can only handle 10 requests at a time. Maybe a little better than 10 if most queries take 50 ms or so (but I wouldn't be optimistic).
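
A rough back-of-the-envelope version of that calculation (every number below is one of the assumptions above, not a measurement):

int connectionLimit   = 100;  // provisioned connections / concurrent query limit
int queriesPerRequest = 10;   // one query per table
int worstQueryMs      = 200;  // slowest query in the fan-out

int concurrentRequests = connectionLimit / queriesPerRequest;            // 10 requests in flight at once
double requestsPerSec  = concurrentRequests * (1000.0 / worstQueryMs);   // ~50 requests/second ceiling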

Of course, some of these calculations go for a toss if any of your queries takes >200 ms (network latency or anything else), in which case I recommend you have a circuit breaker, either at the connection end (if you are allowed to abort the query after a timeout) or at the API end.

Note: the max connection limit is not the same as the max concurrent query limit.

Suggestion: since you need a response under 500 ms, you can also set a connectionTimeout of about 100-150 ms on the pool. Worst case: 150 ms connection timeout + 200 ms query execution + 100 ms application processing < 500 ms for your response. Works.
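
A minimal sketch of what that could look like on the pool configuration. The numbers are illustrative, and note that recent HikariCP versions reject connectionTimeout values below 250 ms, so that is the practical floor:

HikariConfig config = new HikariConfig();
config.setJdbcUrl(jdbcUrl);                        // jdbcUrl assumed to come from your existing config
config.setMaximumPoolSize(CONNECTION_POOL_SIZE);   // pool size from the question's example
config.setConnectionTimeout(250);                  // fail fast instead of queueing past the response budget
DataSource ds = new HikariDataSource(config);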

You can create a custom thread pool executor:

public class CustomThreadPoolExecutor extends ThreadPoolExecutor {

    private CustomThreadPoolExecutor(int corePoolSize, int maximumPoolSize,
                                     long keepAliveTime, TimeUnit unit, BlockingQueue<Runnable> workQueue) {
        super(corePoolSize, maximumPoolSize, keepAliveTime, unit, workQueue);
    }

    /**
     * Returns a fixed thread pool. As written this behaves like
     * Executors.newFixedThreadPool; override beforeExecute/afterExecute (or submit)
     * here if task threads should inherit diagnostic context from the submitting thread.
     */

    public static ExecutorService newFixedThreadPool(int nThreads) {
        return new CustomThreadPoolExecutor(nThreads, nThreads,
                0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>());
    }
}

In the configuration, you can configure the ExecutorService bean as below:

@Bean
public ExecutorService executeService() {
    return CustomThreadPoolExecutor.newFixedThreadPool(10);
}

This is the best practice for creating a custom thread pool executor.
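
For completeness, a small sketch of how the question's service might consume that bean instead of building its own pool. MyService, Items and queryTable are placeholders taken from (or assumed for) the question's pseudocode:

@Service
public class MyService {

    @Autowired private ExecutorService executorService; // the bean defined above
    @Autowired private DataSource ds;

    public Items doQuery(List<String> tables) throws InterruptedException, ExecutionException {
        CompletionService<Items> cs = new ExecutorCompletionService<>(executorService);
        for (String tb : tables) {
            cs.submit(() -> queryTable(ds, tb)); // queryTable: the per-table query from the question
        }
        Items result = new Items();
        for (int i = 0; i < tables.size(); i++) {
            result.addAll(cs.take().get());
        }
        return result;
    }
}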

The proper way to size the connection pool is usually to leave it at the default.

From the Hikari website:

If you have 10,000 front-end users, having a connection pool of 10,000 would be shear insanity. 1000 still horrible. Even 100 connections, overkill. You want a small pool of a few dozen connections at most, and you want the rest of the application threads blocked on the pool awaiting connections. If the pool is properly tuned it is set right at the limit of the number of queries the database is capable of processing simultaneously -- which is rarely much more than (CPU cores * 2) as noted above.

Given you know that each request is going to consume 10 threads, you want to break with this advice and go for more threads - keeping it to a number less than 100 is probably going to provide enough capacity.

I would implement the controller like this:

Make your queries async in your controller / service classes with CompletableFutures and let the connection pool worry about keeping its threads busy.

So the controller could look like this (I'm adapting it from some other code that doesn't work quite like this example, so take this code with a grain of salt):

public class AppController {

    @Autowired private DatabaseService databaseService;

    public ResponseEntity<Thing> getThing() {
        CompletableFuture<Foo> foo = CompletableFuture.supplyAsync(() -> databaseService.getFoo());
        CompletableFuture<Bar> bar = CompletableFuture.supplyAsync(() -> databaseService.getBar());
        CompletableFuture<Baz> baz = CompletableFuture.supplyAsync(() -> databaseService.getBaz());

        // wait for all of the futures to complete, then mash the results together
        CompletableFuture<Void> allFutures = CompletableFuture.allOf(foo, bar, baz);
        allFutures.join();

        return ResponseEntity.ok(mashUpDbData(foo.join(), bar.join(), baz.join()));
    }
}

The controller will spawn as many threads as you allow the ForkJoinPool to use; they will all hammer the DB at the same time, and the connection pool can worry about keeping the connections active.
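
If you would rather not use the common ForkJoinPool, supplyAsync also accepts an explicit executor. A minimal sketch, assuming dbExecutor is an ExecutorService bean like the one shown earlier:

CompletableFuture<Foo> foo = CompletableFuture.supplyAsync(databaseService::getFoo, dbExecutor);
CompletableFuture<Bar> bar = CompletableFuture.supplyAsync(databaseService::getBar, dbExecutor);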

But I think the reason you see the blowout in response times under small load is that, by its very design, JDBC blocks the thread while waiting for the data to come back from the DB.

To stop the blocking affecting the response times so drastically, you could try the Spring Boot reactive style. It uses async IO and backpressure to match IO production to consumption; basically it means the application threads are kept as busy as they can be. This should stop the behaviour under load where the response times increase in a linear fashion.
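
For illustration only, a rough sketch of that reactive flavour using Reactor. The databaseService calls are still the blocking JDBC ones, so they are pushed onto a bounded elastic scheduler; Thing, Foo, Bar and mashUpDbData are the placeholders from the example above:

Mono<Foo> foo = Mono.fromCallable(databaseService::getFoo)
        .subscribeOn(Schedulers.boundedElastic());
Mono<Bar> bar = Mono.fromCallable(databaseService::getBar)
        .subscribeOn(Schedulers.boundedElastic());

Mono<Thing> thing = Mono.zip(foo, bar)
        .map(tuple -> mashUpDbData(tuple.getT1(), tuple.getT2()));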

Note that if you do go down the reactive path, the JDBC drivers still block, so Spring has a big push to create a reactive database driver.
