
How costly is opening and closing a DB connection in a connection pool?

If we use a connection pooling framework, or the Tomcat JDBC pool, how costly is it to open and close a DB connection?

Is it good practice to open and close the DB connection every time a DB operation is required?

Or can the same connection be carried across different methods for DB operations?

A JDBC connection goes over the network, usually over TCP/IP and optionally with SSL. You can read this post to find out why it is expensive.

You can use a single connection across multiple methods for different DB operations, because each DB operation only needs its own Statement created on that connection.

Connection pooling avoids the overhead of creating connections during a request and should be used whenever possible. HikariCP is one of the fastest pools.

The answer is: it is almost always recommended to reuse DB connections. That is the whole reason connection pools exist, not only for performance but also for DB stability. For instance, if you don't limit the number of connections and mistakenly open hundreds of them, the DB might go down. And if DB connections don't get closed for some reason (out-of-memory error, shutdown, unhandled exception, etc.), you have a bigger issue: it affects not only your application but can also drag down other services that share the DB. A connection pool contains such catastrophes.

What people often don't realize is that behind a simple ORM API there are often hundreds of raw SQL statements. Imagine running these without a connection pool: we are talking about a very large overhead.

I can't imagine running a commercial DB application without connection pools.

Some good resources on this topic:
https://www.cockroachlabs.com/blog/what-is-connection-pooling/
https://stackoverflow.blog/2020/10/14/improve-database-performance-with-connection-pooling/

Whether the maintenance (opening, closing, testing) of the database connections in a connection pool affects the performance of the application depends on the implementation of the pool and, to some extent, on the underlying hardware.

A pool can be implemented to run in its own thread, or to initialise all connections during startup (of the container), or both. If the hardware provides enough cores, the working thread (the "business payload") will not be affected by the activities of the pool at all.

Other connection pools are implemented to create a new connection only on demand (a connection is requested, but none is currently available in the pool) and within the thread of the caller. In this case, the creation of that connection reduces the performance of the working thread, but only this once. It should not happen too often; if it does, your application needs too many connections and/or does not return them fast enough.
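The difference between the two strategies can be sketched with a toy pool. This is only an illustration under simplifying assumptions: `LazyPool`, `borrow`, `giveBack` and `createdCount` are made-up names, the resource type is generic rather than a real `java.sql.Connection`, and there is no upper limit or timeout as a real pool would have.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

/** Toy pool: hands out idle resources and creates new ones only when none is idle. */
final class LazyPool<T> {
    private final Deque<T> idle = new ArrayDeque<>();
    private final Supplier<T> factory;
    private int created = 0;

    /** warmUp > 0 pre-creates resources at startup (the "initialise during startup" variant). */
    LazyPool(Supplier<T> factory, int warmUp) {
        this.factory = factory;
        for (int i = 0; i < warmUp; i++) {
            idle.push(create());
        }
    }

    private T create() {
        created++;
        return factory.get();
    }

    /** Runs in the caller's thread, so on a miss the caller pays the creation cost. */
    synchronized T borrow() {
        return idle.isEmpty() ? create() : idle.pop();
    }

    /** Makes the resource available to the next caller. */
    synchronized void giveBack(T resource) {
        idle.push(resource);
    }

    synchronized int createdCount() {
        return created;
    }
}
```

With `warmUp` set to 0, the first `borrow()` triggers a creation in the calling thread; later calls reuse whatever has been given back. With a warm-up count, the creation cost is paid once at startup instead.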

But whether you really need a database connection pool at all depends on the kind of application!

If we talk about a typical server application that is intended to run forever and to serve a constantly changing crowd of multiple clients at the same time, it will definitely benefit from a connection pool.

If we talk about a tool-type application that starts, performs a more or less linear task in a defined amount of time, and terminates when done, then using a connection pool for the database connection(s) may cause more overhead than benefit. For such an application it might be better to keep the connection open for the whole runtime.

From the RDBMS point of view, it makes no difference: in both cases the connections are seen as open.

If performance is a key requirement, it is better to switch to the Hikari connection pool (HikariCP). If you are using Spring Boot, Hikari is the default connection pool and you do not need to add any dependency. The beautiful thing about the Hikari connection pool is that its entire lifecycle is managed and you do not have to do anything. It is also always recommended to close the connection so that it returns to the pool and other threads can use it, especially in multi-tenant environments. The best way to do this is try-with-resources, which ensures the connection is always closed:

    try (Connection con = dataSource.getConnection()) {
        // your code here
    } // con.close() is called here and hands the connection back to the pool
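Why closing a pooled connection is cheap can be shown with a small stand-in. This is a sketch, not how HikariCP or any particular pool is implemented: `PhysicalConnection`, `PooledHandle` and `CloseDemo` are hypothetical names, and a real pool would wrap `java.sql.Connection` instead of the dummy class used here.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Stand-in for a physical connection; a real pool would wrap java.sql.Connection. */
final class PhysicalConnection {
    boolean open = true;
}

/** Handle given to callers; close() gives the connection back instead of closing it. */
final class PooledHandle implements AutoCloseable {
    private final PhysicalConnection target;
    private final Deque<PhysicalConnection> idle;

    PooledHandle(PhysicalConnection target, Deque<PhysicalConnection> idle) {
        this.target = target;
        this.idle = idle;
    }

    PhysicalConnection target() {
        return target;
    }

    @Override
    public void close() {
        idle.push(target); // back to the pool; the physical connection stays open
    }
}

final class CloseDemo {
    /** Borrows twice in a row and reports whether the same open connection was reused. */
    static boolean reusesSameConnection() {
        Deque<PhysicalConnection> idle = new ArrayDeque<>();
        idle.push(new PhysicalConnection());
        PhysicalConnection first = null;
        try (PooledHandle h = new PooledHandle(idle.pop(), idle)) {
            first = h.target();
        } // close() runs here, even if the body throws
        try (PooledHandle h = new PooledHandle(idle.pop(), idle)) {
            return h.target() == first && first.open;
        }
    }
}
```

The second borrow gets the very same, still open, physical connection, which is why "open" and "close" against a pool cost far less than against the database itself.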

To create your data source, pass the driver, URL and credentials to a builder, for example with Spring Boot's DataSourceBuilder:

    DataSource dataSource =  DataSourceBuilder.create()
            .driverClassName(JDBC_DRIVER)
            .url(url)
            .username(username)
            .password(password)
            .build();

Link: https://github.com/brettwooldridge/HikariCP

If you want to know the answer in your case, just write two implementations (one with a pool, one without) and benchmark the difference.

Exactly how costly it is depends on so many factors that it is hard to tell without measuring.

But in general, a pool will be more efficient.
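A rough way to run such a comparison is a small timing harness like the one below. It is a minimal sketch: `AcquireBenchmark` and `averageNanos` are made-up names, and for serious measurements a dedicated benchmarking tool will give more trustworthy numbers than a hand-written loop.

```java
final class AcquireBenchmark {
    /** Average nanoseconds per call of acquireAndRelease over the measured iterations. */
    static double averageNanos(Runnable acquireAndRelease, int warmup, int iterations) {
        for (int i = 0; i < warmup; i++) {
            acquireAndRelease.run(); // let the JIT (and the pool, if any) warm up first
        }
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            acquireAndRelease.run();
        }
        return (System.nanoTime() - start) / (double) iterations;
    }
}
```

Call it once with a `Runnable` that opens and closes a connection directly through the driver, and once with a `Runnable` that gets and closes a connection from the pool's `DataSource`, then compare the two averages on your own hardware and network.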

How costly it is always depends on the impact.

Consider the following environment.

A web application in which each UI transaction (a user click) occupies a thread on the web server, and that thread is coupled to one connection/thread on the database. Consider these loads:

  • 10 connections per 60,000 ms (1 min), i.e. about 0.167 connections/s
  • 10 connections per 1,000 ms (1 s) => 10 connections/s
  • 10 connections per 100 ms (0.1 s) => 100 connections/s
  • 10 connections per 10 ms (0.01 s) => 1,000 connections/s

I have worked in even bigger environments. Believe me, the further you exceed 100 conn/s by factors of 10, the more pain you will feel without a clean connection pool. The more connections you create per second, the higher the latency and the higher the impact on the database. You also eat more bandwidth by building, over and over, a new "water pipeline" just to send a few drops of water from one side to the other.

Coming back to the question: fetching an existing connection from a connection pool takes microseconds or a few milliseconds, so it has no real impact at all. Creating a new connection with a network in between will probably take on the order of 10¹ to 10² ms. Consider also the impact on your web server: each user blocks a thread, memory and a network connection, so this adds to the web server load as well. In high-load environments you typically run into thread pool issues on the web server (e.g. a reverse-proxy Apache + Tomcat, or Tomcat alone) if the connections get exhausted or take too long (10¹ to 10² ms) to create.
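The arithmetic behind these figures can be written down explicitly. The sketch below multiplies the connection rates listed above by an assumed setup time of 10 ms and 100 ms per new connection (the rough range mentioned above); `SetupCost` and its methods are illustrative names, and the setup times are assumptions, not measurements.

```java
final class SetupCost {
    /** Connections opened per second, given a count per window in milliseconds. */
    static double perSecond(int connections, long windowMillis) {
        return connections * 1000.0 / windowMillis;
    }

    /** Milliseconds spent opening connections per second of wall-clock time. */
    static double setupMillisPerSecond(double connectionsPerSecond, double setupMillis) {
        return connectionsPerSecond * setupMillis;
    }

    public static void main(String[] args) {
        long[] windows = {60_000, 1_000, 100, 10};
        for (long w : windows) {
            double rate = perSecond(10, w);
            System.out.printf("%8.3f conn/s -> %10.1f ms/s at 10 ms, %10.1f ms/s at 100 ms%n",
                    rate, setupMillisPerSecond(rate, 10), setupMillisPerSecond(rate, 100));
        }
    }
}
```

At 100 connections/s and 100 ms per setup, that is 10,000 ms of connection setup for every second of wall-clock time, so roughly ten threads would be busy doing nothing but opening connections. With a pool, that work is done once and then reused.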

Now consider the database as well.

If you have an open connection, it is typically mapped to a thread on the DB. The DB can then use thread-based caches for prepared statements and reuse precomputed access plans, which makes data access very fast. You may lose this benefit if you have to recreate the connection over and over again.

But as said, at up to 10 connections per second you should not face any major issue without a connection pool, apart from the additional initial delay when accessing the DB. At higher rates, you have to manage resources better and avoid useless I/O delays such as recreating the connection.

Hints from experience:

It does not cost you anything to use a connection pool. In all my previous performance tuning projects, issues with the connection pool turned out to be a matter of bad configuration.

You can configure:

  • a connection check (use a real SQL statement that accesses a real DB field), so that the connection is checked on every new access and, if defective, kicked out of the connection pool
  • a lifetime for connections, so that you get a new connection after a defined time

=> Together, these settings ensure that even if your admins do something careless without informing you (such as killing connections/threads on the DB), the pool is quickly rebuilt and the impact stays very low. Read the docs of your connection pool.
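What these two settings do can be sketched with a toy pool that validates on borrow and retires old entries. This only illustrates the idea: `ValidatingPool`, `Lease` and the explicit `nowMillis` parameter are made up for the example. Real pools expose the same behaviour as configuration, for example `maxLifetime` in HikariCP or `testOnBorrow` with a `validationQuery` in the Tomcat JDBC pool.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Predicate;
import java.util.function.Supplier;

/** Toy pool that checks resources on borrow and retires those older than maxAgeMillis. */
final class ValidatingPool<T> {
    /** A resource plus its creation time, so its age survives borrow/return cycles. */
    static final class Lease<T> {
        final T resource;
        final long createdAtMillis;

        Lease(T resource, long createdAtMillis) {
            this.resource = resource;
            this.createdAtMillis = createdAtMillis;
        }
    }

    private final Deque<Lease<T>> idle = new ArrayDeque<>();
    private final Supplier<T> factory;
    private final Predicate<T> validator; // stands in for running a cheap test query
    private final long maxAgeMillis;

    ValidatingPool(Supplier<T> factory, Predicate<T> validator, long maxAgeMillis) {
        this.factory = factory;
        this.validator = validator;
        this.maxAgeMillis = maxAgeMillis;
    }

    /** nowMillis is passed in to keep the sketch testable; real code would read a clock. */
    synchronized Lease<T> borrow(long nowMillis) {
        while (!idle.isEmpty()) {
            Lease<T> lease = idle.pop();
            boolean young = nowMillis - lease.createdAtMillis < maxAgeMillis;
            if (young && validator.test(lease.resource)) {
                return lease;
            }
            // Expired or failed validation: drop it and look at the next idle entry.
        }
        return new Lease<>(factory.get(), nowMillis);
    }

    synchronized void giveBack(Lease<T> lease) {
        idle.push(lease);
    }
}
```

A connection killed on the DB side fails the check on the next borrow and is silently replaced, so callers only ever see working connections.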

Is one connection pool better than another?

A clear no. It only starts to matter at the high end, in distributed environments/clusters, or in cloud-based environments. If you already have a connection pool and it is still maintained, stick with it and become a pro at its settings.
