Non-blocking vs blocking Java server with JDBC calls
Our gRPC service needs to handle 1000 QPS, and each request requires a series of sequential operations, including reading data from the DB using JDBC. Handling a single request takes at most 50 ms.
Our application can be written in two ways:
My understanding is that the non-blocking approach has these pros and cons:
Question 1: Even though non-blocking applications seem to be the new cool thing, my understanding is that for an application that isn't memory-bound and where creating more threads isn't a problem, it's not clear that writing a non-blocking application is actually more CPU-efficient than writing a blocking one. Is there any reason to believe otherwise?
Question 2: My understanding is also that JDBC connections are actually blocking, so even if we make the rest of our application non-blocking, we lose all the benefit because of the JDBC client; in that case, is Option 1 most likely better?
For question 1, you are correct -- non-blocking is not inherently better (and with the arrival of Virtual Threads, it's about to compare even less favorably to good old thread-per-request). At best, you could look at the tools you are working with and do some performance testing with a small-scale example. But frankly, that is down to the tool, not the strategy (at least, until Virtual Threads get here).
For question 2, I would strongly encourage you to choose the solution that works best with your tool/framework. Staying within your ecosystem will allow you to make more flexible moves when the time comes to optimize.
But all things equal, I would strongly encourage you to stick with thread-per-request, since you are working with Java. Even ignoring Virtual Threads, thread-per-request allows you to work with and manage simple, blocking, synchronous code. You don't have to deal with callbacks or trace the logic through confusing and piecemeal logs. Simply make a thread per request, let it block where it blocks, and let your scheduler decide which thread should have a CPU core at any given time.
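As a minimal sketch of that thread-per-request style (all names here are hypothetical, and `Thread.sleep` stands in for the blocking JDBC read):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ThreadPerRequest {
    // Hypothetical blocking handler: runs the sequential steps of one request,
    // including the JDBC read. The sleep stands in for the blocking I/O.
    static String handle(int requestId) throws InterruptedException {
        Thread.sleep(5); // blocking DB call; the thread simply waits here
        return "result-" + requestId;
    }

    public static void main(String[] args) throws Exception {
        // One thread per in-flight request; the OS scheduler shares the cores.
        ExecutorService pool = Executors.newCachedThreadPool();
        List<Future<String>> futures = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            final int id = i;
            futures.add(pool.submit(() -> handle(id)));
        }
        for (Future<String> f : futures) {
            f.get(); // block until each request completes
        }
        pool.shutdown();
        System.out.println("handled " + futures.size() + " requests");
    }
}
```

The request logic reads top to bottom with no callbacks; blocking is handled by the scheduler, not by the application code. On Java 21+, swapping in `Executors.newVirtualThreadPerTaskExecutor()` keeps this exact code shape while making the threads cheap.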
Pros: Saves some overhead on the OS since it doesn't need to give CPU time to threads waiting for IO.
It's not just the CPU time for waiting threads, but also the overhead of switching between threads competing for the CPU. As you have more threads, more of them will be in a runnable state, and the CPU time must be spread between them. This switching also requires a lot of memory management.
Cons: For a large application (where each task subscribes a callback to the previous task), it requires splitting a single request across multiple threads, creating a different kind of overhead. And if the same request gets executed on multiple physical cores, it potentially adds overhead because the data might not be available in the core's L1/L2 cache.
This also happens with the "classic" approach, since blocking calls will cause the CPU to switch to a different thread, and, as stated before, the CPU will even have to switch between runnable threads to share the CPU time as their number increases.
Question 1: […] for an application that isn't memory-bound and where creating more threads isn't a problem
In the current state of Java, creating more threads is always going to become a problem at some point. With the thread-per-request model, it depends on how many requests you have in parallel. 1000? Probably OK. 10000? Maybe not.
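For the numbers in the question, a back-of-envelope estimate using Little's law (average concurrency = arrival rate × latency) suggests the thread count stays modest; this small sketch just does that arithmetic:

```java
public class LittlesLaw {
    public static void main(String[] args) {
        double qps = 1000.0;           // arrival rate from the question
        double latencySeconds = 0.050; // worst-case per-request latency (50 ms)

        // Little's law: L = lambda * W
        double concurrent = qps * latencySeconds;

        System.out.println("avg concurrent requests ~ " + (int) concurrent);
    }
}
```

So at 1000 QPS with 50 ms per request, roughly 50 requests are in flight at once, which is well within what a thread-per-request server handles comfortably. Spikes in traffic or latency raise this number, which is where the scalability concern comes in.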
it's not clear that writing a non-blocking application is actually more CPU-efficient than writing a blocking application. Is there any reason to believe otherwise?
It is not just a question of efficiency, but also of scalability. For the performance itself, this would require proper load testing. You may also want to check Is non-blocking I/O really faster than multi-threaded blocking I/O? How?
Question 2: My understanding is also that JDBC connections are actually blocking, so even if we make the rest of our application non-blocking, we lose all the benefit because of the JDBC client; in that case, is Option 1 most likely better?
JDBC is indeed a synchronous API. Oracle was working on ADBA as an asynchronous equivalent, but discontinued it, considering that Project Loom would make it irrelevant. R2DBC provides an alternative which supports MySQL. Spring even supports reactive transactions.