
Threading in Java REST web service

I have a REST web service which is consuming too many CPU resources when the number of requests gets too high.

This is believed to be caused by a while() loop in the response generation, which normally takes a few milliseconds to complete but can under some circumstances take a few seconds.

According to this, the fix appears to be to use wait() and notify(), but I don't understand why this would reduce CPU usage.

Would this new thread be handled outside the web service, thereby freeing it up to handle more requests? Can someone please explain?

Thanks!

Jon.

Edit:

I may have found my own answer here.

It seems that my result = get() call constantly polls until there is a response, which consumes more CPU resources. By placing this in a thread, fewer resources are consumed.
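In case it helps, here is a minimal sketch of roughly what that polling looks like (ResponseHolder and its methods are a simplified stand-in, not my actual class):

    // Hypothetical illustration of the busy-wait described above.
    public class ResponseHolder {
        private volatile String result;          // null until the response arrives

        public String get() {
            return result;
        }

        public void set(String value) {
            result = value;
        }

        // Spins flat out on one core until get() returns non-null.
        public String awaitByPolling() {
            String r;
            while ((r = get()) == null) {
                // no sleep, no wait -- the CPU is fully busy doing nothing useful
            }
            return r;
        }
    }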

Is this a correct understanding?

Is this a correct understanding?

A loop that continually polls something until there is a response is very wasteful. If you add a sleep between each poll, you reduce the CPU usage, but at the cost of reduced responsiveness for individual requests ... compared to what is achievable if you do it the right way.

Without knowing exactly what you are doing (what you are polling, and why) it is a bit difficult to say what the best solution is. But here are a few possible scenarios:

  • If your web service is waiting for a response from an external service, then the simple solution is to just do a blocking read, and configure your web server with more worker threads.

  • On the other hand, if your web service is waiting for a computation to complete, a new thread and wait / notify ... or one of the higher level synchronization classes ... may be the answer (see the sketch after this list).

  • If you need to handle a really large number of these blocking requests in parallel, that is going to require a lot of threads and hence a lot of memory, among other things. In that case you need to consider a web container that breaks the one-thread-per-request constraint. The latest version of the Servlet spec allows this, as do some of the alternative (non-Servlet) architectures.
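As a rough illustration of the second scenario, here is a minimal wait / notify sketch (the class and method names are invented for illustration; in real code a higher-level class such as java.util.concurrent.FutureTask or a BlockingQueue does the same job with less room for error):

    // The request thread blocks in take() without using any CPU;
    // the worker thread computes the value and hands it over via put().
    public class ComputationResult {
        private Object value;

        public synchronized void put(Object v) {
            value = v;
            notifyAll();                     // wake any thread blocked in take()
        }

        public synchronized Object take() throws InterruptedException {
            while (value == null) {
                wait();                      // releases the lock and sleeps until notified
            }
            return value;
        }
    }

The point is that a thread parked inside wait() consumes essentially no CPU, whereas a polling loop keeps a core busy for the entire duration of the wait.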


FOLLOW UP

... I think the issue is your point 2, that the service is simply waiting for the computation. So will simply threading this computation free up resources in the service?

If what you are describing is true, then running the computation in a different thread won't make it go much quicker. In fact, it could make it go slower.

The ultimate bottleneck is going to be CPU capacity, disc bandwidth and / or network bandwidth. Multi-threading is only going to make an individual request go faster if you can effectively / efficiently use 2 or more processors on the same request at the same time. It will only make your throughput better to the extent that it allows requests to run while others are waiting for external events; e.g. network responses to arrive, or file read/write operations to complete.

What I think you actually need to do is to figure out why the computation is taking so long and try and fix that:

  • Are your database queries inefficient?
  • Are you fetching result-sets that are too large?
  • Do you have a problem with your schemas?
  • Poor choice of indexes?
  • Or are you simply trying to do too much on a machine that is too small, using the wrong kind of database?

There are various techniques for measuring performance of an application service and a database to determine where the bottlenecks are. (Start with a Google search ....)

Firstly, stop trying to guess why it's consuming so much CPU. Put some instrumentation in place and find out where the bottleneck really is. Once you know where the bottleneck is, try to understand the root cause. 5 Whys is a good technique to use when doing this.
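As a very basic first step, this is the sort of crude instrumentation meant here (a sketch only; generateResponse() is a hypothetical stand-in for the code containing the while() loop, and a real profiler such as VisualVM will give far better data):

    // Time the suspect call and log when it is unexpectedly slow.
    public String timedGenerateResponse() {
        long start = System.nanoTime();
        String body = generateResponse();                    // hypothetical stand-in
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        if (elapsedMs > 100) {                               // arbitrary threshold for "slow"
            System.err.println("generateResponse took " + elapsedMs + " ms");
        }
        return body;
    }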

Once you know the root cause, determine if it's solvable. E.g. if the algorithm is inefficient and a more efficient algorithm is available, then refactor the code to use the efficient algorithm. Don't forget that the root cause might be that the server is under-powered and needs a bigger CPU.

Secondly, consider using an HTTP cache to reduce the load on your server. How frequently does the requested data change, and how fresh do your responses need to be? Use this to calculate a max-age: the length of time an HTTP cache could keep a response. It might be 1 min, a day, a week, etc. It really depends on what you are serving up.

Once you have a max-age, have a look at the requests coming in. How many are for the same URI (or could be for the same URI if you tweaked it) within the max-age you've picked? If all the request URIs within a max-age period are unique then caching won't help you. If you're getting on average 2 requests for each URI in a max-age period, then an HTTP cache will halve the load on your server. If it's 10 requests per URI, then an HTTP cache will reduce the server load by 90%.
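For example, in JAX-RS a max-age can be attached to a response roughly like this (a sketch only; the 60-second value, the resource path and buildReport() are placeholders, not a recommendation for your service):

    import javax.ws.rs.GET;
    import javax.ws.rs.Path;
    import javax.ws.rs.core.CacheControl;
    import javax.ws.rs.core.Response;

    @Path("/report")
    public class ReportResource {

        @GET
        public Response getReport() {
            CacheControl cc = new CacheControl();
            cc.setMaxAge(60);                        // placeholder: caches may reuse this response for 60 seconds

            return Response.ok(buildReport())        // buildReport() stands in for the real work
                           .cacheControl(cc)
                           .build();
        }

        private String buildReport() {
            return "...";                            // hypothetical payload
        }
    }

With that header in place, a shared cache (or the client) can answer repeat requests for the same URI itself instead of hitting the service again.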
