
Batching generation of HTTP responses

I'm trying to find an architecture for the following scenario. I'm building a REST service that performs some computation which can be batched very efficiently. Let's say that computing 1 "item" takes 50ms, while computing 100 "items" takes 60ms.

However, the nature of the clients is that each one only needs 1 item processed at a time. So if I have 100 simultaneous clients, and I write the typical request handler that takes one item and generates a response, I'll end up spending 5000ms, even though I know I could compute the same results in 60ms.

I'm trying to find an architecture that works well in this scenario. That is, I would like something that merges data from many independent requests, processes the batch, and generates the equivalent response for each individual client.
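For a single-process async service, the desired pattern can be sketched with asyncio: each request registers itself, the first arrival opens a short batching window, and one batch computation resolves every caller's future individually. This is a toy sketch, not part of the original question; `compute_batch`, the doubling computation, and the 10ms window are stand-in assumptions.

```python
import asyncio

# Hypothetical batch function: computing N items together is far cheaper
# than N separate calls (e.g. 60ms for 100 items vs 50ms each).
async def compute_batch(items):
    await asyncio.sleep(0.06)          # stand-in for the real batch computation
    return [item * 2 for item in items]

class MicroBatcher:
    """Collect single-item requests for a short window, then run one batch."""

    def __init__(self, window=0.01):
        self.window = window           # how long to wait for more requests
        self.pending = []              # list of (item, future) pairs
        self.flusher = None            # task that will flush the current batch

    async def compute(self, item):
        loop = asyncio.get_running_loop()
        fut = loop.create_future()
        self.pending.append((item, fut))
        if self.flusher is None:       # first request opens the window
            self.flusher = asyncio.create_task(self._flush())
        return await fut               # each caller awaits only its own result

    async def _flush(self):
        await asyncio.sleep(self.window)
        batch, self.pending, self.flusher = self.pending, [], None
        results = await compute_batch([item for item, _ in batch])
        for (_, fut), result in zip(batch, results):
            fut.set_result(result)     # wake the individual waiter

async def main():
    batcher = MicroBatcher()
    # 100 "clients" each asking for one item; all served by a single batch.
    results = await asyncio.gather(*(batcher.compute(i) for i in range(100)))
    print(results[:5])

asyncio.run(main())
```

All 100 concurrent calls here finish after roughly one window plus one batch (~70ms) instead of 100 sequential computations. With Django this would need an async-capable deployment; across multiple worker processes you would need external coordination, as the answer below describes.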

If you're curious, the service in question is based on Python + Django + DRF, but I'm curious about what kind of architectural solutions/patterns apply here and whether anything that solves this already exists.

At first you might think of a reverse proxy that detects all the pattern-specific queries, collects them, and sends them to your application over an HTTP/1.1 pipeline (pipelining is a way to send a large number of queries one after another and receive all the HTTP responses, in the same order, at the end, without waiting for a response after each query).

But:

  1. Pipelining is very hard to do well.
  2. You would have to write the reverse proxy yourself, as I do not know of an existing way to do this.
  3. One slow response in the pipeline blocks all the responses behind it.
  4. You need an HTTP server able to hand several queries at once to your application, which never happens unless the HTTP server is coded directly into your application, because HTTP servers are usually built to work on one query at a time (e.g. in a PHP environment you never receive 2 queries together: you receive the first one, send the response, and then receive the next one, even if the connection contains 2 queries).

So the better idea would be to do this on the application side. You could identify matching queries and wait a small amount of time (10ms?) to see whether other matching queries arrive. You will need a way for several parallel workers to communicate here (say you have 50 application workers and 10 of them have received queries that could be handled in the same batch). This communication channel could be a (very fast) database or some shared memory, depending on the technology used.

Then, when too much time has been spent waiting (10ms?) or when a large number of queries has been received, one of the workers collects all the queries, runs the batch, and tells every other worker that a result is ready (here again you need a central point of communication, such as LISTEN/NOTIFY in PostgreSQL, shared memory, a message queue service, etc.).

Finally, every worker is responsible for sending the right HTTP response.
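As a rough sketch of the scheme above, here is a toy version using threads in one process to stand in for the application workers, with a `threading.Condition` as the central point of communication (in a real multi-process deployment this would be shared memory, a fast database, or LISTEN/NOTIFY; `compute_batch`, the doubling computation, and the 10ms window are assumptions for illustration):

```python
import threading
import time

lock = threading.Condition()
pending = {}            # request_id -> item, the batch being assembled
results = {}            # request_id -> computed result
leader_running = False  # is some worker currently collecting a batch?

def compute_batch(items):
    time.sleep(0.06)    # stand-in for the real batch computation
    return {rid: item * 2 for rid, item in items.items()}

def handle_request(request_id, item, window=0.01):
    """Each worker calls this; one of them runs the batch for everyone."""
    global leader_running
    with lock:
        pending[request_id] = item
        if not leader_running:
            # First arrival becomes the batch leader: hold the window
            # open so other workers can add their queries.
            leader_running = True
            lock.wait(timeout=window)
            batch = dict(pending)
            pending.clear()
            results.update(compute_batch(batch))
            leader_running = False
            lock.notify_all()          # tell every other worker a result is there
        else:
            # Followers just wait until the leader has published results.
            while request_id not in results:
                lock.wait()
        # Every worker sends its own response, based on its own result.
        return results.pop(request_id)

if __name__ == "__main__":
    out = {}
    threads = [threading.Thread(target=lambda i=i: out.__setitem__(i, handle_request(i, i)))
               for i in range(10)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(out == {i: i * 2 for i in range(10)})
```

Here 10 concurrent requests are served by a single call to `compute_batch`, and each worker returns only its own client's result. A late arrival that misses the window simply becomes the leader of the next batch.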

The key here is having a system where the time you lose trying to share request handling matters less than the time saved by batching several queries together, and under low traffic this overhead should stay reasonable (since in that case you will always lose time waiting for nothing). And of course you are also adding complexity to the system, making it harder to maintain, etc.
