
How does Elasticsearch handle the priority of different tasks?

Say an Elasticsearch server receives 100 tasks in a very short period of time. Some tasks are short and some are time-consuming; some are deletions, some are insertions, and some are search queries. How does Elasticsearch decide which to run first and how many tasks to execute concurrently?

Is there a task execution strategy on the Elasticsearch side, or does it just process all tasks in a FIFO queue and allow some fixed number of tasks to run at the same time?

I wish ES had some features for optimizing task order. Otherwise, we have to manually check the status of the tasks, set timeouts, and do retries, which is somewhat inconvenient.

Great question. As there is not enough documentation about task execution priority, we can look at the Elasticsearch source code to understand how it works.

First of all, Elasticsearch maintains different thread pools to execute different types of tasks, as explained in the official documentation.

From that documentation, the following things are clear:

  1. Elasticsearch has different thread pools and queues (with different capacities) to execute different types of tasks, such as admin tasks, search tasks, index tasks, etc.
  2. Having different thread pools lets Elasticsearch execute tasks in parallel and avoid starvation, and it can help with scheduling/prioritising the tasks.
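
To make the "fixed number of threads plus a bounded queue" idea concrete, here is a minimal sketch in plain Java. It is not Elasticsearch code: the class name `BoundedPools`, the sizes, and the helper methods are all made up for illustration. It only shows the general pattern of a fixed-size pool whose FIFO queue has a capacity limit, so that work arriving beyond that limit is rejected rather than queued forever.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Illustrative only: the names and sizes here are assumptions, not Elasticsearch settings.
class BoundedPools {

    // A fixed number of threads plus a bounded FIFO queue; extra work is rejected.
    static ThreadPoolExecutor fixedPool(int threads, int queueCapacity) {
        return new ThreadPoolExecutor(
                threads, threads, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueCapacity),
                new ThreadPoolExecutor.AbortPolicy());
    }

    // Submits `tasks` blocking tasks to one pool and returns how many were rejected.
    static int countRejections(int threads, int queueCapacity, int tasks) {
        ThreadPoolExecutor pool = fixedPool(threads, queueCapacity);
        CountDownLatch release = new CountDownLatch(1);
        int rejected = 0;
        try {
            for (int i = 0; i < tasks; i++) {
                try {
                    pool.execute(() -> {
                        try {
                            release.await(); // hold the thread so the queue fills up
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    });
                } catch (RejectedExecutionException e) {
                    rejected++; // pool threads busy and queue full
                }
            }
        } finally {
            release.countDown(); // let the accepted tasks finish
            pool.shutdown();
        }
        return rejected;
    }
}
```

With 2 threads and a queue capacity of 3, submitting 10 long-running tasks at once means 2 run, 3 wait in the queue, and the remaining 5 are rejected. Creating one such pool per task type (say, one for searches and one for writes) is what keeps a flood of one kind of request from starving the other.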

Now, coming to scheduling/prioritising/optimising the execution of tasks, which is not very well explained in the documentation: I looked into the Elasticsearch source code and found the Priority Java class, which is used in multiple places in the Elasticsearch code to define the priority of a task. For example, the FrozenCacheService code uses the LOW priority, as updating this cache is not much of a priority. On the other hand, slowClusterStateProcessing uses the highest priority in Elasticsearch, called IMMEDIATE.

You can also see that this Priority enum is used in PrioritizedEsThreadPoolExecutor, a thread pool executor that orders its queued tasks by priority and sits alongside the thread pools described at the beginning of my post.
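
As a rough illustration of how a prioritized executor can work, here is a minimal sketch built on the JDK's `PriorityBlockingQueue`. It is not the actual PrioritizedEsThreadPoolExecutor implementation: `SketchPriority` only borrows the constant names of the Priority enum, and `PrioritizedTask` and `PrioritizedDemo` are hypothetical names introduced for this example. Tasks with the same priority keep their submission order through a sequence number.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Borrows the constant names of Elasticsearch's Priority enum; everything else is hypothetical.
enum SketchPriority { IMMEDIATE, URGENT, HIGH, NORMAL, LOW, LANGUID }

class PrioritizedTask implements Runnable, Comparable<PrioritizedTask> {
    private static final AtomicLong SEQUENCE = new AtomicLong();

    final SketchPriority priority;
    final long insertionOrder = SEQUENCE.getAndIncrement();
    final Runnable body;

    PrioritizedTask(SketchPriority priority, Runnable body) {
        this.priority = priority;
        this.body = body;
    }

    @Override
    public void run() {
        body.run();
    }

    // Earlier enum constants sort first; ties fall back to submission order (FIFO).
    @Override
    public int compareTo(PrioritizedTask other) {
        int byPriority = priority.compareTo(other.priority);
        return byPriority != 0 ? byPriority : Long.compare(insertionOrder, other.insertionOrder);
    }
}

class PrioritizedDemo {
    // Queues four tasks behind a blocker and returns the order in which they ran.
    static List<String> runInPriorityOrder() {
        ThreadPoolExecutor executor = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new PriorityBlockingQueue<>());
        List<String> order = Collections.synchronizedList(new ArrayList<>());
        CountDownLatch gate = new CountDownLatch(1);

        // Occupy the single worker so the following tasks have to wait in the queue.
        executor.execute(new PrioritizedTask(SketchPriority.NORMAL, () -> {
            try {
                gate.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }));
        // Use execute(), not submit(): submit() wraps tasks in a non-comparable FutureTask.
        executor.execute(new PrioritizedTask(SketchPriority.LOW, () -> order.add("low")));
        executor.execute(new PrioritizedTask(SketchPriority.NORMAL, () -> order.add("normal-1")));
        executor.execute(new PrioritizedTask(SketchPriority.IMMEDIATE, () -> order.add("immediate")));
        executor.execute(new PrioritizedTask(SketchPriority.NORMAL, () -> order.add("normal-2")));

        gate.countDown();
        executor.shutdown();
        try {
            executor.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new ArrayList<>(order);
    }
}
```

Although the LOW task was submitted first, the queued tasks run as immediate, normal-1, normal-2, low. A single worker thread is used here only to make the ordering easy to observe.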

In short, Elasticsearch does order tasks based on the type of task in order to optimize their execution.

EDIT: An issue related to prioritising search queries: https://github.com/elastic/elasticsearch/issues/37867, and some work in that direction: https://github.com/elastic/elasticsearch/pull/57936

