
How does elasticsearch handle the priority of different tasks?

Say an Elasticsearch server receives 100 tasks in a very short period of time. Some tasks are short, some are time-consuming, some are deletions, and some are insertions or search queries. How does Elasticsearch decide which to run first and how many tasks to execute concurrently?

Is there a task execution strategy on the Elasticsearch side, or does it just process all tasks in a FIFO queue and allow some fixed number of tasks to run at the same time?

I wish ES had some features for optimizing task order. Otherwise, we have to manually check the status of the tasks, set timeouts, and do retries, which is somewhat inconvenient.

Great question. As there is not much documentation about task execution priority, we can look at the Elasticsearch source code to understand how it works.

First of all, Elasticsearch maintains different thread pools to execute different types of tasks, as explained in the official documentation.

From that documentation, the following is clear:

  1. There are separate thread pools and queues (with different capacities) for different types of tasks, such as admin tasks, search tasks, and index tasks.
  2. Separate thread pools let Elasticsearch execute tasks in parallel and keep one type of task from starving another, and they help with scheduling and prioritising tasks.
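To make the two points above concrete, here is a minimal sketch in plain Java of what "a separate pool and bounded queue per task type" means. It is not Elasticsearch code: the class `PerTypePools`, its helper methods, and the pool sizes and queue capacities are all hypothetical and chosen only for illustration. It shows that when the "search" pool and its queue are full, further search tasks are rejected while the "write" pool is unaffected.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration; names and sizes are assumptions, not Elasticsearch's.
class PerTypePools {
    // Fixed-size pool with a bounded queue: once all threads are busy and the
    // queue is full, further tasks are rejected (AbortPolicy throws).
    static ThreadPoolExecutor boundedPool(int threads, int queueCapacity) {
        return new ThreadPoolExecutor(threads, threads, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueCapacity), new ThreadPoolExecutor.AbortPolicy());
    }

    // Submits `count` tasks that block until `release` is opened,
    // and returns how many of them the pool rejected.
    static int submitBlocking(ThreadPoolExecutor pool, int count, CountDownLatch release) {
        int rejected = 0;
        for (int i = 0; i < count; i++) {
            try {
                pool.execute(() -> {
                    try {
                        release.await();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            } catch (RejectedExecutionException e) {
                rejected++;
            }
        }
        return rejected;
    }

    public static void main(String[] args) {
        ThreadPoolExecutor search = boundedPool(2, 3); // 2 threads, queue of 3
        ThreadPoolExecutor write = boundedPool(1, 2);  // 1 thread, queue of 2
        CountDownLatch release = new CountDownLatch(1);

        // 10 search tasks: 2 run, 3 wait in the queue, 5 are rejected.
        System.out.println("search rejected: " + submitBlocking(search, 10, release));
        // The write pool has its own capacity, so both write tasks are accepted.
        System.out.println("write rejected: " + submitBlocking(write, 2, release));

        release.countDown();
        search.shutdown();
        write.shutdown();
    }
}
```

The real pool names, sizes, and queue capacities are listed in the thread pool documentation; the sketch only mirrors the general idea of isolating task types from each other.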

Now, coming to scheduling, prioritising, and optimising the execution of tasks, which is not explained very well in the documentation: I looked into the Elasticsearch source code and found the Priority Java class, which is used in multiple places to define the priority of a task. For example, the FrozenCacheService code uses the LOW priority, as updating this cache is not urgent. On the other hand, slowclusterStateProcessing uses IMMEDIATE, the highest priority in Elasticsearch.

You can also see that this Priority enum is used in PrioritizedEsThreadPoolExecutor, which in turn is used to create the different thread pools described at the beginning of my post.
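As a rough sketch of how a priority-aware executor can work in general, the plain-Java example below runs queued tasks in priority order and falls back to submission order among tasks of equal priority. It is an illustration only, not the Elasticsearch implementation: the class `PrioritySketch`, the nested `Priority` enum (only IMMEDIATE and LOW come from the text above; the levels in between are assumptions), and the task labels are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration; not the Elasticsearch classes of similar name.
class PrioritySketch {
    // Declaration order doubles as execution order: IMMEDIATE runs first.
    enum Priority { IMMEDIATE, HIGH, NORMAL, LOW }

    static final class PrioritizedTask implements Runnable, Comparable<PrioritizedTask> {
        private static final AtomicLong SEQUENCE = new AtomicLong();
        private final Priority priority;
        private final long insertionOrder = SEQUENCE.getAndIncrement();
        private final Runnable body;

        PrioritizedTask(Priority priority, Runnable body) {
            this.priority = priority;
            this.body = body;
        }

        @Override
        public void run() {
            body.run();
        }

        // Higher priority first; FIFO among tasks with the same priority.
        @Override
        public int compareTo(PrioritizedTask other) {
            int byPriority = priority.compareTo(other.priority);
            return byPriority != 0 ? byPriority : Long.compare(insertionOrder, other.insertionOrder);
        }
    }

    // Queues four tasks behind a blocked worker and returns the order they ran in.
    static List<String> runDemo() {
        // One worker thread, so queued tasks run strictly in queue order.
        ThreadPoolExecutor executor = new ThreadPoolExecutor(1, 1, 0L, TimeUnit.MILLISECONDS,
                new PriorityBlockingQueue<Runnable>());
        List<String> order = Collections.synchronizedList(new ArrayList<>());
        CountDownLatch gate = new CountDownLatch(1);
        try {
            // Keep the worker busy so the following tasks pile up in the queue.
            executor.execute(new PrioritizedTask(Priority.NORMAL, () -> {
                try {
                    gate.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }));
            // execute() is used instead of submit(): submit() wraps the task in a
            // FutureTask, which is not Comparable and would break the priority queue.
            executor.execute(new PrioritizedTask(Priority.LOW, () -> order.add("low: cache update")));
            executor.execute(new PrioritizedTask(Priority.NORMAL, () -> order.add("normal: first")));
            executor.execute(new PrioritizedTask(Priority.NORMAL, () -> order.add("normal: second")));
            executor.execute(new PrioritizedTask(Priority.IMMEDIATE, () -> order.add("immediate: cluster state")));
            gate.countDown();
            executor.shutdown();
            executor.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return order;
    }

    public static void main(String[] args) {
        // Prints: [immediate: cluster state, normal: first, normal: second, low: cache update]
        System.out.println(runDemo());
    }
}
```

Note that priority only affects tasks that are still waiting in the queue. A task that is already running is not interrupted when a higher-priority task arrives.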

In short, Elasticsearch does order tasks based on their type to optimize their execution.

EDIT: There is an issue about prioritising search queries, https://github.com/elastic/elasticsearch/issues/37867, and some work in that direction in https://github.com/elastic/elasticsearch/pull/57936
