
ElasticSearch Nest 2.x - Performance issue, how to disable audit trail?

I'm running search queries with the Elasticsearch client NEST 2. The queries run fine, but when I inspect the response I can see that most of the time is spent auditing the query, while the ES operation itself finishes in a snap.

Here is an example of a request/response:

Succesful low level call on POST: /document/ElasticDocument/_search

Audit trail of this API call: - HealthyResponse: Node: http://my-ES-server.com:9200/ Took: 00:00:00.3040912
Request: {"from":0,"size":1,"query":{"term":{"Id":{"value":1568}}}}
Response: { "took":16 ,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":1,"max_score":9.345218,"hits":[{"_index":"document","_type":"ElasticDocument","_id":"1568","_score":9.345218,"_source":{........}}]}}

We can see that the audit took 304 ms while the ES search took only 16 ms.

My question is: is there any way to disable this audit trail, or to tweak the configuration so that performance actually improves?

I had a look at the source code and found out that the audit trail operation is triggered by the ElasticsearchResponse's property DebugInformation but couldn't find how to disable it.

My configuration is pretty straightforward:

var node = new Uri("http://my-ES-server.com:9200/");

var settings = new ConnectionSettings(node);

settings.DefaultIndex("document");
settings.DefaultTypeNameInferrer(p => p.Name);
settings.DefaultFieldNameInferrer(p => p);

settings.DisableDirectStreaming(); 

this.elasticClient = new ElasticClient(settings);

And then my NEST call:

var response = this.elasticClient.Search<ElasticDocument>(s => s
    .Query(q => q.Term("Id", documentId))
    .From(0)
    .Take(1)
);

For information, when I run the queries against a local ES store (populated with the same data), the audit trail takes ~60 ms, which is better but still huge compared to the ES search operation.

Many thanks,

Mickael

Audit trail information is built when the .DebugInformation property is accessed.

took in the response is the time taken on the cluster for Elasticsearch to execute the search. It doesn't include:

  1. time to serialize the request
  2. network latency going to the selected Elasticsearch node
  3. network latency coming back from the selected Elasticsearch node
  4. time to deserialize the response

In contrast, the Took time that you see in the audit trail is a rough total time (rough in the sense that it is calculated with an IDateTimeProvider implementation that uses DateTime precision by default) covering:

  1. getting a node from the IConnectionPool
  2. serializing the request
  3. sending the request to the node
  4. time taken on the Elasticsearch node
  5. receiving the response back from the node
  6. deserializing the response
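To see how much of the total is client-side and network overhead, one option is to time the whole call yourself and compare it with took. The following is only a minimal sketch: CallTimer and Measure are hypothetical names introduced here, not part of NEST, and the commented usage assumes the elasticClient, ElasticDocument and documentId from the question.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper (not part of NEST): runs a call once and reports
// how long it took end to end, as seen by the caller.
public static class CallTimer
{
    public static Tuple<T, TimeSpan> Measure<T>(Func<T> call)
    {
        var stopwatch = Stopwatch.StartNew();
        var result = call();   // serialization, network and deserialization all happen in here
        stopwatch.Stop();
        return Tuple.Create(result, stopwatch.Elapsed);
    }
}

// Usage with the client from the question (assumed names):
//
// var timed = CallTimer.Measure(() => elasticClient.Search<ElasticDocument>(s => s
//     .Query(q => q.Term("Id", documentId))
//     .From(0)
//     .Take(1)));
//
// // timed.Item1.Took is the server-side time in milliseconds;
// // timed.Item2 is the full round trip measured on the client.
// Console.WriteLine("ES took {0} ms, client saw {1} ms",
//     timed.Item1.Took, timed.Item2.TotalMilliseconds);
```

Running this a few times in a loop should also show whether the first request is noticeably slower than the following ones.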

Because of differences in network latency, the number reported in the audit trail can vary between environments. Further, it's likely to be higher on the first request because of the initialization of caches used for member access by the JSON serializer.

As an aside, looking at your configuration: unless you need to access the request or response body bytes, do not call .DisableDirectStreaming(); it copies the request and response to a MemoryStream internally in order to make the bytes available.
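As a sketch, the configuration from the question without that call would look like the following. The per-request override in the trailing comment is from memory of the NEST 2.x API, so treat it as an assumption and check it against the version you use.

```csharp
var node = new Uri("http://my-ES-server.com:9200/");

var settings = new ConnectionSettings(node);

settings.DefaultIndex("document");
settings.DefaultTypeNameInferrer(p => p.Name);
settings.DefaultFieldNameInferrer(p => p);

// No settings.DisableDirectStreaming() here, so request and response bodies
// are not buffered into a MemoryStream on every call.

this.elasticClient = new ElasticClient(settings);

// Assumption: if the raw bytes are needed for one particular call only,
// buffering can be enabled for just that request:
//
// var response = this.elasticClient.Search<ElasticDocument>(s => s
//     .RequestConfiguration(r => r.DisableDirectStreaming())
//     .Query(q => q.Term("Id", documentId)));
```

The trade-off is that, with direct streaming left on, the request and response bodies will no longer appear in the debug information.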

I had a similar problem, and it was not about the audit trail. The point is that NEST sends the Accept: application/json header by default. When you set EnableHttpCompression() on your client, NEST also starts sending the Accept-Encoding: gzip, deflate header, which is correct.

This is my client initialization code:

var node = new Uri("http://localhost.:9200");
var settings = new ConnectionSettings(node)
    .EnableHttpCompression()
    .PrettyJson();
var client = new ElasticClient(settings);
