
Sort a Huge Number of Documents in Elasticsearch

When I want to retrieve a huge number of documents from an Elasticsearch index, I always use Elasticsearch's scan-and-scroll technique ( http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/scan-scroll.html ), as follows:

from elasticsearch import Elasticsearch

conn = Elasticsearch(hosts=HOSTS)

# Goal: retrieve the documents sorted by their 'created_at' date
the_query = {
    'query': {'match_all': {}},
    'sort': {'created_at': {'order': 'asc'}},
}

# With search_type='scan' this first call returns no hits, only a scroll id
scanResp = conn.search(index=TARGET_INDEX, doc_type=TARGET_DOC_TYPE, body=the_query, search_type='scan', scroll='10m')
scrollId = scanResp['_scroll_id']
doc_num = 1

response = conn.scroll(scroll_id=scrollId, scroll='10m')

while len(response['hits']['hits']) > 0:
    for item in response['hits']['hits']:
        print('\tDocument ' + str(doc_num) + ' of ' + str(response['hits']['total']))
        doc_num += 1

        # ====================
        #   Process the item
        # ====================
        the_doc = item['_source']
    # end for item

    scrollId = response['_scroll_id']
    # doc_num starts at 1, so it exceeds the total only after every document was processed
    if doc_num > response['hits']['total']:
        break
    response = conn.scroll(scroll_id=scrollId, scroll='10m')
# end of while

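However, as the Elasticsearch documentation mentions, the retrieved documents are not sorted, so the results are not what I want.

My question: how can I sort a huge number of documents in Elasticsearch?

Thanks :)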

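Scrolling through a sorted result set is very expensive, but if you insist on doing it, remove the "scan" search_type from your query. Scan disables sorting when scrolling.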
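As a minimal sketch of that suggestion: the loop from the question can be written as a plain scrolled search with no search_type='scan', so the 'sort' clause is honored. The helper name iter_sorted_docs and its batch_size parameter are made up for illustration, and the client is assumed to expose search(), scroll() and clear_scroll() with the same keyword arguments as the elasticsearch-py client used in the question. Without scan, the first search() call already returns the first batch of hits, so the loop starts from that response.

```python
def iter_sorted_docs(client, index, query, scroll="10m", batch_size=500):
    """Yield hits from `index` in the order requested by query['sort'].

    Hypothetical helper: `client` is assumed to behave like the
    elasticsearch-py client from the question.
    """
    # Copy the query so the caller's dict is left untouched.
    body = dict(query, size=batch_size)
    # No search_type='scan' here: a plain scrolled search keeps the sort.
    response = client.search(index=index, body=body, scroll=scroll)
    scroll_id = response["_scroll_id"]
    try:
        # Unlike scan, the first response already contains hits.
        while response["hits"]["hits"]:
            for hit in response["hits"]["hits"]:
                yield hit
            response = client.scroll(scroll_id=scroll_id, scroll=scroll)
            scroll_id = response["_scroll_id"]
    finally:
        # Release the server-side scroll context once iteration ends.
        client.clear_scroll(scroll_id=scroll_id)


# Example use (assumes the question's HOSTS, TARGET_INDEX and the_query):
#   from elasticsearch import Elasticsearch
#   conn = Elasticsearch(hosts=HOSTS)
#   for item in iter_sorted_docs(conn, TARGET_INDEX, the_query):
#       the_doc = item['_source']
```

Every batch has to be sorted across all shards, which is where the extra cost mentioned above comes from, so a moderate batch_size and a scroll timeout no longer than needed are reasonable starting points.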
