
Elasticsearch Scroll API issue

I am trying to use the scroll API in Python, and I am having trouble getting it to loop through my whole dataset.

I get only about 100 results back when there should be over 150k of them (I can see them in Kibana).

My code is below:

res = helpers.scan(
    client=es,
    scroll='2m',
    query={
        "size": 10000,
        "query": {
            "match": {
                "type": {"query": "IP_Address"}
            }
        }
    },
    index="logstash-*",
)

# function to return hits from the elasticsearch query in res

def get_es_json(es_scan):
    for hits in es_scan:
        return hits

# iterate through results with defined number of results

def return_es_results(es_json_data, num_results):
    for i in range(num_results):
        data = get_es_json(es_json_data)
        print(data['_source']['geoip']['asn'])

return_es_results(res, 100)

Is it because your call is "return_es_results(res, 100)"? Note the 100 in that call.

The loop runs only 100 times, so you are asking for only 100 results!

You may want to paginate instead. If you use Django, there is documentation about pagination here: https://docs.djangoproject.com/en/2.2/topics/pagination/
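To make the point about the hard-coded 100 concrete, here is a minimal sketch of iterating over the whole generator instead of a fixed count. It assumes `res` behaves like the generator returned by `helpers.scan` in the question; the helper name `iter_asn`, its `limit` parameter, and the `fake_hits` stand-in data are made up for illustration so the snippet runs without a cluster.

```python
from itertools import islice


def iter_asn(hits, limit=None):
    """Yield the geoip ASN from each hit.

    `hits` is any iterable of hit dicts (for example the generator from
    helpers.scan). `limit` caps how many hits are read; None reads them all.
    """
    for hit in islice(hits, limit):
        # .get() guards against documents that have no geoip block
        asn = hit.get("_source", {}).get("geoip", {}).get("asn")
        if asn is not None:
            yield asn


# Hypothetical stand-in for the scan generator, so this runs offline.
fake_hits = ({"_source": {"geoip": {"asn": n}}} for n in range(250))

all_asns = list(iter_asn(fake_hits))  # no limit: consumes every hit
print(len(all_asns))  # prints 250
```

With the real query, something like `for asn in iter_asn(res): print(asn)` should walk every hit the scroll returns rather than stopping at 100.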

