I am trying to use the scroll API in Python, and I am having trouble getting it to loop through my whole dataset.
I get about 100 results returned when there should be over 150k of them (I can see them in Kibana).
My code is below:
res = helpers.scan(client=es, scroll='2m', query={
    "size": 10000,
    "query": {
        "match": {
            "type": {
                "query": "IP_Address"
            }}}},
    index="logstash-*")

# function to return hits from the elasticsearch query in res
def get_es_json(es_scan):
    for hits in es_scan:
        return hits

# iterate through results with defined number of results
def return_es_results(es_json_data, num_results):
    for i in range(num_results):
        data = get_es_json(es_json_data)
        print(data['_source']['geoip']['asn'])

return_es_results(res, 100)
Is it because your call is "return_es_results(res, 100)"? Note the 100 in your call.
The loop runs only 100 times, so you are asking for only 100 results.
You may want to paginate. If you use Django, there is some documentation about pagination here: https://docs.djangoproject.com/en/2.2/topics/pagination/
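To make the point about the 100 concrete, here is a minimal sketch of looping over every hit instead of a fixed count. It assumes that helpers.scan returns a generator yielding one hit at a time, so a plain for loop consumes the whole result set. The names fake_scan and iter_asn are hypothetical, and fake_scan is only a stand-in so the example runs without a cluster.

```python
from itertools import islice


def fake_scan(total):
    """Hypothetical stand-in for helpers.scan: yields one hit dict at a time."""
    for i in range(total):
        yield {"_source": {"geoip": {"asn": i}}}


def iter_asn(hits, limit=None):
    """Yield geoip.asn for each hit. limit=None consumes every hit."""
    for hit in islice(hits, limit):
        # Some documents may lack geoip data, so avoid a KeyError.
        geoip = hit.get("_source", {}).get("geoip", {})
        if "asn" in geoip:
            yield geoip["asn"]


# No limit: every hit the generator produces is visited.
all_asns = list(iter_asn(fake_scan(250)))

# limit=100 reproduces the behaviour in the question: only 100 hits are read.
first_100 = list(iter_asn(fake_scan(250), limit=100))

print(len(all_asns), len(first_100))
```

With a real cluster you would presumably pass res from the question in place of fake_scan(250), for example `for asn in iter_asn(res): print(asn)`.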