Bulk-index data into Elasticsearch using the Python API

Question

I want to index the Shakespeare data set into Elasticsearch using its Python API, but I am getting the error below.

    PUT http://localhost:9200/shakes/play/3 [status:400 request:0.098s]
    {'error': {'root_cause': [{'type': 'mapper_parsing_exception', 'reason': 'failed to parse'}], 'type': 'mapper_parsing_exception', 'reason': 'failed to parse', 'caused_by': {'type': 'not_x_content_exception', 'reason': 'Compressor detection can only be called on some xcontent bytes or compressed xcontent bytes'}}, 'status': 400}

Python script:

    from elasticsearch import Elasticsearch
    from elasticsearch import TransportError
    import json

    data = []

    for line in open('shakespeare.json', 'r'):
        data.append(json.loads(line))

    es = Elasticsearch()

    res = 0
    cl = []
    # filtering data which i need
    for d in data:
        if res == 0:
            res = 1
            continue
        cl.append(data[res])
        res = 0

    try:
        res = es.index(index = "shakes", doc_type = "play", id = 3, body = cl)
        print(res)
    except TransportError as e:
        print(e.info)

I also tried using json.dumps, but I still get the same error. However, when I add just one element of the list to Elasticsearch, the same code works.

Answer 1

You are not sending a bulk request to Elasticsearch, only a single-document index request (please take a look here). That method takes a dict that represents one new document, not a list of documents. If you pass an id in the request, you need to make its value dynamic; otherwise every document is written to the same id and only the last one indexed survives. If your JSON file has one record per line, you should try the following (please read here for the bulk documentation):

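    import json
    from elasticsearch import Elasticsearch, helpers

    es = Elasticsearch()
    op_list = []
    # Raw string so the backslashes in the Windows path are not read as escape sequences
    with open(r"C:\ElasticSearch\shakespeare.json") as json_file:
        for record in json_file:
            # Each line holds one JSON document; parse it into a dict before sending
            op_list.append({
                '_op_type': 'index',
                '_index': 'shakes',
                '_type': 'play',
                '_source': json.loads(record)
            })
    # Send all collected actions in bulk instead of one request per document
    helpers.bulk(client=es, actions=op_list)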
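
If the file is already in bulk format, with a metadata line such as `{"index": {...}}` before every document (which the filtering loop in the question suggests), those lines have to be skipped first. Below is a minimal sketch of that, under some assumptions: `generate_actions` and `load_file` are hypothetical helper names, a metadata line is taken to be any JSON object whose only key is `index`, and the ids are just a running counter. `_type` is left out because mapping types were deprecated in Elasticsearch 7 and removed in 8; add it back if you are on an older cluster.

```python
import json


def generate_actions(lines, index_name="shakes"):
    """Yield one bulk action per document line, skipping bulk metadata lines."""
    doc_id = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        record = json.loads(line)
        # Assumption: a metadata line is a dict whose only key is "index",
        # e.g. {"index": {"_id": 0}}; such lines are skipped.
        if set(record) == {"index"}:
            continue
        yield {
            "_op_type": "index",
            "_index": index_name,
            "_id": doc_id,  # a different id per document, so nothing is overwritten
            "_source": record,
        }
        doc_id += 1


def load_file(path, index_name="shakes"):
    """Stream the file into Elasticsearch; needs a reachable cluster."""
    # Imported here so generate_actions can be tried without the client installed
    from elasticsearch import Elasticsearch, helpers

    # Host taken from the URL in the question's error message
    es = Elasticsearch("http://localhost:9200")
    with open(path) as json_file:
        # helpers.bulk accepts any iterable of actions, including a generator
        return helpers.bulk(client=es, actions=generate_actions(json_file, index_name))
```

Calling `load_file("shakespeare.json")` should return the number of successfully executed actions together with a list of errors. Because the actions come from a generator, the whole file never has to sit in memory as a list.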

solution1
3 ACCEPTED 2019-03-16 10:11:46
