
How to do bulk indexing to Elasticsearch from Python

I have nearly 10K JSON documents and I want to push all of them to Elasticsearch using the Elasticsearch bulk API from Python. I went through some of the docs but couldn't find a solution.

result=es.bulk(index="index1", doc_type="index123", body=jsonvalue)
helpers.bulk(es,doc) 

I tried both, but neither worked. I am getting this error:

elasticsearch.exceptions.RequestError: TransportError(400, u'illegal_argument_exception', u'Malformed action/metadata line [1], expected START_OBJECT or END_OBJECT but found [VALUE_STRING]')

Please help me.

I prefer the bulk method in the helpers module for bulk indexing. Try the following:

from elasticsearch import helpers
res = helpers.bulk(es, jsonvalue, chunk_size=1000, request_timeout=200)

Your jsonvalue needs to follow a particular format. It needs to be a list of the 10K JSON documents, with each document having the following fields:

doc = {
    '_index': 'your-index',
    '_type': 'your-type',
    '_id': 'your-id',
    'field_1': 'value_1',
    ...
}
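One way to put plain documents into this shape is a small generator that adds the metadata keys to each one. This is only a sketch under assumptions: `build_actions`, the `sku` field, and the sample data are hypothetical names chosen for illustration, and `_type` is included only because the answer uses it (newer Elasticsearch releases removed mapping types, so it may need to be left out).

```python
def build_actions(documents, index_name, doc_type=None, id_field=None):
    """Yield one action dict per plain document, in the flat shape shown above."""
    for document in documents:
        # Metadata keys first, then the document's own fields at the same level.
        action = {"_index": index_name}
        if doc_type is not None:
            # Mapping types were removed in newer Elasticsearch releases;
            # pass doc_type only if your cluster still expects one.
            action["_type"] = doc_type
        if id_field is not None and id_field in document:
            action["_id"] = document[id_field]
        action.update(document)
        yield action


# Hypothetical sample data standing in for the 10K parsed JSON documents.
documents = [
    {"sku": "a1", "field_1": "value_1"},
    {"sku": "a2", "field_1": "value_2"},
]
jsonvalue = list(build_actions(documents, "your-index", "your-type", id_field="sku"))

# With a connected client `es`, the list can then be handed to the helper:
# helpers.bulk(es, jsonvalue, chunk_size=1000, request_timeout=200)
```

Taking each `_id` from a field of the document keeps the ids distinct, so later documents do not overwrite earlier ones.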

So your final jsonvalue would look something like this:

jsonvalue = [
    {
        '_index': 'your-index',
        '_type': 'your-type',
        '_id': 'your-id',
        'field_1': 'value_1',
        ...
    },
    {
        '_index': 'your-index',
        '_type': 'your-type',
        '_id': 'your-id',
        'field_1': 'value_2',
        ...
    },
    {
        '_index': 'your-index',
        '_type': 'your-type',
        '_id': 'your-id',
        'field_1': 'value_3',
        ...
    }
]
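For context on the error in the question: the low-level bulk endpoint expects newline-delimited JSON in which every document line is preceded by an action/metadata line, so a body that does not start with such a JSON object is likely what triggers "Malformed action/metadata line". The sketch below shows that format using only the standard library; `to_bulk_body` is a hypothetical helper name, and the exact cause in the asker's code is an assumption since the contents of jsonvalue are not shown.

```python
import json


def to_bulk_body(documents, index_name):
    """Build a newline-delimited bulk body: one action line, then one source line."""
    lines = []
    for document in documents:
        # Action/metadata line: tells Elasticsearch what to do with the next line.
        lines.append(json.dumps({"index": {"_index": index_name}}))
        # Source line: the document itself, serialized on a single line.
        lines.append(json.dumps(document))
    # The bulk body has to end with a trailing newline.
    return "\n".join(lines) + "\n"


body = to_bulk_body([{"field_1": "value_1"}, {"field_1": "value_2"}], "your-index")
print(body)

# With a connected client `es`, a body like this is what the low-level call
# expects, e.g. es.bulk(body=body) in the client version used in the question.
```

helpers.bulk produces these action and source lines for you from the dictionaries above, which is why it is usually the more convenient choice.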

