
How to insert an already created JSON-format string into an Elasticsearch bulk request

In a Python script, I'm trying to use elasticsearch.helpers.bulk to store multiple records.

I receive a JSON-format string from another program, and I want to include it in the _source part of each record.

I took the helpers.bulk format from this answer.

Part of my code:

def saveES(output,name):
    es = Elasticsearch([{'host':'localhost','port':9200}]) 
    output = output.split('\n')
    i=0
    datas=[]
    while i<len(output):
            data = {
                    "_index":"name",
                    "_type":"typed",
                    "_id":saveES.counter,
                    "_source":[[PROBLEM]]
            }
            i+=1
            saveES.counter+=1
            datas.append(data)

    helpers.bulk(es, datas)

I would like to put the JSON-format string where [[PROBLEM]] is.

How can I do that? I have tried hard, but the output does not come out in the correct format.

If I use:

"_source":{
"image_name":'"'+name+'",'+output[i]
}

and print data, the result is:

{'_type': 'typed', '_id': 0, '_source': {'image_name': '"nginx","features": "os,disk,package", "emit_shortname": "f0b03efe94ec", "timestamp": "2017-08-18T17:25:46+0900", "docker_image_tag": "latest"'}, '_index': 'name'}

This result shows that everything was combined into a single string.

But I expect:

{'_type': 'typed', '_id': 0, '_source': {'image_name': 'nginx','features': 'os,disk,package', 'emit_shortname': 'f0b03efe94ec', 'timestamp': '2017-08-18T17:25:46+0900', 'docker_image_tag': 'latest'}, '_index': 'name'}

There are several problems in your code.

  1. You reassign data on every iteration of your loop
  2. You don't follow any conventions (PEP 8 and so on)
  3. You use a while loop instead of a list comprehension
  4. You created two unnecessary variables
  5. You instantiate es inside your function

Here is the improved code:

import json

from elasticsearch import Elasticsearch, helpers

# Create the client once, outside the function, instead of on every call.
es = Elasticsearch([{'host': 'localhost', 'port': 9200}])


def save_es(output, name, es):  # PEP 8: snake_case function name
    # Assumes every non-empty line of output is a complete JSON object.
    lines = [line for line in output.split('\n') if line.strip()]
    data = [  # singular "data", not "datas"
        {
            "_index": "name",
            "_type": "typed",
            "_id": index,
            # Parse the line into a dict so its keys become fields of _source.
            "_source": {"image_name": name, **json.loads(line)},
        }
        for index, line in enumerate(lines)  # a comprehension replaces the while loop
    ]
    helpers.bulk(es, data)


save_es(output, name, es)
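
The key step for the original question is turning the JSON text into a dict before it goes into _source, because concatenating strings only ever produces one long string value. Below is a minimal sketch of that step on its own, without an Elasticsearch connection. It assumes each line is either a complete JSON object or, as the printed output in the question suggests, a bare list of "key": value pairs without the surrounding braces. The names build_source and build_actions are hypothetical helpers, not part of the elasticsearch library, and "_type" is left out because mapping types were removed in Elasticsearch 8; add it back for older clusters.

```python
import json


def build_source(name, line):
    """Return the _source dict for one line of JSON text (hypothetical helper)."""
    text = line.strip()
    # Assumption: a line without surrounding braces is a bare list of
    # "key": value pairs, so wrap it to form a JSON object.
    if not text.startswith("{"):
        text = "{" + text + "}"
    fields = json.loads(text)  # str -> dict
    # image_name comes first, then the parsed fields as sibling keys.
    return {"image_name": name, **fields}


def build_actions(output, name, index="name", start_id=0):
    """Build the list of action dicts to hand to helpers.bulk."""
    lines = [ln for ln in output.split("\n") if ln.strip()]
    return [
        {"_index": index, "_id": start_id + i, "_source": build_source(name, ln)}
        for i, ln in enumerate(lines)
    ]


line = '"features": "os,disk,package", "emit_shortname": "f0b03efe94ec"'
actions = build_actions(line, "nginx")
print(actions[0]["_source"])
# {'image_name': 'nginx', 'features': 'os,disk,package', 'emit_shortname': 'f0b03efe94ec'}
```

The resulting list can then be passed to helpers.bulk(es, actions). Passing start_id lets the caller continue numbering across calls, which is what saveES.counter did in the question.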

Hope this helps.
