
Python requests.post does not force Elasticsearch to create missing index

I want to push data to my Elasticsearch server using:

requests.post('http://localhost:9200/_bulk', data=data_1 + data_2)

and it complains that the index does not exist. I try creating the index manually:

    curl -X PUT http://localhost:9200/_bulk

and it complains that I am not feeding a body to it:

    {"error":{"root_cause":[{"type":"parse_exception","reason":"request body is required"}],"type":"parse_exception","reason":"request body is required"},"status":400}

Seems like a bit of a chicken and egg problem here. How can I create that _bulk index, and then post my data?

EDIT:

My data is too large to easily show the schema. Here is a small snippet:

'{"create":{"_index":"products-guys","_type":"t","_id":"0"}}\n{"url":"http://www.plaisio.gr/thleoraseis/tv/tileoraseis/LG-TV-43-43LH630V.htm","title":"TV LG 43\\" 43LH630V LED Full HD Smart","description":"\\u039a\\u03b1\\u03b9 \\u03cc\\u03bc\\u03bf\\u03c1\\u03c6\\u03b7 \\u03ba\\u03b1\\u03b9 \\u03ad\\u03be\\u03c5\\u03c0\\u03bd\\u03b7, \\u03bc\\u03b5 \\u03b9\\u03c3\\u03c7\\u03c5\\u03c1\\u03cc \\u03b5\\u03c0\\u03b5\\u03be\\u03b5\\u03c1\\u03b3\\u03b1\\u03c3\\u03c4\\u03ae \\u03b5\\u03b9\\u03ba\\u03cc\\u03bd\\u03b1\\u03c2 \\u03ba\\u03b1\\u03b9 \\u03bb\\u03b5\\u03b9\\u03c4\\u03bf\\u03c5\\u03c1\\u03b3\\u03b9\\u03ba\\u03cc webOS 3.0 \\u03b5\\u03af\\u03bd\\u03b1\\u03b9 \\u03b7 \\u03c4\\u03b7\\u03bb\\u03b5\\u03cc\\u03c1\\u03b1\\u03c3\\u03b7 \\u03c0\\u03bf\\u03c5 \\u03c0\\u03ac\\u03b5\\u03b9 \\u03c3\\u03c4\\u03bf \\u03c3\\u03b1\\u03bb\\u03cc\\u03bd\\u03b9 \\u03c3\\u03bf\\u03c5","priceCurrency":"EUR","price":369.0}\n{"create":{"_index":"products-guys","_type":"t","_id":"1"}}\n{"url":"http://www.plaisio.gr/thleoraseis/tv/tileoraseis/Samsung-TV-43-UE43M5502.htm","title":"TV Samsung 43\\" UE43M5502 LED ...

This is essentially someone else's code that I need to make work. It seems that the "data" object I am passing to the request is a string.

When I use requests.post('http://localhost:9200/_bulk', data=data)

I get <Response [406]>.

If you want to do a bulk request using requests:

response = requests.post('http://localhost:9200/_bulk', data=data_1 + data_2, headers={'Content-Type': 'application/json; charset=UTF-8'})
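
A minimal, runnable sketch of the same request (assuming data_1 and data_2 are already newline-delimited JSON strings like the snippet in the question; the trailing newline is required by the _bulk API, and application/x-ndjson is the content type documented for it):

import requests

body = data_1 + data_2
if not body.endswith('\n'):
    body += '\n'  # the _bulk API requires the body to end with a newline

response = requests.post(
    'http://localhost:9200/_bulk',
    data=body.encode('utf-8'),
    headers={'Content-Type': 'application/x-ndjson'},
)
response.raise_for_status()
result = response.json()
if result.get('errors'):
    # each entry in "items" reports per-document success or failure;
    # the key is "create" because the snippet above uses create actions
    failed = [item for item in result['items'] if item['create'].get('error')]
    print(failed)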

Old Answer

I recommend using the bulk helper from the Python elasticsearch library:

from elasticsearch import Elasticsearch, helpers

client = Elasticsearch("localhost:9200")

def gendata():
    # generator yielding one document per word;
    # the "_index" key tells the bulk helper which index each document goes to
    mywords = ['foo', 'bar', 'baz']
    for word in mywords:
        yield {
            "_index": "mywords",
            "word": word,
        }

# helpers.bulk builds the newline-delimited bulk body for you.
# index= is only a default and is overridden by the per-document "_index".
# It returns a tuple: (number of successfully indexed docs, list of errors).
resp = helpers.bulk(
    client,
    gendata(),
    index="some_index",
)

If you haven't changed the Elasticsearch configuration, a new index will be created automatically the first time a document is indexed into it.
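
As a quick sanity check (a sketch, assuming the default auto-create behaviour and the "mywords" index name from the generator above, which takes precedence over the index= default passed to helpers.bulk):

# the index should have been created automatically by the bulk call
print(client.indices.exists(index="mywords"))

# make the freshly indexed documents visible to counts, then count them
client.indices.refresh(index="mywords")
print(client.count(index="mywords"))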

About the ways you tried:

  1. The request body is probably malformed. For bulk ingest, the body has a specific newline-delimited shape; it is not the same as sending the documents as a JSON array (see the sketch after this list).

  2. You are doing a PUT instead of a POST, and you have to specify the documents you want to ingest in the request body.
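
For point 1, here is a sketch of the difference in body shape (the documents and index name are just placeholders):

import json

docs = [{"word": "foo"}, {"word": "bar"}]

# NOT what _bulk expects: a plain JSON array of documents
wrong_body = json.dumps(docs)

# What _bulk expects: newline-delimited JSON, one action/metadata line
# followed by one source line per document, ending with a newline
lines = []
for i, doc in enumerate(docs):
    lines.append(json.dumps({"index": {"_index": "mywords", "_id": str(i)}}))
    lines.append(json.dumps(doc))
right_body = "\n".join(lines) + "\n"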

There is no need to create the empty index first. But in case you want to, you can just do:

curl -X PUT http://localhost:9200/index_name
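
The same thing from Python, as a sketch (index_name is a placeholder):

import requests

# equivalent of the curl command above: create an (empty) index explicitly
resp = requests.put('http://localhost:9200/index_name')
print(resp.status_code, resp.json())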
