
Python requests.post does not force Elasticsearch to create missing index

I want to push data to my Elasticsearch server using:

requests.post('http://localhost:9200/_bulk', data=data_1 + data_2)

and it complains that the index does not exist. I tried creating the index manually:

    curl -X PUT http://localhost:9200/_bulk

and it complains that I am not sending a request body:

    {"error":{"root_cause":[{"type":"parse_exception","reason":"request body is required"}],"type":"parse_exception","reason":"request body is required"},"status":400}

This seems like a bit of a chicken-and-egg problem. How can I create that _bulk index and then post my data?

EDIT:

My data is too large for me to even work out the schema. Here is a small snippet:

'{"create":{"_index":"products-guys","_type":"t","_id":"0"}}\n{"url":"http://www.plaisio.gr/thleoraseis/tv/tileoraseis/LG-TV-43-43LH630V.htm","title":"TV LG 43\\" 43LH630V LED Full HD Smart","description":"\\u039a\\u03b1\\u03b9 \\u03cc\\u03bc\\u03bf\\u03c1\\u03c6\\u03b7 \\u03ba\\u03b1\\u03b9 \\u03ad\\u03be\\u03c5\\u03c0\\u03bd\\u03b7, \\u03bc\\u03b5 \\u03b9\\u03c3\\u03c7\\u03c5\\u03c1\\u03cc \\u03b5\\u03c0\\u03b5\\u03be\\u03b5\\u03c1\\u03b3\\u03b1\\u03c3\\u03c4\\u03ae \\u03b5\\u03b9\\u03ba\\u03cc\\u03bd\\u03b1\\u03c2 \\u03ba\\u03b1\\u03b9 \\u03bb\\u03b5\\u03b9\\u03c4\\u03bf\\u03c5\\u03c1\\u03b3\\u03b9\\u03ba\\u03cc webOS 3.0 \\u03b5\\u03af\\u03bd\\u03b1\\u03b9 \\u03b7 \\u03c4\\u03b7\\u03bb\\u03b5\\u03cc\\u03c1\\u03b1\\u03c3\\u03b7 \\u03c0\\u03bf\\u03c5 \\u03c0\\u03ac\\u03b5\\u03b9 \\u03c3\\u03c4\\u03bf \\u03c3\\u03b1\\u03bb\\u03cc\\u03bd\\u03b9 \\u03c3\\u03bf\\u03c5","priceCurrency":"EUR","price":369.0}\n{"create":{"_index":"products-guys","_type":"t","_id":"1"}}\n{"url":"http://www.plaisio.gr/thleoraseis/tv/tileoraseis/Samsung-TV-43-UE43M5502.htm","title":"TV Samsung 43\\" UE43M5502 LED ...

This is essentially someone else's code that I need to make work. It seems that the "data" object I am passing in the request is a string.

When I use requests.post('http://localhost:9200/_bulk', data=data)

I get <Response [406]>.

If you want to do a bulk request using requests, set the content type explicitly (the charset belongs inside the Content-Type value, not in a separate header):

response = requests.post('http://localhost:9200/_bulk', data=data_1 + data_2, headers={'Content-Type': 'application/json; charset=UTF-8'})
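A minimal sketch of the same request, assuming the documents start out as Python dicts rather than a prebuilt string. The helper names `build_bulk_body` and `post_bulk` and the two sample documents are hypothetical; the index name is the one from the snippet in the question. Elasticsearch 6.0 and later reject a body sent without a supported Content-Type, which is the usual cause of a 406 like the one above. The bulk endpoint accepts `application/x-ndjson` as well as `application/json`.

```python
import json


def build_bulk_body(index, docs):
    """Build a newline-delimited JSON body for the _bulk endpoint."""
    lines = []
    for doc_id, doc in enumerate(docs):
        # Action line: what to do and where to put the document.
        lines.append(json.dumps({"create": {"_index": index, "_id": str(doc_id)}}))
        # Source line: the document itself, on a single line.
        lines.append(json.dumps(doc))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


def post_bulk(body, url="http://localhost:9200/_bulk"):
    """Send the body; assumes the third-party requests package is installed."""
    import requests  # imported here so the builder above works without it

    return requests.post(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/x-ndjson"},
    )


# Hypothetical sample documents, for illustration only.
docs = [
    {"title": "TV A", "price": 369.0},
    {"title": "TV B", "price": 399.0},
]
body = build_bulk_body("products-guys", docs)
```

Calling `post_bulk(body)` against a running server would then send the request. Encoding the string to UTF-8 bytes before sending avoids depending on how the HTTP layer encodes a `str` body.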

Old Answer

I recommend using the bulk helper from the Python client library:

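from elasticsearch import Elasticsearch, helpers

# Include the scheme; recent client versions require a full URL.
client = Elasticsearch("http://localhost:9200")

def gendata():
    # Yield one bulk action per document.
    mywords = ['foo', 'bar', 'baz']
    for word in mywords:
        yield {
            "_index": "mywords",
            "word": word,
        }

resp = helpers.bulk(
    client,
    gendata(),
    # Default index; actions that set their own "_index" (as above) override it.
    index="some_index",
)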

If you haven't changed the Elasticsearch configuration, a new index will be created automatically when the first document is indexed into it.
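Whether the documents (and therefore the index) were actually created can be checked from the bulk response itself. A bulk request can return HTTP 200 even when individual items fail, so the JSON body is worth inspecting. The sketch below assumes the standard response shape with an `items` list; the function name `failed_items` and the sample response are hypothetical.

```python
def failed_items(bulk_response):
    """Return (action, id, status, reason) for every item that failed."""
    failures = []
    for item in bulk_response.get("items", []):
        # Each item has a single key naming the action: create, index, ...
        for action, result in item.items():
            if "error" in result:
                error = result["error"]
                reason = error.get("reason") if isinstance(error, dict) else error
                failures.append(
                    (action, result.get("_id"), result.get("status"), reason)
                )
    return failures


# Hypothetical response, shaped like what the _bulk endpoint returns.
sample = {
    "took": 3,
    "errors": True,
    "items": [
        {"create": {"_index": "products-guys", "_id": "0", "status": 201}},
        {"create": {"_index": "products-guys", "_id": "1", "status": 409,
                    "error": {"type": "version_conflict_engine_exception",
                              "reason": "document already exists"}}},
    ],
}
failures = failed_items(sample)
```

With requests, the input would be `response.json()` from the bulk call.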

About the approaches you tried:

  1. The request body is probably malformed. For bulk ingestion, the body is newline-delimited JSON (an action line followed by a document line), not a JSON array of documents.

  2. You are doing a PUT instead of a POST, and you have to include the documents you want to ingest in the request body.
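Since the data in the question arrives as one long string from someone else's code, a quick structural check can show whether it has the shape the bulk endpoint expects. This is a rough sketch, not a full validator: `check_bulk_body` is a hypothetical helper that only checks that each line is JSON, that action lines come first, and that the body ends with a newline.

```python
import json

ACTIONS_WITH_SOURCE = {"index", "create", "update"}
ACTIONS_WITHOUT_SOURCE = {"delete"}


def check_bulk_body(body):
    """Return a list of problems found in a bulk body (empty if none)."""
    problems = []
    if not body.endswith("\n"):
        problems.append("body must end with a newline")
    lines = body.split("\n")
    if lines and lines[-1] == "":
        lines.pop()  # drop the empty piece after the final newline
    i = 0
    while i < len(lines):
        try:
            action = json.loads(lines[i])
        except ValueError:
            problems.append(f"line {i + 1}: not valid JSON")
            i += 1
            continue
        # An action line is an object whose first key names the action.
        name = next(iter(action), None) if isinstance(action, dict) else None
        if name in ACTIONS_WITH_SOURCE:
            if i + 1 >= len(lines):
                problems.append(f"line {i + 1}: '{name}' has no source line")
            else:
                try:
                    json.loads(lines[i + 1])
                except ValueError:
                    problems.append(f"line {i + 2}: source is not valid JSON")
            i += 2
        elif name in ACTIONS_WITHOUT_SOURCE:
            i += 1
        else:
            problems.append(f"line {i + 1}: expected an action line")
            i += 1
    return problems


good = '{"create":{"_index":"products-guys","_id":"0"}}\n{"price":369.0}\n'
as_array = '[{"price": 369.0}]'
problems_good = check_bulk_body(good)
problems_array = check_bulk_body(as_array)
```

Here `problems_good` is empty, while the JSON-array form is flagged both for the missing trailing newline and for not starting with an action line.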

There is no need to create the empty index first. If you do want to create it explicitly, you can run:

curl -X PUT http://localhost:9200/index_name
