How to delete documents from Elasticsearch
I can't find any example of deleting documents from Elasticsearch in Python. All I've seen so far are the definitions of the delete and delete_by_query functions. But for some reason the documentation does not provide even a minimal example of using them. A bare list of parameters doesn't tell me much if I don't know how to pass them correctly in the function call. So, let's say I've just inserted one new doc like so:
doc = {'name':'Jacobian'}
db.index(index="reestr",doc_type="some_type",body=doc)
How can I now delete this document using delete and delete_by_query?
Since you are not giving a document id while indexing your document, you have to get the auto-generated document id from the return value and delete by that id. Or you can define the id yourself; try the following:
db.index(index="reestr",doc_type="some_type",id=1919, body=doc)
db.delete(index="reestr",doc_type="some_type",id=1919)
In the other case, you need to look at the return value:
r = db.index(index="reestr",doc_type="some_type", body=doc)
# r = {u'_type': u'some_type', u'_id': u'AU36zuFq-fzpr_HkJSkT', u'created': True, u'_version': 1, u'_index': u'reestr'}
db.delete(index="reestr",doc_type="some_type",id=r['_id'])
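The same step can be wrapped in a small helper. This is only a sketch: the helper name delete_kwargs_from_index_response is made up for illustration, and it assumes the index() response has the shape shown in the comment above (with _index, _type and _id keys), which may differ between client versions.

```python
def delete_kwargs_from_index_response(response):
    """Build keyword arguments for db.delete() from an index() response.

    Hypothetical helper: assumes the response contains the keys
    '_index', '_type' and '_id', as in the example response above.
    """
    return {
        "index": response["_index"],
        "doc_type": response["_type"],
        "id": response["_id"],
    }


# Example response, copied from the comment above.
r = {"_type": "some_type", "_id": "AU36zuFq-fzpr_HkJSkT",
     "created": True, "_version": 1, "_index": "reestr"}
kwargs = delete_kwargs_from_index_response(r)

# With a live client this would be: db.delete(**kwargs)
```

Keeping the response-to-arguments step separate from the client call makes it easy to check without a running cluster.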
Another example, for delete_by_query. Let's say that after adding several documents with name='Jacobian', you run the following to delete all documents with name='Jacobian':
# q takes a Lucene query string, not a dict
db.delete_by_query(index='reestr', doc_type='some_type', q='name:Jacobian')
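The same deletion can probably also be expressed with a query DSL body instead of a query string. The sketch below only builds that body; the helper name match_query_body is invented here, and whether delete_by_query accepts body and doc_type depends on the client version you have installed.

```python
def match_query_body(field, value):
    """Build a query DSL body matching documents whose field matches value.

    Hypothetical helper for illustration only.
    """
    return {"query": {"match": {field: value}}}


body = match_query_body("name", "Jacobian")

# With a live client this might be used as (assumption, version dependent):
# db.delete_by_query(index="reestr", doc_type="some_type", body=body)
```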
The Delete-By-Query API was removed from the ES core in version 2 for several reasons. This function became a plugin. You can look for more details here:
Why Delete-By-Query is a plugin
Delete By Query Plugin
Because I didn't want to add another dependency (I need this to run in a Docker image later), I wrote my own function to solve this problem. My solution is to search for all documents with the specified index and type. After that I remove them using the Bulk API:
import elasticsearch

def delete_es_type(es, index, type_):
    """Delete every document of the given type from the index."""
    try:
        # Count first so that a single search can return every matching id.
        count = es.count(index, type_)['count']
        response = es.search(
            index=index,
            filter_path=["hits.hits._id"],
            body={"size": count, "query": {"filtered": {"filter": {
                "type": {"value": type_}}}}})
        # With filter_path, "hits" may be absent when nothing matched.
        ids = [x["_id"] for x in response.get("hits", {}).get("hits", [])]
        if not ids:
            # Nothing to delete.
            return
        # One bulk "delete" action line per document id.
        bulk_body = [
            '{{"delete": {{"_index": "{}", "_type": "{}", "_id": "{}"}}}}'
            .format(index, type_, x) for x in ids]
        es.bulk('\n'.join(bulk_body))
        # es.indices.flush_synced([index])
    except elasticsearch.exceptions.TransportError as ex:
        print("Elasticsearch error: " + ex.error)
        raise
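As a variation on the string formatting used for the bulk body, the action lines can be built with json.dumps, so that ids containing quotes or backslashes are escaped properly. This is a minimal sketch, not part of the original answer: the function name build_bulk_delete_body is made up, and it only produces the newline-delimited payload without sending it.

```python
import json


def build_bulk_delete_body(index, type_, ids):
    """Return a newline-delimited bulk payload with one delete action per id.

    Hypothetical helper: json.dumps handles escaping of the values, and the
    payload ends with a newline, which the Bulk API expects.
    """
    lines = [
        json.dumps({"delete": {"_index": index, "_type": type_, "_id": doc_id}})
        for doc_id in ids
    ]
    if not lines:
        return ""
    return "\n".join(lines) + "\n"


payload = build_bulk_delete_body("reestr", "some_type", ["a", 'b"c'])

# With a live client this would be passed on as: es.bulk(payload)
```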
I hope that helps future googlers ;)
One can also do something like this:
from pprint import pprint

# Assumes `es` is an already-connected Elasticsearch client.
def delete_by_ids(index, ids):
    query = {"query": {"terms": {"_id": ids}}}
    res = es.delete_by_query(index=index, body=query)
    pprint(res)

# Pass the index and the list of ids that you want to delete.
delete_by_ids('my_index', ['test1', 'test2', 'test3'])
This deletes all of the listed documents in a single request.
I came across this post while searching for a way to delete a document on Elasticsearch using their Python library, ElasticSearch-DSL.
In case it helps anyone, this part of their documentation describes the document life cycle: https://elasticsearch-dsl.readthedocs.io/en/latest/persistence.html#document-life-cycle
At the end of it, it details how to delete a document:
To delete a document just call its delete method:
first = Post.get(id=42)
first.delete()
Hope that helps 🤞