Elasticsearch: delete documents using Logstash and CSV
Is there any way to delete documents from Elasticsearch using Logstash and a CSV file? I read the Logstash documentation and found nothing. I tried a few configs using action "delete", but nothing happened:
output {
  elasticsearch {
    action => "delete"
    host => "localhost"
    index => "index_name"
    document_id => "%{id}"
  }
}
Has anyone tried this? Is there anything special that I should add to the input and filter sections of the config? I used the file plugin for input and the csv plugin for filter.
It is definitely possible to do what you suggest, but if you're using Logstash 1.5, you need to use the transport protocol, as there is a bug in Logstash 1.5 when doing deletes over the HTTP protocol (see issue #195).
So if your delete.csv CSV file is formatted like this:
id
12345
12346
12347
And your delete.conf Logstash config looks like this:
input {
  file {
    path => "/path/to/your/delete.csv"
    start_position => "beginning"   # read the file from the top
    sincedb_path => "/dev/null"     # do not remember the read position between runs
  }
}
filter {
  csv {
    columns => ["id"]
  }
}
output {
  elasticsearch {
    action => "delete"
    host => "localhost"
    port => 9300                        # make sure you have this
    protocol => "transport"             # make sure you have this
    index => "your_index"               # replace this
    document_type => "your_doc_type"    # replace this
    document_id => "%{id}"
  }
}
Then, when running bin/logstash -f delete.conf, you'll be able to delete all the documents whose id is specified in your CSV file.
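One thing to watch with the sample file above: its first line is the header id, and the csv filter as configured treats every line as data, so Logstash would also issue a delete for a document whose id is literally "id". That is probably harmless, but a minimal sketch of a filter that drops the header row, assuming the column is named id as in the example, could look like this:

```
filter {
  csv {
    columns => ["id"]
  }
  # drop the header row, whose id column holds the literal text "id"
  if [id] == "id" {
    drop { }
  }
}
```

This only works if no real document uses "id" as its id; otherwise, removing the header line from the CSV file is the simpler option.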
In addition to Val's answer, I would add that if you have a single input that has a mix of deleted and upserted rows, you can do both if you have a flag that identifies the ones to delete. The output > elasticsearch > action parameter can be a "field reference," meaning that you can reference a per-row field. Even better, you can change that field to a metadata field so that it can be used in a field reference without being indexed.
For example, in your filter section:
filter {
  # [deleted] is the name of your field
  if [deleted] {
    mutate {
      add_field => {
        "[@metadata][elasticsearch_action]" => "delete"
      }
    }
    mutate {
      remove_field => [ "deleted" ]
    }
  } else {
    mutate {
      add_field => {
        "[@metadata][elasticsearch_action]" => "index"
      }
    }
    mutate {
      remove_field => [ "deleted" ]
    }
  }
}
Then, in your output section, reference the metadata field:
output {
  elasticsearch {
    hosts => "localhost:9200"
    index => "myindex"
    action => "%{[@metadata][elasticsearch_action]}"
    document_type => "mytype"
    # a delete needs a document id; this assumes each row carries an [id] field
    document_id => "%{id}"
  }
}
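One caveat with the filter above, as far as I know: a value read from a CSV is a string, and in a Logstash conditional any non-empty string, including "false" or "0", counts as true, so if [deleted] would match every row where that column is filled in. A minimal sketch that compares the value explicitly, assuming a hypothetical three-column CSV (id, name, deleted) in which deleted holds the literal text true for rows to remove:

```
filter {
  csv {
    # hypothetical column layout; adjust to your file
    columns => ["id", "name", "deleted"]
  }
  # compare against the literal string instead of testing for presence
  if [deleted] == "true" {
    mutate { add_field => { "[@metadata][elasticsearch_action]" => "delete" } }
  } else {
    mutate { add_field => { "[@metadata][elasticsearch_action]" => "index" } }
  }
  # the flag is only needed for routing, so keep it out of the index
  mutate { remove_field => [ "deleted" ] }
}
```

Moving remove_field out of the two branches also avoids repeating it; the behavior is otherwise the same as in the filter above.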