Deleting documents from Elasticsearch using Logstash and a CSV file

Question

Is there any way to delete documents from Elasticsearch using Logstash and a CSV file? I read the Logstash documentation and found nothing. I tried a few configs using the action "delete", but nothing happened:

output {
    elasticsearch{
        action => "delete"
        host => "localhost"
        index => "index_name"
        document_id => "%{id}"
    }
}

Has anyone tried this? Is there anything special that I should add to the input and filter sections of the config? I used the file plugin for input and the csv plugin for the filter.

Answer 1

It is definitely possible to do what you suggest, but if you're using Logstash 1.5, you need to use the transport protocol, as there is a bug in Logstash 1.5 when doing deletes over the HTTP protocol (see issue #195).

So if your delete.csv file contains one document id per line (a single column, matching the csv filter below) and your delete.conf Logstash config looks like this:

input {
    file {
        path => "/path/to/your/delete.csv"
        start_position => "beginning"
        sincedb_path => "/dev/null"
    }
}
filter {
    csv {
        columns => ["id"]
    }
}
output {
    elasticsearch{
        action => "delete"
        host => "localhost"
        port => 9300                         # make sure you have this
        protocol => "transport"              # make sure you have this
        index => "your_index"                # replace this
        document_type => "your_doc_type"     # replace this
        document_id => "%{id}"
    }
}

Then, when running bin/logstash -f delete.conf, you'll be able to delete all the documents whose ids are specified in your CSV file.
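
If nothing seems to happen, as described in the question, it may help to first check that the ids are parsed as expected. A minimal sketch, assuming the same input and filter sections as above: temporarily swap the elasticsearch output for a stdout output with the rubydebug codec, so that each event is printed to the console instead of being sent to Elasticsearch.

output {
    # Debugging aid only: prints each parsed event, nothing is deleted
    stdout {
        codec => rubydebug
    }
}

Each printed event should contain an id field holding the value from the corresponding CSV line. Once that looks right, switch back to the elasticsearch output.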

Answer 2

In addition to Val's answer, I would add that if you have a single input with a mix of deleted and upserted rows, you can handle both, provided you have a flag that identifies the rows to delete. The output > elasticsearch > action parameter can be a "field reference," meaning that you can reference a per-row field. Even better, you can move that value into a metadata field so that it can be used in a field reference without being indexed.

For example, in your filter section:

filter {
    # [deleted] is the name of your field
    if [deleted] {
        mutate {    
            add_field => {
                "[@metadata][elasticsearch_action]" => "delete"
            }
        }
        mutate {
            remove_field => [ "deleted" ]
        }
    } else {
        mutate {    
            add_field => {
                "[@metadata][elasticsearch_action]" => "index"
            }
        }
        mutate {
            remove_field => [ "deleted" ]
        }
    }   
}
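
One caveat when the flag comes from a CSV file: the csv filter produces string values unless told otherwise, and a non-empty string such as "false" should still pass a bare if [deleted] test. The following is a sketch under assumptions, not part of the original answer: the column names id, name and deleted and the marker value "true" are hypothetical. It compares the string explicitly and removes the flag once, after the conditional.

filter {
    csv {
        # Hypothetical column layout; adjust to your own file
        columns => ["id", "name", "deleted"]
    }
    # Compare against the literal string written in the CSV
    if [deleted] == "true" {
        mutate {
            add_field => { "[@metadata][elasticsearch_action]" => "delete" }
        }
    } else {
        mutate {
            add_field => { "[@metadata][elasticsearch_action]" => "index" }
        }
    }
    # The flag is no longer needed once the action has been chosen
    mutate {
        remove_field => [ "deleted" ]
    }
}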

Then, in your output section, reference the metadata field:

output {
    elasticsearch {
        hosts => "localhost:9200"
        index => "myindex"
        action => "%{[@metadata][elasticsearch_action]}"
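        document_type => "mytype"
        # A delete needs the document id; "%{id}" assumes an id field, as in the question
        document_id => "%{id}"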
    }
}
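
If the action option does not expand field references in your version of the elasticsearch output plugin (worth checking against the plugin documentation), a conditional in the output section should give the same routing. This is a sketch of that alternative, not the answer's own method. It reuses the metadata field set in the filter section and assumes the document id lives in an id field, as in the question.

output {
    if [@metadata][elasticsearch_action] == "delete" {
        # Rows flagged for deletion
        elasticsearch {
            hosts => "localhost:9200"
            index => "myindex"
            action => "delete"
            document_type => "mytype"
            document_id => "%{id}"
        }
    } else {
        # All other rows are indexed (or overwritten) under their id
        elasticsearch {
            hosts => "localhost:9200"
            index => "myindex"
            action => "index"
            document_type => "mytype"
            document_id => "%{id}"
        }
    }
}

The trade-off is that this declares two output instances rather than one, so the single field-reference form is the more compact choice where it is supported.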

Answer 1: 1, accepted, 2015-10-02 03:46:32
Answer 2: 0, 2017-05-10 14:58:08
