
Loading CSV into Elasticsearch using Logstash

I have a CSV file in which one column may contain multi-line values.

ID,Name,Address
1, ABC, "Line 1
Line 2
Line 3"

To my knowledge, the data written above counts as a single record under the CSV standard (RFC 4180), since the multi-line value is enclosed in double quotes.
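As a sanity check that a quoted field with embedded newlines really is one record, here is a minimal sketch using Python's standard `csv` module. Note the spaces before the opening quotes are removed: per RFC 4180, a quote only begins a quoted field when it is the first character of that field.

```python
import csv
import io

# The record from the question, with no space before the opening quote
# so the parser treats the field as quoted.
data = 'ID,Name,Address\r\n1,ABC,"Line 1\nLine 2\nLine 3"\r\n'

rows = list(csv.reader(io.StringIO(data)))
print(len(rows))   # header row + one data record
print(rows[1])     # the newlines stay inside the Address field
```

With the leading spaces left in (as in the original file), the quote is not recognised as a field delimiter and the newlines split the data into several records, which is exactly the behaviour the question describes.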

I have the following configuration for Logstash:

filter {
  csv {
    separator => ","
    quote_char => "\""
    columns => ["ID", "Name", "Address"]
  }
}
output {
  elasticsearch {
    host => "localhost"
    port => "9200"
    index => "TestData"
    protocol => "http"
  }
  stdout {}
}

But when I execute it, three records are created. All of them are wrong in principle: the first contains the ID and Name columns plus only partial data for Address, and the next two contain "Line 2" and "Line 3" respectively, with no ID or Name at all.

How can I fix this? Am I missing something in the file parsing?

Have you tried the multiline codec?

You should add something like this in your input plugin:

codec => multiline {
  pattern => "^[0-9]"
  negate => "true"
  what => "previous"
}

It tells Logstash that every line not starting with a number should be merged with the previous line. That way the continuation lines of a quoted multi-line field reach the csv filter as part of a single event.
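For example, with a file input the whole input block might look like the sketch below (the path is hypothetical; adjust it to your file):

```
input {
  file {
    # hypothetical path -- point this at your actual CSV file
    path => "/path/to/data.csv"
    start_position => "beginning"
    codec => multiline {
      # lines that do not begin with a digit (i.e. a new ID) are
      # appended to the previous event
      pattern => "^[0-9]"
      negate => "true"
      what => "previous"
    }
  }
}
```

One caveat with this pattern: it assumes every new record starts with a digit, so a continuation line that happens to begin with a number would be treated as a new record.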
