I'm new to Logstash. I'm currently trying to read files from S3 (each line of a file is a separate JSON document), parse the JSON fields, and send only some of them to ES.
It's amazing how much Logstash helps with this; so far everything has gone smoothly:
input { s3 { ... } }
I didn't even need to state explicitly that the files are gzipped or that the codec is JSON; I'm still surprised that Logstash works that out.
But now, if I immediately add:
output { elasticsearch { ... } }
then the whole JSON body lands in a "message" string in Elasticsearch. So I did this:
filter { json { source => "message" } }
After that, every child of my JSON is parsed as a separate field in ES. This is perfect, but what if I want to send only two or three children of the JSON to ES?
My example JSON structure:
{"path":"/h/asia","headers":{"x-forwarded-for":"1.1.1.1","user-agent":"Mozilla/5.0"},"params":{"filters_values":"test","pagecount":"2","user_status":"unlogged"},"meta":{"date":1538974058,"acceptCookies":true}}
So in the end I'm landing in ES with fields like:
"headers.x-forwarded-for",
"params.pagecount",
"params.user_status" etc.
My aim is to store only two of them in ES: "params.filters_values" and "headers.user-agent".
Thanks in advance
You can use the prune filter to pick the fields you want:
filter {
  prune {
    whitelist_names => [ "params", "headers" ]
  }
}
However, prune only works on top-level fields, so this keeps the whole "params" and "headers" objects, which is not quite what you want.
https://www.elastic.co/guide/en/logstash/current/plugins-filters-prune.html
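One way around the top-level limitation is to move the two nested values up to the top level first and then whitelist only those. This is a rough sketch, not tested against your pipeline: the target names "filters_values" and "user_agent" are my own choice, and I am assuming you want to keep "@timestamp" as well. Note that whitelist_names entries are regular expressions, hence the anchors.

```
filter {
  json { source => "message" }

  # Move the two nested values to top-level fields (target names are assumptions).
  mutate {
    rename => {
      "[params][filters_values]" => "filters_values"
      "[headers][user-agent]"    => "user_agent"
    }
  }

  # Keep only the whitelisted top-level fields; patterns are anchored regexes.
  prune {
    whitelist_names => [ "^filters_values$", "^user_agent$", "^@timestamp$" ]
  }
}
```

With this approach the documents in ES would carry the flat names rather than "params.filters_values" and "headers.user-agent", so it only fits if the flattened layout is acceptable.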
Use the remove_field option in the json filter:
filter {
  json {
    source => "message"
    # Nested fields are referenced with bracket syntax, not dots.
    remove_field => [ "[headers][x-forwarded-for]", "[params][pagecount]", .. ]
  }
}
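For the example JSON in the question, the full list might look like the sketch below. It assumes the documents contain only the fields shown in the sample; any extra field in real data would still reach ES and would need its own entry. Dropping "message" is my own addition, on the assumption that the raw string is not needed once it has been parsed.

```
filter {
  json {
    source => "message"
    # Applied only when parsing succeeds. Whole top-level objects can be
    # removed by name; nested fields need the [parent][child] syntax.
    remove_field => [
      "message",
      "path",
      "meta",
      "[headers][x-forwarded-for]",
      "[params][pagecount]",
      "[params][user_status]"
    ]
  }
}
```

This leaves "params.filters_values" and "headers.user-agent" under their original nested names.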