Logstash - import nested JSON into Elasticsearch
I have a large number (~40,000) of nested JSON objects that I want to insert into an Elasticsearch index.
The JSON objects are structured like this:
{
  "customerid": "10932",
  "date": "16.08.2006",
  "bez": "xyz",
  "birthdate": "21.05.1990",
  "clientid": "2",
  "address": [
    {
      "addressid": "1",
      "title": "Mr",
      "street": "main str",
      "valid_to": "21.05.1990",
      "valid_from": "21.05.1990"
    },
    {
      "addressid": "2",
      "title": "Mr",
      "street": "melrose place",
      "valid_to": "21.05.1990",
      "valid_from": "21.05.1990"
    }
  ]
}
So a JSON field (address in this example) can hold an array of JSON objects.
What would a Logstash config look like to import JSON files/objects like this into Elasticsearch? The Elasticsearch mapping for this index should simply mirror the structure of the JSON, and the Elasticsearch document id should be set to customerid.
input {
  stdin {
    id => "JSON_TEST"
  }
}
filter {
  json {
    source => "customerid"
    ....
    ....
  }
}
output {
  stdout {}
  elasticsearch {
    hosts => "https://localhost:9200/"
    index => "customers"
    document_id => "%{customerid}"
  }
}
If you have control of what's being generated, the easiest thing to do is to format your input as single-line JSON and then use the json_lines codec.
Just change your stdin to:
stdin { codec => "json_lines" }
and then it'll just work:
cat input_file.json | logstash -f json_input.conf
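For reference, a complete json_input.conf along these lines might look like the following sketch (the hosts, index, and document_id values are taken from the question's own config; this assumes no filter section is needed):

```
input {
  stdin { codec => "json_lines" }
}
output {
  stdout {}
  elasticsearch {
    hosts => "https://localhost:9200/"
    index => "customers"
    document_id => "%{customerid}"
  }
}
```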
where input_file.json has lines like:
{"customerid":1,"nested": {"json":"here"}}
{"customerid":2,"nested": {"json":"there"}}
and then you won't need the json filter.
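If the source files are pretty-printed multi-line JSON like the example in the question, they first need to be collapsed to one object per line. A minimal sketch in Python (the helper name and sample data are illustrative, not from the original answer):

```python
import json

def to_json_line(text):
    """Parse one pretty-printed JSON object and re-serialize it
    as a single compact line, suitable for the json_lines codec."""
    return json.dumps(json.loads(text), separators=(",", ":"))

pretty = """{
  "customerid": "10932",
  "address": [
    {"addressid": "1"},
    {"addressid": "2"}
  ]
}"""

print(to_json_line(pretty))
# prints: {"customerid":"10932","address":[{"addressid":"1"},{"addressid":"2"}]}
```

Running this over each customer record and concatenating the results produces an input_file.json that the pipeline above can consume directly.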