
When putting data into Elasticsearch, how do you handle fields that sometimes have different structures?

I'm getting several mapping exceptions when trying to insert data from my MongoDB into Elasticsearch. After some investigative work, it seems that the error comes from the fact that I have a field in my db that is sometimes an array of strings and other times an array of objects.

Meaning, for some documents in mongo it will have this:

{"my_field": ["one", "two"]}

while others have this:

{"my_field": [{"key":"value", "key2":"value"}, {"key":"value", "key2":"value"}, ...]}

I'm having a difficult time pinning down exactly how this situation is handled in Elasticsearch.

You will need to massage the data before it is indexed so that it conforms to Elasticsearch's rules. One approach is for my_field to be a nested document. For one document you might have

{"my_field": {"string_value": ["one", "two"]}}

and for another

{"my_field": {"doc_value": {"key":"value", "key2":"value"}}}

This assumes that the values for key and key2 will always have the same type and that there is a small number of possible keys in this document. If this document contains arbitrary data you might be better off indexing it as

{"my_field": [{"key": "key1", "string_value": "value"},
              {"key": "key2", "int_value": 123}]}
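
One way to produce entries of that shape from an arbitrary object is sketched below in Python. This is only an illustration: the helper names `typed_entry` and `to_typed_entries` are made up, and `bool_value` and `float_value` are assumed slot names that follow the same pattern as `string_value` and `int_value` above.

```python
from typing import Any


def typed_entry(key: str, value: Any) -> dict:
    """Return one {"key": ..., "<type>_value": ...} entry for a single pair."""
    # bool must be tested before int because bool is a subclass of int.
    if isinstance(value, bool):
        return {"key": key, "bool_value": value}  # assumed slot name
    if isinstance(value, int):
        return {"key": key, "int_value": value}
    if isinstance(value, float):
        return {"key": key, "float_value": value}  # assumed slot name
    # Anything else is stringified in this sketch, so every entry
    # ends up with exactly one typed slot.
    return {"key": key, "string_value": str(value)}


def to_typed_entries(obj: dict) -> list:
    """Flatten an arbitrary dict into a list of typed key/value entries."""
    return [typed_entry(k, v) for k, v in obj.items()]


doc = {"my_field": to_typed_entries({"key1": "value", "key2": 123})}
print(doc)
```

Because each type gets its own slot, a given field name always carries values of one type, which is what keeps the mapping consistent.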

As for how you massage the data, one option is to do this before you send it to Elasticsearch. The downside is that the _source attribute will contain the transformed data rather than the original.
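
A minimal sketch of that client-side step, in Python, might look like the following. It assumes the field is called my_field as in the question and wraps it in the string_value / doc_value layout shown earlier; the function name `normalize_my_field` is hypothetical, and keeping the whole list of objects under doc_value is an assumption of this sketch.

```python
def normalize_my_field(doc: dict) -> dict:
    """Return a copy of doc with my_field wrapped by the shape of its items."""
    values = doc.get("my_field")
    if not isinstance(values, list):
        # Field is missing or not a list: return an unchanged copy.
        return dict(doc)
    # Split the items by shape; items that are neither strings nor
    # objects are dropped in this sketch.
    strings = [v for v in values if isinstance(v, str)]
    objects = [v for v in values if isinstance(v, dict)]
    wrapped = {}
    if strings:
        wrapped["string_value"] = strings
    if objects:
        wrapped["doc_value"] = objects
    out = dict(doc)  # shallow copy so the caller's document is untouched
    out["my_field"] = wrapped
    return out


docs = [
    {"my_field": ["one", "two"]},
    {"my_field": [{"key": "value", "key2": "value"}]},
]
for d in docs:
    print(normalize_my_field(d))
```

You would run each document through this before handing it to whatever client call does the indexing, so both variants end up with the same mapping for my_field.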

Another approach is to send the data to Elasticsearch as-is, but to have a transform in the mapping that Elasticsearch will run to transform the data before indexing.
