ElasticSearch mapping for nested enumerable objects (i18n)

I'm at a loss as to how to map a document with the following structure for search:

{
  "_id": "007ff234cb2248",
  "ids": {
    "source1": "123",
    "source2": "456",
    "source3": "789"
  },
  "names": [
    {"en":"Example"}, 
    {"fr":"exemple"}, 
    {"es":"ejemplo"},
    {"de":"Beispiel"}
  ],
  "children" : [
    {
      "ids": {
        "source1": "CXXIII",
        "source2": "CDLVI",
        "source3": "DCCLXXXIX"
      },
      "names": [
        {"en":"Example Child"}, 
        {"fr":"exemple enfant"}, 
        {"es":"Ejemplo niño"},
        {"de":"Beispiel Kindes"}
      ]
    }
  ],
  "relatives": {
    // Typically no "ids" at this level.
    "relation": "uncle",
    "children": [
      {
        "ids": {
          "source1": "0x7B",
          "source2": "0x1C8",
          "source3": "0x315"
        },
        "names": [
          {"en":"Example Cousin"}, 
          {"fr":"exemple cousine"}, 
          {"es":"Ejemplo primo"},
          {"de":"Beispiel Cousin"}
        ]
      }
    ]
  }
}

The child object may appear directly in the children section, or nested further down in my document as uncle.children (cousins, in this case). The ids field is common to level one (the root), level two (the children and the uncle), and level three (the cousins). The naming structure is also common to levels one and three.
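To make the structure concrete: as far as I understand, Elasticsearch's default object mapping (as opposed to the nested type) indexes inner objects under dotted field paths, and an array of objects contributes all of its values to the same path. The sketch below is plain Python with no Elasticsearch calls. The flatten helper is hypothetical and doc is a trimmed copy of the example document. It only illustrates which paths the ids and names would end up under, given that assumption.

```python
from collections import defaultdict


def flatten(node):
    """Collect every leaf value under its dotted field path.

    Arrays add no path segment, so the values of all array elements
    end up under the same path.
    """
    paths = defaultdict(list)

    def walk(value, path):
        if isinstance(value, dict):
            for key, child in value.items():
                walk(child, f"{path}.{key}" if path else key)
        elif isinstance(value, list):
            for item in value:
                walk(item, path)
        else:
            paths[path].append(value)

    walk(node, "")
    return dict(paths)


# Trimmed copy of the example document (fewer sources and languages).
doc = {
    "ids": {"source1": "123", "source2": "456"},
    "names": [{"en": "Example"}, {"fr": "exemple"}],
    "children": [
        {
            "ids": {"source1": "CXXIII", "source2": "CDLVI"},
            "names": [{"en": "Example Child"}, {"fr": "exemple enfant"}],
        }
    ],
    "relatives": {
        "relation": "uncle",
        "children": [
            {
                "ids": {"source1": "0x7B"},
                "names": [{"en": "Example Cousin"}],
            }
        ],
    },
}

flat = flatten(doc)
for path in sorted(flat):
    print(path, flat[path])
```

Read this way, the key of each ids or names entry becomes the last path segment (ids.source1, names.en) and the value is what gets indexed, at all three levels.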

My use case is to be able to search for IDs (nested objects) by prefix and by whole ID, and also to search for child names, following an (as yet undefined) set of analyzer rules.

I haven't been able to find any useful way to map these. I don't believe I'll have much success using the same technique for ids and names, as there's an extra level of mapping between names and the document root.

I'm not even certain that it is mappable. I believe, at least in principle, that the ids should be mappable as terms, and perhaps the names could be indexed as terms in some way, too.

I'm simply at a loss, and the documentation doesn't seem to cover anything like this level of complex mapping.

I have limited (read: no) control over the document, as it's coming from the CouchDB river and the upstream application already relies on this format, so I can't really change it.

I'd like to be able to search by the following pseudo-conditions, all of which should match:

  • ID: "123"
  • ID by source (I don't know how best to mark this up in pseudo-language)
  • ID prefix: "CDL"
  • Name: "Example", "Example Child"
  • Localized name (I don't even know how best to pseudo-mark this up!)
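For what it's worth, the pseudo-conditions above could be written roughly as the following query bodies. This is a sketch only: the helper names (any_of, id_query, id_prefix_query, name_query) and the LEVELS list are made up for illustration, and the dotted field paths assume the document is indexed with plain object fields rather than the nested type. It only builds the JSON bodies as Python dicts; nothing here talks to a cluster.

```python
import json

# Path prefixes at which "ids" and "names" objects appear in the example.
LEVELS = ["", "children.", "relatives.children."]


def any_of(clauses):
    """Bool query that matches when at least one clause matches."""
    return {"bool": {"should": clauses, "minimum_should_match": 1}}


def id_query(value, source="*"):
    """Whole ID at any level; pass a source name to restrict the match."""
    fields = [f"{level}ids.{source}" for level in LEVELS]
    return {"multi_match": {"query": value, "fields": fields}}


def id_prefix_query(prefix, sources):
    """ID prefix at any level, one prefix clause per concrete field."""
    return any_of([
        {"prefix": {f"{level}ids.{source}": prefix}}
        for level in LEVELS
        for source in sources
    ])


def name_query(text, lang="*"):
    """Name at any level; pass a language key for a localized search."""
    fields = [f"{level}names.{lang}" for level in LEVELS]
    return {"multi_match": {"query": text, "fields": fields}}


examples = {
    "ID": id_query("123"),
    "ID by source": id_query("123", source="source1"),
    "ID prefix": id_prefix_query("CDL", ["source1", "source2", "source3"]),
    "Name": name_query("Example Child"),
    "Localized name": name_query("exemple", lang="fr"),
}
for label, query in examples.items():
    print(label, json.dumps({"query": query}))
```

One caveat: prefix queries are not analyzed, so "CDL" would only match if the ids are indexed verbatim (not lowercased by an analyzer).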

I can figure out the specifics of tokenising and analysis for myself, once I at least know how to map:

  • Objects when both the key and the value of the object properties are important
  • Enumerable objects when the key and value are important
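One way this kind of mapping might look, as a rough sketch rather than a tested answer: dynamic templates with path_match can assign a type to every key under ids and names without listing the keys up front. The template names are made up, and the keyword and text types and the typeless "mappings" layout assume a recent Elasticsearch version; older versions, such as those used with river plugins, spell this differently. The field_type helper is hypothetical and uses Python's fnmatchcase as a stand-in for Elasticsearch's own wildcard matching, just to check which paths each pattern would cover.

```python
import json
from fnmatch import fnmatchcase

# Candidate mapping body: ids indexed verbatim, names as analyzed text.
MAPPING = {
    "mappings": {
        "dynamic_templates": [
            {"root_ids": {"path_match": "ids.*", "mapping": {"type": "keyword"}}},
            {"inner_ids": {"path_match": "*.ids.*", "mapping": {"type": "keyword"}}},
            {"root_names": {"path_match": "names.*", "mapping": {"type": "text"}}},
            {"inner_names": {"path_match": "*.names.*", "mapping": {"type": "text"}}},
        ]
    }
}


def field_type(path):
    """Type given by the first template whose pattern matches the path.

    Returns None when no template applies (the field would then fall
    back to whatever the default dynamic mapping decides).
    """
    for template in MAPPING["mappings"]["dynamic_templates"]:
        (rule,) = template.values()
        if fnmatchcase(path, rule["path_match"]):
            return rule["mapping"]["type"]
    return None


for path in ["ids.source1", "children.ids.source2",
             "relatives.children.names.de", "relatives.relation"]:
    print(path, field_type(path))

print(json.dumps(MAPPING, indent=2))
```

With the ids indexed verbatim, whole-ID and prefix searches would both work on the raw value, while the names templates are the place where per-language analyzers could later be plugged in.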

If the mapping from an ID to its children is one-to-many, then you could store the children's names in a child field, as a field can have multiple values. Each document would then have an ID field, possibly a relation field, and zero or more child fields.
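As one possible reading of this suggestion, the reshaping could look like the sketch below. The function name to_flat_document and the output field names id, relation and child are assumptions, and the step would have to run somewhere before indexing, since the source document itself stays as it is.

```python
def to_flat_document(doc):
    """Reshape the nested document into id / relation / child fields."""
    relatives = doc.get("relatives", {})
    # Direct children and the relatives' children are treated alike.
    children = doc.get("children", []) + relatives.get("children", [])
    flat = {
        "id": list(doc.get("ids", {}).values()),
        "child": [
            text
            for child in children
            for name in child.get("names", [])
            for text in name.values()
        ],
    }
    if "relation" in relatives:
        flat["relation"] = relatives["relation"]
    return flat


# Trimmed copy of the example document.
sample = {
    "ids": {"source1": "123", "source2": "456"},
    "children": [
        {"names": [{"en": "Example Child"}, {"fr": "exemple enfant"}]}
    ],
    "relatives": {
        "relation": "uncle",
        "children": [{"names": [{"en": "Example Cousin"}]}],
    },
}

result = to_flat_document(sample)
print(result)
```

Note that this flattening drops the source keys and the language keys, so it would cover searches by ID and by name but not by source or by localized name.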
