
Handling Array Indexes When Searching In ElasticSearch

I have a problem with my ES queries: they fail because their query strings contain array indexes. This happens because of the following approach. I flatten the JSON requests that I receive with the method below.

private void flattenJsonRequestToMap(String currentPath, JsonNode jsonNode, Map<String, Object> map) {
        if (jsonNode == null || jsonNode.isNull()) {
            map.remove(currentPath);
        } else if (jsonNode.isObject()) {
            ObjectNode objectNode = (ObjectNode) jsonNode;
            Iterator<Map.Entry<String, JsonNode>> iter = objectNode.fields();
            String pathPrefix = currentPath.isEmpty() ? "" : currentPath + ".";

            while (iter.hasNext()) {
                Map.Entry<String, JsonNode> entry = iter.next();
                flattenJsonRequestToMap(pathPrefix + entry.getKey(), entry.getValue(), map);
            }
        } else if (jsonNode.isArray()) {
            ArrayNode arrayNode = (ArrayNode) jsonNode;
            for (int i = 0; i < arrayNode.size(); i++) {
                flattenJsonRequestToMap(currentPath + "[" + i + "]", arrayNode.get(i), map);
            }
        } else if (jsonNode.isValueNode()) {
            ValueNode valueNode = (ValueNode) jsonNode;
            map.put(currentPath, valueNode.asText());
        } else {
            LOGGER.error("Unexpected JsonNode type found while flattening the JSON request: {}", jsonNode.getNodeType());
        }
    }

When the JSON requests contain lists, my flattened map looks like this.

myUserGuid -> user_testuser34_ibzwlm
numberOfOpenings -> 1
managerUserGuids[0] -> test-userYspgF1_S3P6s
accessCategories[0] -> RESTRICTED
employeeUserGuid -> user_user33_m1minh

I then construct the ES query from the above map with the following method.

public SearchResponse searchForExactDocument(final String indexName, final Map<String, Object> queryMap)
            throws IOException {
        BoolQueryBuilder queryBuilder = QueryBuilders.boolQuery();
        queryMap.forEach((name, value) -> {
            queryBuilder.must(QueryBuilders.matchPhraseQuery(name, value));
            LOGGER.info("QueryMap key: {} and value: {} ", name, value);
        });
        return this.executeSearch(indexName, queryBuilder);
    }

As you can see, the resulting query uses field names that still contain the array indexes (for example, managerUserGuids[0]). My mapping structure is as follows.

{
  name=job,
  type=_doc,
  mappingData={
    properties={
    
      myUserGuid ={
        type=text,
        fields={
          keyword={
            ignore_above=256,
            type=keyword
          }
        }
      },
      numberOfOpenings ={
        type=long
      },
      numOfUsage={
        type=long
      },
      accessCategories ={
        type=text,
        fields={
          keyword={
            ignore_above=256,
            type=keyword
          }
        }
      },
      managerUserGuids ={
        type=text,
        fields={
          keyword={
            ignore_above=256,
            type=keyword
          }
        }
      },
      employeeUserGuid ={
        type=text,
        fields={
          keyword={
            ignore_above=256,
            type=keyword
          }
        }
      }
  }
}

Because of the array index appended to the field name, the queries don't return any search results. How can I work around this issue? One option I see is to drop the array index while flattening the map; however, I need to be able to construct a POJO, which has lists for the fields in question, from the flattened map. I would appreciate any advice or suggestions. Thanks a lot in advance.

Elasticsearch treats a list as a single field that simply holds several values. So if a document has "accessCategories": ["foo", "bar"], it will match both "accessCategories": "foo" and "accessCategories": "bar", though with this data schema there is no way to write a query that targets only one particular item of the list ("foo" but not "bar").
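Since a list is queried under its plain field name, one way to apply this to the flattened map from the question is to strip the "[n]" segments only when building the query, while keeping the original indexed map for reconstructing the POJO. The following is a minimal sketch under that assumption; the class IndexFreeQueryFields and the helper groupByFieldName are hypothetical names, not part of the original code, and the Elasticsearch call is shown only as a comment.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class IndexFreeQueryFields {

    // Hypothetical helper: removes every "[n]" segment from a flattened key and
    // collects all values that end up under the same index-free field name.
    // The input map is not modified, so it can still be used to rebuild the POJO.
    public static Map<String, List<Object>> groupByFieldName(Map<String, Object> flattened) {
        Map<String, List<Object>> grouped = new LinkedHashMap<>();
        flattened.forEach((key, value) -> {
            String fieldName = key.replaceAll("\\[\\d+\\]", "");
            grouped.computeIfAbsent(fieldName, k -> new ArrayList<>()).add(value);
        });
        return grouped;
    }

    public static void main(String[] args) {
        Map<String, Object> flattened = new LinkedHashMap<>();
        flattened.put("myUserGuid", "user_testuser34_ibzwlm");
        flattened.put("managerUserGuids[0]", "test-userYspgF1_S3P6s");
        flattened.put("accessCategories[0]", "RESTRICTED");

        // In searchForExactDocument, each (fieldName, value) pair would become one
        // must clause, e.g.:
        // queryBuilder.must(QueryBuilders.matchPhraseQuery(fieldName, value));
        groupByFieldName(flattened).forEach((fieldName, values) ->
                values.forEach(value -> System.out.println(fieldName + " -> " + value)));
    }
}
```

With this approach a document matches when it contains every requested value somewhere in the list, regardless of position.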

If you need to address specific items, you can unwrap the list into separate fields such as accessCategories_0, accessCategories_1, etc., though Elasticsearch limits the total number of fields in one index (the index.mapping.total_fields.limit setting, 1000 by default).
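If the per-position fields are the route taken, the flattened keys can be renamed mechanically. This is only a sketch: the class IndexedFieldNames and the method toIndexedFieldName are hypothetical, and it assumes the documents were indexed with the same accessCategories_0 style names, which the mapping shown in the question does not do.

```java
public class IndexedFieldNames {

    // Hypothetical naming scheme: turns "accessCategories[0]" into
    // "accessCategories_0" so that each list position is its own field.
    public static String toIndexedFieldName(String flattenedKey) {
        return flattenedKey.replaceAll("\\[(\\d+)\\]", "_$1");
    }

    public static void main(String[] args) {
        System.out.println(toIndexedFieldName("accessCategories[0]"));
        System.out.println(toIndexedFieldName("myUserGuid"));
    }
}
```

Keys without an index pass through unchanged, so the same function can be applied to every entry of the flattened map.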
