New elasticsearch-java API `CreateIndexRequest` using `.withJson` causes `co.elastic.clients.util.MissingRequiredPropertyException`

I am having a hard time using the new elasticsearch-java API client.

I am migrating from the HLRC (High Level REST Client) to the new Elasticsearch Java API client.

To create an index, I use CreateIndexRequest and load it from a JSON source.

This throws an exception saying that a required property is missing.

Do all properties need to be present in the JSON file? If so, why does the same JSON work in Kibana when I include only the properties I need?

The same JSON also works with CreateIndexRequest in the deprecated HLRC client.

This is in a Spring Boot environment.

I am using the following dependencies:

  • org.elasticsearch:elasticsearch 7.17.4
  • co.elastic.clients:elasticsearch-java 7.17.4
  • jakarta.json:jakarta.json-api 2.0.1
  • com.fasterxml.jackson.core:jackson-databind 2.13.3

Below is the exception

     co.elastic.clients.json.JsonpMappingException: Error deserializing 
    co.elastic.clients.elasticsearch._types.analysis.TokenizerDefinition: 
    co.elastic.clients.util.MissingRequiredPropertyException: Missing required property 
    'PathHierarchyTokenizer.bufferSize' (JSON path: 
    settings.analysis.tokenizer.unix_path_tokenizer) (line no=15, column no=10, offset=377)

Below is my code

    final String assetJsonSource = "./config/elasticsearch/my_index_settings.json";
    try (InputStream input = new FileInputStream(assetJsonSource)) {
      CreateIndexRequest request =
          CreateIndexRequest.of(builder -> builder.index(indexName).withJson(input));
      CreateIndexResponse response = client2.indices().create(request);
      boolean ack = Boolean.TRUE.equals(response.acknowledged());
    } catch (IOException e) {
      log.error("Failed to create an index", e);
    }

The json I used is:

{
  "settings": {
    "number_of_shards": 5,
    "max_ngram_diff": 2,
    "analysis": {
      "tokenizer": {
        "ngram_tokenizer": {
          "type": "ngram",
          "min_gram": "1",
          "max_gram": "3",
          "token_chars": ["letter", "digit", "punctuation", "symbol"]
        },
        "unix_path_tokenizer": {
          "type": "path_hierarchy"
        },
        "whitespace_tokenizer": {
          "type": "whitespace"
        },
        "keyword_tokenizer": {
          "type": "keyword"
        }
      },
      "analyzer": {
        "ngram_analyzer": {
          "tokenizer": "ngram_tokenizer",
          "char_filter": ["icu_normalizer"],
          "filter": ["lowercase"]
        },
        "lowercase_analyzer": {
          "tokenizer": "keyword",
          "filter": ["lowercase"]
        },
        "directory_path_analyzer": {
          "tokenizer": "unix_path_tokenizer"
        },
        "whitespace_analyzer": {
          "tokenizer": "whitespace_tokenizer"
        },
        "keyword_analyzer": {
          "tokenizer": "keyword_tokenizer"
        }
      },
      "normalizer": {
        "lowercase_normalizer": {
          "filter": ["lowercase"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "analyzer": "ngram_analyzer",
        "fields": {
          "lowercase": {
            "type": "keyword",
            "normalizer": "lowercase_normalizer"
          }
        }
      },
      "path": {
        "type": "text",
        "analyzer": "directory_path_analyzer",
        "fields": {
          "full": {
            "type": "keyword"
          }
        }
      },
      "originalSize": {
        "type": "double",
        "store": "true"
      },
      "assetCategory": {
        "type": "text",
        "analyzer": "keyword_analyzer",
        "search_analyzer": "whitespace_analyzer",
        "fields": {
          "keyword": {
            "type": "keyword"
          }
        }
      },
      "mimetype": {
        "type": "keyword"
      },
      "importedBy": {
        "type": "integer",
        "store": "true"
      },
      "updatedBy": {
        "type": "keyword"
      },
      "importedAt": {
        "type": "date",
        "store": "true"
      },
      "updatedAt": {
        "type": "date",
        "store": "true"
      },
      "fileCreatedAt": {
        "type": "date",
        "store": "true"
      },
      "fileUpdatedAt": {
        "type": "date",
        "store": "true"
      },
      "metadataSet": {
        "type": "long"
      },
      "instanceId": {
        "type": "keyword"
      },
      "referenceId": {
        "type": "keyword"
      },
      "cutComment": {
        "type": "text",
        "analyzer": "ngram_analyzer",
        "fields": {
          "lowercase": {
            "type": "text",
            "analyzer": "lowercase_analyzer"
          }
        }
      },
      "comment": {
        "properties": {
          "userId": {
            "type": "long"
          },
          "value": {
            "type": "text",
            "analyzer": "ngram_analyzer",
            "fields": {
              "lowercase": {
                "type": "text",
                "analyzer": "lowercase_analyzer"
              }
            }
          },
          "updatedAt": {
            "type": "date"
          }
        }
      },
      "content": {
        "type": "text",
        "analyzer": "ngram_analyzer"
      },
      "shadow": {
        "type": "boolean",
        "store": "true"
      },
      "shadowUpdatedAt": {
        "type": "date",
        "store": "true"
      },
      "downloadValue": {
        "type": "long"
      },
      "collection": {
        "type": "long"
      },
      "sha1": {
        "type": "keyword"
      },
      "subtitle": {
        "type": "text",
        "analyzer": "ngram_analyzer"
      },
      "videoOcr": {
        "type": "text",
        "analyzer": "ngram_analyzer"
      },
      "version": {
        "type": "long"
      }
    },
    "dynamic_templates": [
      {
        "cmeta_str": {
          "match": "cmeta_str-*",
          "mapping": {
            "type": "text",
            "store": "true",
            "analyzer": "ngram_analyzer",
            "fields": {
              "lowercase": {
                "type": "keyword",
                "normalizer": "lowercase_normalizer"
              }
            }
          }
        }
      },
      {
        "cmeta_select": {
          "match": "cmeta_select-*",
          "mapping": {
            "type": "text",
            "store": "true",
            "analyzer": "ngram_analyzer",
            "fields": {
              "lowercase": {
                "type": "keyword",
                "normalizer": "lowercase_normalizer"
              }
            }
          }
        }
      },
      {
        "cmeta_bool": {
          "match": "cmeta_bool-*",
          "mapping": {
            "type": "boolean",
            "store": "true"
          }
        }
      },
      {
        "cmeta_double": {
          "match": "cmeta_double-*",
          "mapping": {
            "type": "double",
            "store": "true"
          }
        }
      },
      {
        "cmeta_date": {
          "match": "cmeta_date-*",
          "mapping": {
            "type": "date",
            "store": "true"
          }
        }
      },
      {
        "cmeta_multi_label": {
          "match": "cmeta_multi_label-*",
          "mapping": {
            "type": "long",
            "store": "true"
          }
        }
      }
    ]
  }
}

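It seems that the new client requires explicit values for the path hierarchy tokenizer's properties, even when they match the defaults (the defaults are listed in the path hierarchy tokenizer documentation). Kibana and the HLRC send the JSON to the server as-is and the server fills in the defaults, whereas the new client deserializes the JSON into typed objects first and rejects any missing required property.

You need to add the default values explicitly, so that: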

"unix_path_tokenizer": {
   "type": "path_hierarchy"
}

becomes:

"unix_path_tokenizer": {
   "type": "path_hierarchy",
   "delimiter": "/",
   "buffer_size": "1024",
   "replacement": "/",
   "reverse": "false",
   "skip": "0"
}
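
If the settings file is shared with other tools and you would rather not edit it, the same defaults could be filled in programmatically before the JSON reaches `withJson`. The following is only a minimal sketch under some assumptions: `PathHierarchyDefaults` and `withDefaults` are hypothetical names (not part of elasticsearch-java), the tokenizer definition has already been parsed into a `Map` (for example with Jackson), and the default values are the ones listed above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of elasticsearch-java.
final class PathHierarchyDefaults {

    // Returns a copy of a tokenizer definition with the path_hierarchy
    // defaults filled in. Other tokenizer types are returned unchanged.
    static Map<String, Object> withDefaults(Map<String, Object> tokenizer) {
        Map<String, Object> result = new LinkedHashMap<>(tokenizer);
        if (!"path_hierarchy".equals(result.get("type"))) {
            return result;
        }
        result.putIfAbsent("delimiter", "/");
        // The replacement defaults to whatever the delimiter is.
        result.putIfAbsent("replacement", result.get("delimiter"));
        result.putIfAbsent("buffer_size", 1024);
        result.putIfAbsent("reverse", false);
        result.putIfAbsent("skip", 0);
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> tokenizer = new LinkedHashMap<>();
        tokenizer.put("type", "path_hierarchy");
        System.out.println(withDefaults(tokenizer));
    }
}
```

The patched map would then be serialized back to JSON and passed to `withJson` in place of the original file contents. Values that are already set are left untouched, so a custom delimiter survives.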

If you omit any of these values, the exception will simply name that missing property instead.
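
Because the client reports only one missing property per attempt, it may be convenient to list all of them in one pass before sending the request. This is a rough sketch, not the client's own validation: `PathHierarchyCheck` and `missingProperties` are hypothetical names, and the list of required keys is an assumption taken from the five properties this answer adds for version 7.17.4.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical pre-flight check, not part of elasticsearch-java.
final class PathHierarchyCheck {

    // Assumption: the properties the 7.17.4 client insists on, as listed in this answer.
    private static final List<String> REQUIRED =
            Arrays.asList("delimiter", "buffer_size", "replacement", "reverse", "skip");

    // Returns the required keys that are absent from a path_hierarchy tokenizer
    // definition. Returns an empty list for any other tokenizer type.
    static List<String> missingProperties(Map<String, ?> tokenizer) {
        List<String> missing = new ArrayList<>();
        if (!"path_hierarchy".equals(tokenizer.get("type"))) {
            return missing;
        }
        for (String name : REQUIRED) {
            if (!tokenizer.containsKey(name)) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        Map<String, Object> tokenizer = new LinkedHashMap<>();
        tokenizer.put("type", "path_hierarchy");
        tokenizer.put("delimiter", "/");
        System.out.println("Missing: " + missingProperties(tokenizer));
    }
}
```

An empty result means the definition carries every key in the assumed list; it does not guarantee that the client accepts the rest of the JSON.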
