简体   繁体   中英

Elasticsearch range query date format for java not returning all the results

I have an elasticsearch index that contains fields with a timestamp in the format " dd/MM/yyyy HH:mm:ss " and a customer name. I need to delete records added to the index before a certain timestamp. Using deletebyquery api for java I had the following code:

DeleteByQueryRequest request =
            new DeleteByQueryRequest(index);
    //request.setQuery(new TermQueryBuilder(customerKeywordField, customerName));
    BoolQueryBuilder query = QueryBuilders.boolQuery()
            .filter(QueryBuilders.termsQuery(customerKeywordField, customerName))
            .filter(QueryBuilders.rangeQuery("createdDate.keyword").lte(timestamp));
    request.setQuery(query);

    try {
        BulkByScrollResponse bulkResponse =
                restHighLevelClient.deleteByQuery(request, RequestOptions.DEFAULT);
        
        }
    } catch (Exception e) {
        //exception handling
    }

and this was working as intended, however now that the dates are " 01/10/2021 ", it no longer returns records for " 29/09/2021 " or the like as part of results, so I assume it is taking date format as " MM/dd/yyyy " instead

I tried setting format as

QueryBuilders.rangeQuery("createdDate.keyword").lte(timestamp).format("*dd/MM/yyyy HH:mm:ss*")

but that did not work either. I have verified that the timestamp being passed is in the appropriate format and the timestamp on the record is in the correct format too, so I am at a loss. If someone can help, I would appreciate it.

EDIT: Mapping

{
  "Index_x" : {
    "mappings" : {
      "properties" : {
        //other fields
        "createdDate" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "customer" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        }
      }
    }
  }
}

Sample document:

{
  "_index": "Index_x",
  "_type": "_doc",
  "_id": "1632381612786",
  "_score": 1,
  "_source": {
    "customer": "customer1",
    "createdDate": "23/09/2021 12:49:44",
    //other fields
  },
  "fields": {
    
    "customer.keyword": [
      "customer1"
    
    "createdDate": [
      "23/09/2021 12:49:44"
    ],
    "createdDate.keyword": [
      "23/09/2021 12:49:44"
    ]
    "customer": [
      "customer1"
    ]//other fields
    
  }
}

The field createdDate.keyword is a keyword , not date as shown in your mapping (see mapping types doc ):

"createdDate" : {
      "type" : "text",
      "fields" : {
        "keyword" : {
          "type" : "keyword",
          "ignore_above" : 256
        }
      }
    }

You should change the field type to date .

If you are using dynamic mapping , during indexing, the string field createdDate is checked to see whether its contents match any of the date patterns in order to add it as a new date field. It seems that your format is not supported by default.

To solve this, you could customize the dynamic_date_formats to support your own date format.

Note that range queries on keyword fields are considered expensive and will not be executed by default, see docs .

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM