简体   繁体   中英

Error: failed to parse date field [2021-H1] with format [strict_date_optional_time||epoch_millis]

I am trying to fetch dates from elastic search using unconventional date formats. My field in elastic search accepts these dates by having "ignore_malformed": true but when I try to fetch these dates I get this error. My date formats are: YYYY-Q[1-4], YYYY-H[1-2], YYYY-[EARLY|MID|LATE] Can anyone suggest how I can retrieve these dates/ranges?

A couple of things here. Before we start, let's clarify the terminology. Fetching is probably just fine — your issue has to do with ingesting (and subsequently parsing ) your custom date fields.

Let's start with the most important point.

There's no standardized date format for year quarters, halves, or "seasons".

As discussed in the thread Is there a standard date format that expresses Quarters? , different companies divide their years differently.

That said, Elasticsearch will only be able to parse the year component of your date strings. In order to enforce the three custom formats from your question, you could define the following mapping:

PUT my-index/
{
  "mappings": {
    "properties": {
      "time_field": {
        "type": "date",
        "format": "uuuu-['Q'][1][2][3][4]||uuuu-['H1']['H2']||uuuu-['EARLY']['MID']['LATE']"
      }
    }
  }
}

BTW there's more on custom date formats here .

After that, you could attempt to ingest the three variants:

POST my-index/_doc
{
  "id": "2021-Q2",
  "time_field": "2021-Q2"
}

POST my-index/_doc
{
  "id": "2021-H1",
  "time_field": "2021-H1"
}

POST my-index/_doc
{
  "id": "2021-LATE",
  "time_field": "2021-LATE"
}

In order to verify how the dates were parsed internally, you could utilize script fields like so:

GET my-index/_search?filter_path=hits.hits._source.id,hits.hits.fields
{
  "_source": "id", 
  "script_fields": {
    "exploring_time_field": {
      "script": {
        "lang": "painless",
        "source": """
          LocalDateTime.ofInstant(
             Instant.ofEpochMilli(doc['time_field'].value.toInstant().toEpochMilli()),
             ZoneId.of('UTC')
          ).format(DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss"))
        """
      }
    }
  }
}

which would return:

{
  "hits" : {
    "hits" : [
      {
        "_source" : {
          "id" : "2021-Q2"
        },
        "fields" : {
          "exploring_time_field" : [
            "2021-01-01 00:00:00"
          ]
        }
      },
      {
        "_source" : {
          "id" : "2021-H1"
        },
        "fields" : {
          "exploring_time_field" : [
            "2021-01-01 00:00:00"
          ]
        }
      },
      {
        "_source" : {
          "id" : "2021-LATE"
        },
        "fields" : {
          "exploring_time_field" : [
            "2021-01-01 00:00:00"
          ]
        }
      }
    ]
  }
}

As you can see, since only the year component was extracted, the rest of the date time components were assigned the default values for missing components ( Jan 1st 00:00:00 ).

This is certainly not what you're after and so your best bet is to parse the custom date strings before you ingest them into Elasticsearch.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM