Get count of documents missing a field over a date histogram in ElasticSearch?

I'm trying to count the documents that don't contain a certain field, grouped by day.

The idea is to use those counts to work out a daily response rate.

I'm using PHP but can happily convert a JSON query to a suitable nested array.

Here's what I have so far.

$params['aggs'] = [
    "daily" => [
        "date_histogram" => [
            "field" => "date_created",
            "interval" => "1d",
            "min_doc_count" => 0
        ],
        "aggs" => [
            "unresponded" => [
                "missing" => [
                    "field" => "responses"
                ]
            ]
        ]
    ]
];

This returns data, with an unresponded bucket inside each daily bucket as expected. However, the values don't tally with the data: every document in a daily bucket is also counted in its unresponded bucket, regardless of whether that day's documents have a responses field.

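It looks like the missing aggregation doesn't work for fields that exist but hold an empty array. I had to rework the sub-aggregation into a filter aggregation that wraps a bool query with a must_not exists clause.

"aggs" => [
    // must_not + exists matches documents WITHOUT a "responses" value,
    // so the bucket is named "unresponded" to match what it counts
    "unresponded" => [
        "filter" => [
            "query" => [
                "bool" => [
                    "must_not" => [
                        "exists" => [
                            "field" => "responses"
                        ]
                    ]
                ]
            ]
        ]
    ]
]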
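
For completeness, here is a rough sketch of how the pieces might fit together and how the daily response rate could be derived from the result. It is an illustration under assumptions, not a tested setup: the filter is written in the form where the filter aggregation takes the query directly (the extra "query" wrapper above is only needed on older versions), the sub-aggregation is named unresponded because must_not exists matches documents lacking the field, and the helper dailyResponseRates() and the sample buckets are hypothetical names and hand-written data, not real output.

```php
<?php
// Aggregation body: one bucket per day, each with a count of documents
// that have no "responses" value. Field names are taken from the question.
$aggs = [
    "daily" => [
        "date_histogram" => [
            "field" => "date_created",
            "interval" => "1d",       // as in the question; newer versions use calendar_interval
            "min_doc_count" => 0
        ],
        "aggs" => [
            "unresponded" => [
                "filter" => [
                    "bool" => [
                        "must_not" => [
                            "exists" => ["field" => "responses"]
                        ]
                    ]
                ]
            ]
        ]
    ]
];

// Hypothetical helper: turns the "daily" buckets of a response into
// a map of day => response rate (responded / total).
function dailyResponseRates(array $buckets): array
{
    $rates = [];
    foreach ($buckets as $bucket) {
        $total = $bucket['doc_count'];
        $unresponded = $bucket['unresponded']['doc_count'];
        // Days with no documents (allowed by min_doc_count => 0) have no defined rate.
        $rates[$bucket['key_as_string']] = $total > 0
            ? ($total - $unresponded) / $total
            : null;
    }
    return $rates;
}

// Hand-written sample in the shape of $response['aggregations']['daily']['buckets'].
$sampleBuckets = [
    ['key_as_string' => '2015-06-01', 'doc_count' => 4, 'unresponded' => ['doc_count' => 1]],
    ['key_as_string' => '2015-06-02', 'doc_count' => 0, 'unresponded' => ['doc_count' => 0]],
];

print_r(dailyResponseRates($sampleBuckets));
```

Each filter bucket comes back with its own doc_count, so the rate is just the day's total minus the unresponded count, divided by the total.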
