
Elasticsearch nested document query

I'm new to Elasticsearch. I have some idea of how to use filters, queries, and aggregations, but I'm not sure how to solve the following problem. I want to query only the most recent delivery (date and crate_quantity) of each company from the documents shown below. Is there a way to use a max aggregation to pull only the most recent delivery from each document?

POST /sanfrancisco/devlivery
{
"company1": {
    "delivery": [
        {
            "date": "01/01/2013",
            "crate_quantity": 5
        },
        {
            "date": "01/12/2013",
            "crate_quantity": 3
        },
        {
            "date": "01/24/2013",
            "crate_quantity": 2
        }
    ]
}
}

POST /sanfrancisco/devlivery
{
"company2": {
    "delivery": [
        {
            "date": "01/01/2015",
            "crate_quantity": 14
        },
        {
            "date": "12/31/2014",
            "crate_quantity": 20
        },
        {
            "date": "11/24/2014",
            "crate_quantity": 13
        }
    ]
}
}

If you want the latest delivery for one company at a time, I would probably set it up using a parent/child relationship. I used company as the parent and delivery as the child.

I also mapped the date field with a custom date format (MM/dd/yyyy) so that the values are parsed as dates and sort chronologically, the way you're expecting.
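To see why the format matters, here is a small Python sketch (not part of the original answer) using company2's dates from the question. It assumes that, without a date mapping, the values would be compared as plain strings; parsing them with the Python equivalent of MM/dd/yyyy gives the chronological order instead.

```python
from datetime import datetime

# company2's delivery dates, copied from the question
dates = ["01/01/2015", "12/31/2014", "11/24/2014"]

# Compared as plain strings, the month digits decide the order.
latest_as_string = max(dates)

# Parsed as MM/dd/yyyy, the comparison is chronological.
latest_as_date = max(dates, key=lambda d: datetime.strptime(d, "%m/%d/%Y"))

print(latest_as_string)  # 12/31/2014
print(latest_as_date)    # 01/01/2015
```

The string comparison picks 12/31/2014 even though 01/01/2015 is the later delivery.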

I set up the index like this:

DELETE /test_index

PUT /test_index
{
   "settings": {
      "number_of_shards": 1,
      "number_of_replicas": 0
   },
   "mappings": {
      "company": {
         "properties": {
            "name": {
               "type": "string",
               "index": "not_analyzed"
            }
         }
      },
      "delivery": {
         "_parent": {
            "type": "company"
         },
         "properties": {
            "crate_quantity": {
               "type": "long"
            },
            "date": {
               "type": "date",
               "format": "MM/dd/yyyy"
            }
         }
      }
   }
}

Then I indexed the documents using the bulk API:

PUT /test_index/_bulk
{"index": {"_index":"test_index", "_type":"company", "_id":1}}
{"name":"company1"}
{"index": {"_index":"test_index", "_type":"delivery", "_id":1, "_parent":1}}
{"date": "01/01/2013", "crate_quantity": 5}
{"index": {"_index":"test_index", "_type":"delivery", "_id":2, "_parent":1}}
{"date": "01/12/2013", "crate_quantity": 3}
{"index": {"_index":"test_index", "_type":"delivery", "_id":3, "_parent":1}}
{"date": "01/24/2013",  "crate_quantity": 2}
{"index": {"_index":"test_index", "_type":"company", "_id":2}}
{"name":"company2"}
{"index": {"_index":"test_index", "_type":"delivery", "_id":4, "_parent":2}}
{"date": "01/01/2015", "crate_quantity": 14}
{"index": {"_index":"test_index", "_type":"delivery", "_id":5, "_parent":2}}
{"date": "12/31/2014",  "crate_quantity": 20}
{"index": {"_index":"test_index", "_type":"delivery", "_id":6, "_parent":2}}
{"date": "11/24/2014",  "crate_quantity": 13 }
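If the data already exists in the nested shape from the question, the bulk lines above could be generated rather than typed by hand. The following is only a sketch: the helper name to_bulk_lines is hypothetical, and the sequential numbering of company and delivery ids is an assumption chosen to match the ids used above.

```python
import json

# Nested documents in the shape shown in the question
nested_docs = {
    "company1": {"delivery": [
        {"date": "01/01/2013", "crate_quantity": 5},
        {"date": "01/12/2013", "crate_quantity": 3},
        {"date": "01/24/2013", "crate_quantity": 2},
    ]},
    "company2": {"delivery": [
        {"date": "01/01/2015", "crate_quantity": 14},
        {"date": "12/31/2014", "crate_quantity": 20},
        {"date": "11/24/2014", "crate_quantity": 13},
    ]},
}


def to_bulk_lines(docs, index="test_index"):
    """Flatten nested company documents into bulk action/source line pairs."""
    lines = []
    delivery_id = 0
    for company_id, (name, body) in enumerate(docs.items(), start=1):
        # Parent document: one company per top-level key
        lines.append(json.dumps(
            {"index": {"_index": index, "_type": "company", "_id": company_id}}))
        lines.append(json.dumps({"name": name}))
        for delivery in body["delivery"]:
            delivery_id += 1
            # Child document: _parent points at the owning company
            lines.append(json.dumps(
                {"index": {"_index": index, "_type": "delivery",
                           "_id": delivery_id, "_parent": company_id}}))
            lines.append(json.dumps(delivery))
    return lines


bulk_lines = to_bulk_lines(nested_docs)
# The bulk API expects newline-delimited JSON ending with a newline.
bulk_body = "\n".join(bulk_lines) + "\n"
print(bulk_body)
```

Each document becomes two lines, an action line and a source line, so the two companies and six deliveries yield sixteen lines.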

Now I can query for the latest delivery for a particular company by using a has_parent filter, sorting on date in descending order, and limiting the result to a single hit ("size": 1), as follows:

POST /test_index/delivery/_search
{
   "size": 1,
   "sort": [
      {
         "date": {
            "order": "desc"
         }
      }
   ],
   "filter": {
      "has_parent": {
         "type": "company",
         "query": {
            "term": {
               "name": {
                  "value": "company1"
               }
            }
         }
      }
   }
}
...
{
   "took": 2,
   "timed_out": false,
   "_shards": {
      "total": 1,
      "successful": 1,
      "failed": 0
   },
   "hits": {
      "total": 3,
      "max_score": null,
      "hits": [
         {
            "_index": "test_index",
            "_type": "delivery",
            "_id": "3",
            "_score": null,
            "_source": {
               "date": "01/24/2013",
               "crate_quantity": 2
            },
            "sort": [
               1358985600000
            ]
         }
      ]
   }
}
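The sort value in the response is the delivery date expressed as milliseconds since the Unix epoch. As a quick check (a sketch added here, assuming the date was stored as UTC midnight), it can be converted back with Python:

```python
from datetime import datetime, timezone

# Sort value copied from the response above
sort_value_ms = 1358985600000

# Convert epoch milliseconds to a UTC datetime
delivered = datetime.fromtimestamp(sort_value_ms / 1000, tz=timezone.utc)

print(delivered.strftime("%m/%d/%Y"))  # 01/24/2013
```

Note that hits.total is 3 because all three of company1's deliveries match the filter; only the most recent one is returned.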

Here is the code I used while experimenting with this:

http://sense.qbox.io/gist/c519b0654448c8b7b0c7c85d613f1e88c0ad1d19
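The query above handles one company at a time. For the latest delivery of every company in a single request, one possible approach is a terms aggregation on name with a children aggregation and a top_hits sub-aggregation. The sketch below only builds and prints the request body; it is not from the original answer and has not been run against a cluster. It assumes the mapping above, an Elasticsearch version that provides the children and top_hits aggregations, and that the body is posted to /test_index/company/_search. The aggregation names companies, deliveries, and latest are arbitrary.

```python
import json

# Hypothetical request body for POST /test_index/company/_search
latest_per_company = {
    "size": 0,  # only the aggregation results are needed
    "aggs": {
        "companies": {
            # one bucket per company name
            "terms": {"field": "name"},
            "aggs": {
                "deliveries": {
                    # switch from company documents to their delivery children
                    "children": {"type": "delivery"},
                    "aggs": {
                        "latest": {
                            # keep only the newest delivery in each bucket
                            "top_hits": {
                                "size": 1,
                                "sort": [{"date": {"order": "desc"}}],
                            }
                        }
                    },
                }
            },
        }
    },
}

print(json.dumps(latest_per_company, indent=2))
```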

