
Elasticsearch High Level Rest Client Java sorting not working properly

I am pretty new to the Elasticsearch High Level Rest Client (Java).

I have a very simple query which lists all records, but it seems sorting is not working properly.

Some of the fields are text types, so I needed to set fielddata to true.

Update:

Thanks Andrei for the solution.

I need to add another field, user.groups, which is an array of objects.

I added .keyword to the mapping fields (text) which are sortable.

But I am getting unexpected results.

Example:

...
"groups": [
    {"name": "ECPay", ... },
    {"name": "Abangers", ... }
]
...

Based on the output (below), if a user has two groups, ECPay and Abangers, sorting on user.groups.name.keyword uses Abangers as the sort value.

I would like the sort to be based on the first element of the user.groups array.

In the example above, the sort should be based on ECPay, since it is the first element of the array.

To understand the problem (updated), let us check the search result (sort) below.

Search Result (Output):

{
    "took": 18,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": 74,
        "max_score": null,
        "hits": [
            {
                "_index": "acme_users",
                "_type": "user",
                "_id": "5",
                "_score": null,
                "_source": {
                    "name": "Ericsson Joseph Sultan Atutuli",
                    "country": "",
                    "uuid": "5",
                    "userId": 5,
                    "email": "ejsultanatutuli@gmail.com",
                    "deletedInd": false,
                    "groups": [
                        {
                            "name": "ECPay",
                            "id": 2
                        },
                        {
                            "name": "Abangers",
                            "id": 4
                        }
                    ],
                    "company": ""
                },
                "sort": [
                    "Abangers"
                ]
            },
            {
                "_index": "acme_users",
                "_type": "user",
                "_id": "54",
                "_score": null,
                "_source": {
                    "name": "Florentina Atutuli",
                    "country": null,
                    "uuid": "54",
                    "userId": 54,
                    "email": "florentina.atutuli@gmail.com",
                    "deletedInd": false,
                    "groups": [
                        {
                            "name": "Abangers",
                            "id": 4
                        },
                        {
                            "name": "Test Group",
                            "id": 5
                        }
                    ],
                    "company": null
                },
                "sort": [
                    "Abangers"
                ]
            },
            {
                "_index": "acme_users",
                "_type": "user",
                "_id": "37",
                "_score": null,
                "_source": {
                    "name": "dsfsdfsdf",
                    "country": null,
                    "uuid": "37",
                    "userId": 37,
                    "email": "asdf@sdf.com",
                    "deletedInd": false,
                    "groups": [
                        {
                            "name": "Abangers",
                            "id": 4
                        },
                        {
                            "name": "Test Group",
                            "id": 5
                        }
                    ],
                    "company": null
                },
                "sort": [
                    "Abangers"
                ]
            },
            {
                "_index": "acme_users",
                "_type": "user",
                "_id": "3",
                "_score": null,
                "_source": {
                    "name": "Erick Atutuli",
                    "country": "Philippines",
                    "email": "erickatutuli@pakyas.com",
                    "userId": 3,
                    "uuid": "d8f4ab43-d33e-4a82-a08b-eb73342a0546",
                    "groups": [
                        {
                            "name": "ECPay",
                            "id": 2
                        },
                        {
                            "name": "Abangers",
                            "id": 4
                        }
                    ],
                    "deletedInd": false,
                    "company": "Hotlegs Incorporated"
                },
                "sort": [
                    "Abangers"
                ]
            },
            {
                "_index": "acme_users",
                "_type": "user",
                "_id": "34",
                "_score": null,
                "_source": {
                    "name": "Chun-Li",
                    "country": null,
                    "email": "chunli@pakyas.com",
                    "uuid": "34",
                    "userId": 34,
                    "deletedInd": false,
                    "groups": [
                        {
                            "name": "Customers AU",
                            "id": 1
                        }
                    ],
                    "company": null
                },
                "sort": [
                    "Customers AU"
                ]
            },
            {
                "_index": "acme_users",
                "_type": "user",
                "_id": "57",
                "_score": null,
                "_source": {
                    "name": "Eddy Bear",
                    "country": "US",
                    "email": "eddybear@pakyas.com",
                    "uuid": "57",
                    "userId": 57,
                    "deletedInd": false,
                    "groups": [
                        {
                            "name": "Customers AU",
                            "id": 1
                        }
                    ],
                    "company": "Jollibee"
                },
                "sort": [
                    "Customers AU"
                ]
            },
            {
                "_index": "acme_users",
                "_type": "user",
                "_id": "42",
                "_score": null,
                "_source": {
                    "name": "Alfredo Pitik Mingaw",
                    "country": "",
                    "email": "akomykel@gmail.com",
                    "userId": 42,
                    "uuid": "42",
                    "deletedInd": false,
                    "groups": [
                        {
                            "name": "ECPay",
                            "id": 2
                        }
                    ],
                    "company": ""
                },
                "sort": [
                    "ECPay"
                ]
            },
            {
                "_index": "acme_users",
                "_type": "user",
                "_id": "20",
                "_score": null,
                "_source": {
                    "name": "test",
                    "country": "Åland Islands",
                    "userId": 20,
                    "email": "test102@email.com",
                    "uuid": "20",
                    "groups": [
                        {
                            "name": "ECPay",
                            "id": 2
                        }
                    ],
                    "deletedInd": false,
                    "company": "test"
                },
                "sort": [
                    "ECPay"
                ]
            },

            ...

        ]
    }
}
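
The sort values in the output above are consistent with how Elasticsearch documents sorting on multi-valued fields: by default the smallest value represents the document for asc and the largest for desc, and the order of the array in _source plays no part. The Java sketch below only imitates that selection with plain collections. It is not Elasticsearch code, and the class MultiValueSortSketch and the helper firstGroupName are hypothetical names introduced for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Sketch only: imitates the default sort mode for multi-valued fields
// (min for asc, max for desc). Not part of any Elasticsearch API.
class MultiValueSortSketch {

    // Returns the value that would represent a document in the sort.
    static String sortValue(List<String> values, boolean ascending) {
        return ascending ? Collections.min(values) : Collections.max(values);
    }

    // Hypothetical helper: the value one might store in a separate,
    // single-valued field if the first array element should drive the sort.
    static String firstGroupName(List<String> groupNames) {
        return groupNames.isEmpty() ? null : groupNames.get(0);
    }

    public static void main(String[] args) {
        List<String> groups = Arrays.asList("ECPay", "Abangers");
        System.out.println(sortValue(groups, true));   // value used for asc
        System.out.println(sortValue(groups, false));  // value used for desc
        System.out.println(firstGroupName(groups));    // first element as written
    }
}
```

If the sort has to follow the first array element, one option (an assumption, not something confirmed in this thread) is to copy that element into a separate single-valued keyword field at index time and sort on that field instead.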

I have now added a keyword sub-field to the user mapping:

...
"groups": {
    "properties": {
        "name": {
            "type": "text",
            "fields": {
                "keyword": {
                    "type": "keyword",
                    "ignore_above": 256
                }
            }
        }
    }
},
...

To make the text fields sortable, I removed fielddata and added keyword sub-fields:

http://localhost:9200/acme_users/_mapping/user  

{
  "properties": {
        "name": {
            "type": "text",
            "fields": {
                "keyword": {
                    "type": "keyword",
                    "ignore_above": 256
                }
            }
        },
        "company": {
            "type": "text",
            "fielddata": true
        },
        "country": {
            "type": "text",
            "fielddata": true
        },
        "groups": {
            "properties": {
                "name": {
                    "type": "text",
                    "fields": {
                        "keyword": {
                            "type": "keyword",
                            "ignore_above": 256
                        }
                    }
                }
            }
        },
        "email": {
            "type": "text",
            "fields": {
                "keyword": {
                    "type": "keyword",
                    "ignore_above": 256
                }
            }
        }
  }
}

The query below is generated by my Java application (Spring).
It is a very simple case in which I just sort the results by name in ascending order.

Query

I added .keyword to the sort field. In the sample below, the sort uses groups.name.keyword, which was originally groups.name.

http://localhost:9200/acme_users/user/_search  

{
  "from" : 0,
  "size" : 15,
  "query" : {
    "match_all" : {
      "boost" : 1.0
    }
  },
  "sort" : [
    {
      "groups.name.keyword" : {
        "order" : "asc"
      }
    }
  ]
}

Original Problem:

My problem now is that the ES sort does not seem to work properly. The results change when I switch the order from asc to desc (and vice versa).
Although the results change, the names in the result are not sorted properly, either a-z or z-a.
It seems to take the last part of name and use it as the basis for the sort. I would like the sort to be based on the first character of name.

Original Search Result:

{
    "took": 4,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": 73,
        "max_score": null,
        "hits": [
            {
                "_index": "acme_users",
                "_type": "user",
                "_id": "22",
                "_score": null,
                "_source": {
                    "name": "Popeye Partner 01",
                    "country": null,
                    "userId": 22,
                    "email": "popeye_partner_01@kugmo.com",
                    "uuid": "22",
                    "deletedInd": false,
                    "company": null
                },
                "sort": [
                    "01"
                ]
            },
            {
                "_index": "acme_users",
                "_type": "user",
                "_id": "24",
                "_score": null,
                "_source": {
                    "name": "Dummy User 01",
                    "country": null,
                    "userId": 24,
                    "email": "dummy@dummy.com",
                    "uuid": "24",
                    "deletedInd": false,
                    "company": null
                },
                "sort": [
                    "01"
                ]
            },
            {
                "_index": "acme_users",
                "_type": "user",
                "_id": "23",
                "_score": null,
                "_source": {
                    "name": "Popeye Partner 02",
                    "country": null,
                    "userId": 23,
                    "email": "popeye_partner_02@kugmo.com",
                    "uuid": "23",
                    "deletedInd": false,
                    "company": null
                },
                "sort": [
                    "02"
                ]
            },
            {
                "_index": "acme_users",
                "_type": "user",
                "_id": "25",
                "_score": null,
                "_source": {
                    "name": "RT Administrator",
                    "country": null,
                    "userId": 25,
                    "email": "rt_administrator@kugmo.com",
                    "uuid": "25",
                    "deletedInd": false,
                    "company": null
                },
                "sort": [
                    "administrator"
                ]
            },
            {
                "_index": "acme_users",
                "_type": "user",
                "_id": "48",
                "_score": null,
                "_source": {
                    "name": "John Patrick Abnoy",
                    "country": null,
                    "userId": 48,
                    "email": "patrickabnoy@gmail.com",
                    "uuid": "48",
                    "deletedInd": false,
                    "company": null
                },
                "sort": [
                    "abnoy"
                ]
            },
            {
                "_index": "acme_users",
                "_type": "user",
                "_id": "5",
                "_score": null,
                "_source": {
                    "name": "Ericsson John Santol Atutuli",
                    "country": "",
                    "uuid": "5",
                    "userId": 5,
                    "email": "ejsantolatutuli@gmail.com",
                    "deletedInd": false,
                    "company": ""
                },
                "sort": [
                    "atutuli"
                ]
            },
            {
                "_index": "acme_users",
                "_type": "user",
                "_id": "54",
                "_score": null,
                "_source": {
                    "name": "Florentina Atutuli",
                    "country": null,
                    "uuid": "54",
                    "userId": 54,
                    "email": "florentina.atutuli@gmail.com",
                    "deletedInd": false,
                    "company": null
                },
                "sort": [
                    "atutuli"
                ]
            },
            {
                "_index": "acme_users",
                "_type": "user",
                "_id": "49",
                "_score": null,
                "_source": {
                    "name": "Laarnie Santol Atutuli",
                    "country": "",
                    "email": "lmsantolatutuli@gmail.com",
                    "userId": 49,
                    "uuid": "49",
                    "deletedInd": false,
                    "company": ""
                },
                "sort": [
                    "atutuli"
                ]
            },
            {
                "_index": "acme_users",
                "_type": "user",
                "_id": "3",
                "_score": null,
                "_source": {
                    "name": "Eric Atutuli",
                    "country": "Philippines",
                    "uuid": "d8f4ab43-d33e-4a82-a08b-eb73342a0546",
                    "userId": 3,
                    "email": "ericatutuli@kugmo.com",
                    "deletedInd": false,
                    "company": "Hotlegs Incorporated"
                },
                "sort": [
                    "atutuli"
                ]
            },
            {
                "_index": "acme_users",
                "_type": "user",
                "_id": "29",
                "_score": null,
                "_source": {
                    "name": "Auberto Matulis",
                    "country": null,
                    "userId": 29,
                    "email": "bert.matulis@gmail.com",
                    "uuid": "29",
                    "deletedInd": false,
                    "company": null
                },
                "sort": [
                    "auberto"
                ]
            }
        ]
    }
}

Thanks!

name is a text field, which means it is analyzed, that is, split into tokens. If Popeye Partner 01 is split into popeye, partner and 01, which token do you want to use for sorting? Probably none of them, since you want the sort to happen on the original text. For this to happen, add a keyword sub-field to your name field:

{
  "name": {
    "type": "text",
    "fields": {
      "keyword": {
        "type": "keyword",
        "ignore_above": 256
      }
    }
  }
}

Then use that keyword sub-field for sorting in your query:

"sort" : [  
    {  
      "name.keyword" : {  
        "order" : "asc"  
      }  
    }  
  ]
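
To make the difference concrete, here is a rough Java sketch of why the original sort values were single lowercase words such as 01 or abnoy. It assumes a crude stand-in for the standard analyzer (lowercase, split on anything that is not a letter or digit) and the default sort mode for multi-valued fields (smallest value for asc, largest for desc). The class and method names are hypothetical, and nothing here calls Elasticsearch.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

// Sketch only: approximates sorting on an analyzed text field versus
// sorting on its keyword sub-field.
class AnalyzedSortSketch {

    // Crude approximation of analysis: lowercase, then split on
    // every run of characters that are not letters or digits.
    static List<String> tokens(String text) {
        return Arrays.stream(text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+"))
                .filter(t -> !t.isEmpty())
                .collect(Collectors.toList());
    }

    // With fielddata on a text field, each token is a separate value,
    // so the sort picks one token: the smallest for asc, the largest for desc.
    static String textSortValue(String text, boolean ascending) {
        List<String> ts = tokens(text);
        if (ts.isEmpty()) {
            return null; // no tokens, no sort value
        }
        return ascending ? Collections.min(ts) : Collections.max(ts);
    }

    // A keyword sub-field keeps the whole original string as one value.
    static String keywordSortValue(String text) {
        return text;
    }

    public static void main(String[] args) {
        String name = "Popeye Partner 01";
        System.out.println(textSortValue(name, true));  // one token only
        System.out.println(keywordSortValue(name));     // the full name
    }
}
```

With name.keyword the comparison runs over the full string, so Popeye Partner 01 is ordered by its leading P rather than by its smallest token.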

We ran into the same issue. When we looked at the mapping we had created at index-creation time, we noticed a couple of things.

  1. Fields whose value is always a single word should be mapped as keyword. That way we do not need to append .keyword to the field names (we did not want to maintain schema knowledge on the client side just to append .keyword) or add "fielddata": true (avoid this as much as possible, because it uses heap memory):

"type": "keyword"

  2. Some values were mapped as text but were actually numbers, so we changed them to either long or scaled_float. This helped us with sorting and aggregation:

"type": "long"

OR

"type": "scaled_float", "scaling_factor": 100
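
The second point can be illustrated with a small Java sketch: the same values order differently when compared as strings and when compared as numbers. It uses plain Java collections only, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

// Sketch only: string ordering versus numeric ordering of the same values.
class NumericSortSketch {

    // Lexicographic order, as happens when numbers are stored as strings.
    static List<String> sortAsStrings(List<String> values) {
        List<String> copy = new ArrayList<>(values);
        Collections.sort(copy);
        return copy;
    }

    // Numeric order, as happens when the values are stored as long.
    static List<Long> sortAsLongs(List<String> values) {
        return values.stream()
                .map(Long::parseLong)
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList("9", "10", "100", "2");
        System.out.println(sortAsStrings(values)); // [10, 100, 2, 9]
        System.out.println(sortAsLongs(values));   // [2, 9, 10, 100]
    }
}
```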
