
Is there another way to optimize this Elasticsearch query for multiple nested fields in JSON?

I am new to Elasticsearch. Below is sample data that the query needs to run against. I am trying to get the documents where account_type_name is "Credit Card" and source_name is "SOMEVALUE".

{
"took" : 0,
"timed_out" : false,
"_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
},
"hits" : {
    "total" : {
    "value" : 1,
    "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
    {
        "_index" : "bureau_data",
        "_type" : "_doc",
        "_id" : "bda57e01-c564-4cdc-bb8d-79bd2db9d2f8",
        "_score" : 1.0,
        "_source" : {
        "userid" : "bda57e01-c564-4cdc-bb8d-79bd2db9d2f8",
        "raw_derived" : {
            "gender" : "MALE",
            "firstname" : "trsqlsz",
            "middlename" : "rgj",
            "lastname" : "ggksb",
            "mobilephone" : "2125954664",
            "dob" : "1988-06-28 00:00:00",
            "applications" : [
            {
                "applicationid" : "c7fb0147-22fd-4a5e-8851-98241de6aa50",
                "createdat" : "2019-06-07 19:28:54",
                "updatedat" : "2019-06-07 19:28:55",
                "source" : "4",
                "source_name" : "EXPERIAN",
                "applicationcreditreportid" : "b67f9180-9bb6-485c-9cfc-e7ccf9a70a69",
                "accounts" : [
                {
                    "applicationcreditreportaccountid" : "c5de28c4-cac9-4390-852a-96f143cb0b62",
                    "currentbalance" : 418288,
                    "institutionid" : "021d58b4-aba5-42c9-8d39-304a78d34aea",
                    "accounttypeid" : "5",
                    "institution_name" : "HDFC BANK",
                    "account_type_name" : "Personal Loan"
                }
                ]
            }
            ]
        }
        }
    }
    ]
}
}

I have tried the query below and it works fine. I would like to know whether there is a more optimized way to query multiple nested fields.

GET /my_index/_search
{
"query": {
    "bool": {
    "must": [
        {
        "nested": {
            "path": "raw_derived.applications.accounts",
            "query": {
            "bool": {
                "must": [
                {"match": {
                    "raw_derived.applications.accounts.account_type_name": "Credit Card"
                }}
                ]
            }
            }
        }
        },
        {
        "nested": {
            "path": "raw_derived.applications",
            "query": {
            "bool": {
                "must": [
                {"match": {
                    "raw_derived.applications.source_name": "CIBIL"
                }}
                ]
            }
            }
        }
        }
    ]
    }
}

}

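If I have to query more nested fields, this query becomes very long. Please suggest another way to query nested fields or to combine multiple AND conditions.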

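Well, your optimization should always start with the data model/mapping, because that is usually the cause of performance problems, rather than the query.

That said, you can avoid nested queries by flattening your data. A flattened data model results in one document per application and account element.

Since Elasticsearch is a non-relational data store, indexing "redundant" data is perfectly fine. It is not a lazy approach but a common way to handle these kinds of data structures.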
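As a rough illustration of the flattening described above, here is a minimal Python sketch that turns one nested `_source` into one flat document per application/account pair. The function name `flatten_user` and the trimmed sample are made up for this example; it assumes the `raw_derived.applications[].accounts[]` layout shown in the question and that field names do not clash between levels.

```python
from typing import Any, Dict, Iterator


def flatten_user(source: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Yield one flat document per (application, account) pair.

    Hypothetical helper: assumes the nested layout from the question.
    Applications without any accounts produce no output document.
    """
    raw = source.get("raw_derived", {})
    # User-level fields: everything in raw_derived except the nested list.
    user_fields = {k: v for k, v in raw.items() if k != "applications"}
    user_fields["userid"] = source.get("userid")

    for app in raw.get("applications", []):
        # Application-level fields: everything except the nested accounts.
        app_fields = {k: v for k, v in app.items() if k != "accounts"}
        for account in app.get("accounts", []):
            # On a key clash the innermost level wins; the sample has none.
            yield {**user_fields, **app_fields, **account}


# Trimmed version of the sample data, with a second (made-up) account.
sample = {
    "userid": "bda57e01-c564-4cdc-bb8d-79bd2db9d2f8",
    "raw_derived": {
        "gender": "MALE",
        "applications": [
            {
                "applicationid": "c7fb0147-22fd-4a5e-8851-98241de6aa50",
                "source_name": "EXPERIAN",
                "accounts": [
                    {"institution_name": "HDFC BANK",
                     "account_type_name": "Personal Loan"},
                    {"institution_name": "foo bar",
                     "account_type_name": "foo baz"},
                ],
            }
        ],
    },
}

flat_docs = list(flatten_user(sample))
```

Each resulting dict repeats the user and application fields, which is exactly the "redundant" data mentioned above. Keep in mind that every hit is then an account row, so the same userid can appear in several hits.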

Sample document 1:

{
    "_index" : "bureau_data",
    "_type" : "_doc",
    "_id" : "bda57e01-c564-4cdc-bb8d-79bd2db9d2f8",
    "_score" : 1.0,
    "_source" : {
      "userid" : "bda57e01-c564-4cdc-bb8d-79bd2db9d2f8",
      "gender" : "MALE",
      "firstname" : "trsqlsz",
      "middlename" : "rgj",
      "lastname" : "ggksb",
      "mobilephone" : "2125954664",
      "dob" : "1988-06-28 00:00:00",
      "applicationid" : "c7fb0147-22fd-4a5e-8851-98241de6aa50",
      "createdat" : "2019-06-07 19:28:54",
      "updatedat" : "2019-06-07 19:28:55",
      "source" : "4",
      "source_name" : "EXPERIAN",
      "applicationcreditreportid" : "b67f9180-9bb6-485c-9cfc-e7ccf9a70a69",
      "applicationcreditreportaccountid" : "c5de28c4-cac9-4390-852a-96f143cb0b62",
      "currentbalance" : 418288,
      "institutionid" : "021d58b4-aba5-42c9-8d39-304a78d34aea",
      "accounttypeid" : "5",
      "institution_name" : "HDFC BANK",
      "account_type_name" : "Personal Loan"
    }
}

If the same user creates another account, you send exactly the same ("redundant") data, except for the account elements/data, like this:

{
    "_index" : "bureau_data",
    "_type" : "_doc",
    "_id" : "another id, generated by Elasticsearch",
    "_score" : 1.0,
    "_source" : {
      "userid" : "bda57e01-c564-4cdc-bb8d-79bd2db9d2f8",
      "gender" : "MALE",
      "firstname" : "trsqlsz",
      "middlename" : "rgj",
      "lastname" : "ggksb",
      "mobilephone" : "2125954664",
      "dob" : "1988-06-28 00:00:00",
      "applicationid" : "c7fb0147-22fd-4a5e-8851-98241de6aa50",
      "createdat" : "2019-06-07 19:28:54",
      "updatedat" : "2019-06-07 19:28:55",
      "source" : "4",
      "source_name" : "EXPERIAN",
      "applicationcreditreportid" : "b67f9180-9bb6-485c-9cfc-e7ccf9a70a69",
      "applicationcreditreportaccountid" : "the new id",
      "currentbalance" : 4711,
      "institutionid" : "foo",
      "accounttypeid" : "bar",
      "institution_name" : "foo bar",
      "account_type_name" : "foo baz"
    }
}

With this data model you can run a simple query to get your results:

GET /my_index/_search
{
    "query": {
        "bool": {
            "must": [
            {
                "match":{
                    "account_type_name": "Credit Card"
                } 
            },
            {
                "match":{
                    "source_name": "CIBIL"
                } 
            }
            ]
        }
    }
}
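Since the question is also about keeping many AND conditions short, the flat query above can be generated from a plain mapping of field names to values. This is only a sketch: `build_and_query` is a hypothetical helper, and it emits the same `bool`/`must`/`match` structure as the query above.

```python
from typing import Any, Dict


def build_and_query(conditions: Dict[str, Any]) -> Dict[str, Any]:
    """Build a bool/must query with one match clause per field.

    Hypothetical helper: all clauses are ANDed, as in the query above.
    """
    must = [{"match": {field: value}} for field, value in conditions.items()]
    return {"query": {"bool": {"must": must}}}


body = build_and_query({
    "account_type_name": "Credit Card",
    "source_name": "CIBIL",
})
```

The resulting dict can be serialized with `json.dumps` and sent as the body of the `_search` request. Adding another condition is then a single extra key rather than another nested block.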
