Elasticsearch: Query or aggregation for determining if claim is eligible or ineligible
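I am trying to build a system that determines whether a claim is eligible based on similar past claims. You can think of the claims as expense purchases.
Suppose I have these existing documents: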
PUT claim/_bulk
{ "create": { } }
{ "company_id": "Google","category":"office_equipment", "description":"stand up desk", "status": "approved"}
{ "create": { } }
{ "company_id": "Google","category":"office_equipment", "description":"computer chair", "status": "approved"}
{ "create": { } }
{ "company_id": "Apple","category":"office_equipment", "description":"keyboard", "status": "approved"}
{ "create": { } }
{ "company_id": "Samsung","category":"office_equipment", "description":"ps4", "status": "rejected"}
If someone tries to submit a new claim with these attributes:
description: "wooden desk"
category: "office_equipment"
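What kind of query or aggregation do I need to determine whether this claim is eligible (i.e. status == "approved") or ineligible (i.e. status == "rejected")? Is there a query that returns a confidence score?
I am looking for output like this: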
status: "approved"
confidence: 0.8
If the confidence is too low, because the claim has no significant similarity to any existing claim, the output would just be
status: approved (or whatever)
confidence: 0
In that case, I would just process the claim manually.
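I feel that a simple boolean query should do the trick.
I assume the category must match exactly, so I chose to use a filter (which selects documents but does not affect the score).
Then I use a should clause for the description, which means Elasticsearch will try to match it, but not every clause has to match, and each clause that does match contributes to the score.
I also took the liberty of setting a min_score to filter out documents that score too low.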
With the following query (the index here is named 75292303 rather than claim):
GET 75292303/_search
{
"min_score": 0.8,
"query": {
"bool": {
"filter": [
{
"term": {
"category.keyword": "office_equipment"
}
}
],
"should": [
{
"match": {
"description": "wooden desk"
}
}
]
}
}
}
I get the following result:
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 1,
"relation": "eq"
},
"max_score": 0.93171775,
"hits": [
{
"_index": "75292303",
"_id": "KITtBoYBArbKoMpIpUdh",
"_score": 0.93171775,
"_source": {
"company_id": "Google",
"category": "office_equipment",
"description": "stand up desk",
"status": "approved"
}
}
]
}
}
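The search itself only returns matching claims with a relevance score, not the status / confidence pair asked for, so that last step has to happen on the client. Below is a minimal Python sketch of one way to do it, not a definitive implementation. The function names build_query and decide are hypothetical, and the "confidence" is an assumption of mine: the winning status's share of the summed _score over all hits. Keep in mind that _score is an unbounded relevance score that depends on the index contents, so neither it nor this share is a calibrated probability, and a min_score of 0.8 is a tuning value rather than "80% confident".

```python
from collections import defaultdict


def build_query(description, category, min_score=0.8):
    """Build the same bool query as above for an arbitrary new claim."""
    return {
        "min_score": min_score,
        "query": {
            "bool": {
                # Exact match on category; filter clauses do not affect _score.
                "filter": [{"term": {"category.keyword": category}}],
                # Full-text match on description; this is what produces _score.
                "should": [{"match": {"description": description}}],
            }
        },
    }


def decide(response):
    """Score-weighted vote over the statuses of the returned hits.

    Assumption: confidence = winning status's share of the total _score.
    This is a heuristic, not something Elasticsearch computes itself.
    """
    totals = defaultdict(float)
    for hit in response["hits"]["hits"]:
        totals[hit["_source"]["status"]] += hit["_score"]

    total_score = sum(totals.values())
    if total_score == 0:
        # No similar past claim was found: leave it for manual handling.
        return {"status": None, "confidence": 0.0}

    status, best = max(totals.items(), key=lambda item: item[1])
    return {"status": status, "confidence": best / total_score}
```

Passing the response shown above to decide gives status "approved", because the only hit is an approved claim. With a single hit the share is trivially 1.0, so in practice you would probably also require a minimum number of hits before trusting the result.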