I'm trying to set up a system where I can determine if a claim is eligible or ineligible based on similar past claims. You can think of it as expensing a purchase.
Say I have the existing documents:
PUT claim/_bulk
{ "create": { } }
{ "company_id": "Google","category":"office_equipment", "description":"stand up desk", "status": "approved"}
{ "create": { } }
{ "company_id": "Google","category":"office_equipment", "description":"computer chair", "status": "approved"}
{ "create": { } }
{ "company_id": "Apple","category":"office_equipment", "description":"keyboard", "status": "approved"}
{ "create": { } }
{ "company_id": "Samsung","category":"office_equipment", "description":"ps4", "status": "rejected"}
If someone tries to file a new claim with these attributes:
description: "wooden desk"
category: "office_equipment"
What kind of query or aggregation would I need to determine whether that claim is eligible (aka status == "approved") or ineligible (aka status == "rejected")? Would there be a query that would return a confidence score?
I'm looking for something like this as the output:
status: "approved"
confidence: 0.8
And if the confidence is too low for a claim that has no significant relevance to any existing claims, it would just be
status: approved (or whatever)
confidence: 0
in which case I would just manually process it.
I feel like a simple bool query should do the trick.
I assumed category would be an exact match, so I chose a filter clause, which selects documents but does not affect the score.
Then I went for a should clause on the description: it tries to match, but not every term has to match, and each term that does match contributes to the score.
I also took the liberty of setting a min_score to drop documents whose score is too low, using the following query:
GET 75292303/_search
{
  "min_score": 0.8,
  "query": {
    "bool": {
      "filter": [
        {
          "term": {
            "category.keyword": "office_equipment"
          }
        }
      ],
      "should": [
        {
          "match": {
            "description": "wooden desk"
          }
        }
      ]
    }
  }
}
I got the following results.
{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 0.93171775,
    "hits": [
      {
        "_index": "75292303",
        "_id": "KITtBoYBArbKoMpIpUdh",
        "_score": 0.93171775,
        "_source": {
          "company_id": "Google",
          "category": "office_equipment",
          "description": "stand up desk",
          "status": "approved"
        }
      }
    ]
  }
}
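If you want to turn hits like these into the status/confidence pair from the question, one option is a score-weighted vote on the client side. This is only a sketch: the classify helper is a hypothetical name, and treating each status's share of the summed _score as a "confidence" is my own assumption, not something Elasticsearch provides. Keep in mind that _score is a relevance score, not a probability.

```python
from collections import defaultdict


def classify(hits):
    """Score-weighted vote over the hits of a search response.

    `hits` is the list found under response["hits"]["hits"].
    Returns the status with the largest summed _score, plus that
    status's share of the total score as a rough confidence.
    """
    totals = defaultdict(float)
    for hit in hits:
        totals[hit["_source"]["status"]] += hit["_score"]

    # Nothing passed min_score: no basis for a decision.
    if not totals:
        return {"status": None, "confidence": 0.0}

    status, weight = max(totals.items(), key=lambda item: item[1])
    return {"status": status, "confidence": weight / sum(totals.values())}
```

With the single hit above, this returns approved with a confidence of 1.0, because no hit with a competing status matched. An empty hit list gives a confidence of 0, which would be the case to process manually.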