What is the best way to update a cache in Elasticsearch
I'm using an Elasticsearch index as a cache table. My document structure is the following:
{
  "mappings": {
    "dynamic": false,
    "properties": {
      "query_str": {"type": "text"},
      "search_results": {
        "type": "object",
        "enabled": false
      },
      "query_embedding": {
        "type": "dense_vector",
        "dims": 768
      }
    }
  }
}
The cache search is performed via embedding vector similarity. If the embedding of a new query is close enough to a cached one, it is considered a cache hit, and the search_results field is returned to the user.
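To make the hit rule concrete, here is a minimal in-memory sketch in Python. It is only an illustration of the "close enough" check, not how Elasticsearch evaluates it: in the real setup the similarity would be computed by the cluster. The function names, the use of cosine similarity, and the 0.9 threshold are assumptions that do not come from the question.

```python
import math
from typing import Any, Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def lookup_cache(
    query_embedding: Sequence[float],
    cached_docs: Sequence[dict],
    threshold: float = 0.9,  # assumed value; tune it for your embeddings
) -> Optional[Any]:
    """Return search_results of the most similar cached doc, or None on a miss."""
    best_doc = None
    best_score = threshold
    for doc in cached_docs:
        score = cosine_similarity(query_embedding, doc["query_embedding"])
        if score >= best_score:
            best_doc, best_score = doc, score
    return None if best_doc is None else best_doc["search_results"]


# Toy 3-dimensional documents (the real mapping uses 768 dimensions).
cache = [
    {"query_str": "a", "query_embedding": [1.0, 0.0, 0.0], "search_results": {"hits": ["a"]}},
    {"query_str": "b", "query_embedding": [0.0, 1.0, 0.0], "search_results": {"hits": ["b"]}},
]
hit = lookup_cache([0.99, 0.05, 0.0], cache)   # close to the first document
miss = lookup_cache([1.0, 1.0, 0.0], cache)    # not close enough to either one
```

A miss (None) is the point where the service would run the real search and then write the new result into the cache index.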
The problem is that I need to update the cached results about once an hour. I don't want my service to lose the ability to use the cache efficiently during the update procedure, so I'm not sure which of the solutions is best:
I would go with #2, as every time you update a document the cache is flushed.
There is an elegant way to swap indices:
You have an alias that points to your current index, you fill a new index with the fresh records, and then you point the alias to the new index.
Something like this (the first request creates the alias, the second one swaps it to the new index):
POST _aliases
{
"actions": [
{
"add": {
"index": "items-2022-11-26-001",
"alias": "items"
}
}
]
}
POST _aliases
{
"actions": [
{
"remove": {
"index": "items-2022-11-26-001",
"alias": "items"
}
},
{
"add": {
"index": "items-2022-11-26-002",
"alias": "items"
}
}
]
}
You run all your queries against the "items" alias, which acts like an index.
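For an hourly refresh it can help to generate the swap request body instead of editing it by hand. The following is a small Python sketch that only builds the JSON body for POST _aliases; the function name build_alias_swap is hypothetical, and sending the body to the cluster (with whatever HTTP or Elasticsearch client you use) is left out on purpose.

```python
import json


def build_alias_swap(alias: str, old_index: str, new_index: str) -> dict:
    """Build the body for POST _aliases that moves `alias` from old_index to new_index."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


body = build_alias_swap("items", "items-2022-11-26-001", "items-2022-11-26-002")
print(json.dumps(body, indent=2))
```

Because both actions travel in a single _aliases request, Elasticsearch applies them atomically, so queries against "items" always resolve to exactly one of the two indices. Once the swap has succeeded, the old index can be deleted.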
References:
https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-aliases.html