What is the best way to update cache in Elasticsearch?

I'm using an Elasticsearch index as a cache table. My document structure is the following:

{
    "mappings": {
        "dynamic": false,
        "properties": {
            "query_str": {"type": "text"},
            "search_results": {
                "type": "object",
                "enabled": false
            },
            "query_embedding": {
                "type": "dense_vector",
                "dims": 768
            }
        }
    }
}
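To make the structure concrete, here is a minimal sketch of how a document for this mapping could be assembled on the client side. The helper name `make_cache_doc` and the shape of `search_results` are assumptions made for illustration only; the one constraint taken from the mapping is that the embedding has 768 dimensions.

```python
from typing import Any, Dict, List

EMBEDDING_DIMS = 768  # same value as "dims" in the mapping


def make_cache_doc(
    query_str: str,
    query_embedding: List[float],
    search_results: Dict[str, Any],
) -> Dict[str, Any]:
    """Build one cache document (hypothetical helper, not an Elasticsearch API)."""
    if len(query_embedding) != EMBEDDING_DIMS:
        raise ValueError(
            f"expected {EMBEDDING_DIMS} dimensions, got {len(query_embedding)}"
        )
    return {
        "query_str": query_str,
        # Stored as an opaque payload, since the mapping sets "enabled": false.
        "search_results": search_results,
        "query_embedding": query_embedding,
    }
```

Checking the vector length before indexing is a design choice of this sketch: it surfaces a wrong-sized embedding in the client rather than as an indexing error.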

The cache search is performed via embedding vector similarity: if the embedding of a new query is close enough to a cached one, it is considered a cache hit, and the search_results field is returned to the user.
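The hit-or-miss rule described here can be sketched in plain Python. This is only an illustration of the logic, not the actual query: in practice the similarity would be computed by Elasticsearch itself. The choice of cosine similarity, the 0.9 threshold, and the function names are all assumptions.

```python
import math
from typing import Any, Dict, List, Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def lookup(
    query_embedding: Sequence[float],
    cached_docs: List[Dict[str, Any]],
    threshold: float = 0.9,  # assumed value; tune for the embedding model
) -> Optional[Any]:
    """Return search_results of the closest cached query, or None on a cache miss."""
    best_doc = None
    best_score = threshold
    for doc in cached_docs:
        score = cosine_similarity(query_embedding, doc["query_embedding"])
        if score >= best_score:
            best_doc, best_score = doc, score
    return None if best_doc is None else best_doc["search_results"]
```

A query whose best match scores below the threshold is treated as a miss, and the caller would then run the real search and store the result.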

The problem is that I need to update the cached results about once an hour. I don't want my service to lose the ability to use the cache efficiently during the update, so I'm not sure which of these solutions is best:

  1. Sequentially update documents one by one, so the index is never destroyed. The drawback, I'm afraid, is that every update causes index rebuilding, so cache requests will become slow.
  2. Create an entirely new index with the new results and then somehow swap the current cache index with the new one. The drawbacks I see are: a) I've found no elegant way to swap indexes; b) users will get their refreshed cached results later than in solution (1).

I would go with #2, as every time you update a document the cache is flushed.

There is an elegant way to swap indices:

You have an alias that points to your current index. You fill a new index with the fresh records, and then you point the alias to the new index.

Something like this:

  1. The current index name is items-2022-11-26-001.
  2. Create an alias items pointing to items-2022-11-26-001:
POST _aliases
{
  "actions": [
    {
      "add": {
        "index": "items-2022-11-26-001",
        "alias": "items"
      }
    }
  ]
}
  3. Create a new index with fresh data, items-2022-11-26-002.
  4. When it finishes, point the items alias to items-2022-11-26-002:
POST _aliases
{
  "actions": [
    {
      "remove": {
        "index": "items-2022-11-26-001",
        "alias": "items"
      }
    },
    {
      "add": {
        "index": "items-2022-11-26-002",
        "alias": "items"
      }
    }
  ]
}
  5. Delete items-2022-11-26-001.

You run all your queries against the "items" alias, which acts as an index.
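For an hourly refresh, the index names and the swap request can be generated rather than written by hand. The sketch below only builds the name of the next index and the JSON body for POST _aliases; it does not talk to a cluster. The helper names are hypothetical, and the naming scheme (a trailing zero-padded counter) is an assumption based on the example names above.

```python
import json
import re
from typing import Any, Dict


def next_index_name(current: str) -> str:
    """Increment the trailing counter, keeping its zero padding."""
    match = re.fullmatch(r"(.*-)(\d+)", current)
    if match is None:
        raise ValueError(f"index name has no trailing counter: {current!r}")
    prefix, counter = match.groups()
    return f"{prefix}{int(counter) + 1:0{len(counter)}d}"


def build_alias_swap(alias: str, old_index: str, new_index: str) -> Dict[str, Any]:
    """Body for POST _aliases that moves `alias` from old_index to new_index."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


if __name__ == "__main__":
    old = "items-2022-11-26-001"
    new = next_index_name(old)
    # Print the request body; sending it is left to whatever HTTP client is in use.
    print(json.dumps(build_alias_swap("items", old, new), indent=2))
```

Putting the remove and add actions in a single request matters: Elasticsearch applies the actions of one _aliases call atomically, so queries against the alias should never see a moment where it points to no index.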

References:

https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-aliases.html
