Elasticsearch: better to have more values or more fields?

Suppose you have an index with documents describing vehicles.

Your index needs to deal with two different types of vehicles: motorcycles and cars.

Which of the following mappings is better from a performance point of view? (The nested type is required for my purposes.)

    "vehicle": {
        "type": "nested",
        "properties": {
            "car": {
                "properties": {
                    "model": {
                        "type": "string"
                    },
                    "cost": {
                        "type": "integer"
                    }
                }
            },
            "motorcycle": {
                "properties": {
                    "model": {
                        "type": "string"
                    },
                    "cost": {
                        "type": "integer"
                    }
                }
            }
        }
    }

or this one:

    "vehicle": {
        "type": "nested",
        "properties": {
            "model": {
                "type": "string"
            },
            "cost": {
                "type": "integer"
            },
            "vehicle_type": {
                "type": "string"     ### "car", "motorcycle"
            }
        }
    }

The second one is more readable and more compact.

The drawback is that when I write my queries, if I want to focus only on "car", I need to add that condition to the query.

If I use the first mapping, I can address the relevant field directly, without adding overhead to the query.
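To make the difference concrete, here is a minimal sketch of the two query bodies, written as Python dictionaries that serialize to the query DSL. The helper names and the sample model value are hypothetical, and the `term` clauses assume the string fields are indexed in a way that allows exact-value matching.

```python
import json


def query_first_mapping(vehicle_kind, model):
    """Nested query for the mapping with separate car/motorcycle sub-objects."""
    return {
        "query": {
            "nested": {
                "path": "vehicle",
                "query": {
                    # The vehicle kind is encoded in the field name itself.
                    "term": {f"vehicle.{vehicle_kind}.model": model}
                },
            }
        }
    }


def query_second_mapping(vehicle_kind, model):
    """Nested query for the flat mapping with a vehicle_type discriminator."""
    return {
        "query": {
            "nested": {
                "path": "vehicle",
                "query": {
                    "bool": {
                        "must": [
                            {"term": {"vehicle.model": model}},
                            # Extra clause needed to restrict to one vehicle kind.
                            {"term": {"vehicle.vehicle_type": vehicle_kind}},
                        ]
                    }
                },
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(query_first_mapping("car", "golf"), indent=2))
    print(json.dumps(query_second_mapping("car", "golf"), indent=2))
```

With the second mapping, both clauses sit inside the same nested query so that they are evaluated against the same nested vehicle object.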

The first mapping, where cars and motorcycles are isolated in different fields, is more likely to be faster. The reason is that you have one less filter to apply, as you already know, and the queries are more selective (e.g. fewer documents match a given value of vehicle.car.model than the same value of vehicle.model).

Another option would be to create two distinct indexes, car and motorcycle, possibly with the same index template.

In Elasticsearch, a query is processed by a single thread per shard. That means that if you split your index in two and query both in a single request, the two searches will be executed in parallel.

So when you need to query only cars or only motorcycles, it's faster simply because each index is smaller. And when you query both cars and motorcycles, it could also be faster because more threads are used.
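Querying both indexes in a single request relies on Elasticsearch accepting a comma-separated list of index names in the search URL. The sketch below only builds the request line and body with the standard library. The `search_request` helper and the sample range query are hypothetical, and the index names `car` and `motorcycle` are the ones suggested above.

```python
import json


def search_request(indices, body):
    """Build the method, path and JSON body of a search across several indexes."""
    # Elasticsearch accepts a comma-separated list of index names in the URL.
    path = "/" + ",".join(indices) + "/_search"
    return "GET", path, json.dumps(body)


# Hypothetical query: vehicles costing at most 10000, whatever their kind.
body = {
    "query": {
        "nested": {
            "path": "vehicle",
            "query": {"range": {"vehicle.cost": {"lte": 10000}}},
        }
    }
}

method, path, payload = search_request(["car", "motorcycle"], body)
print(method, path)
```

Passing a single name, for example `search_request(["car"], body)`, targets only the smaller index.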

EDIT: one drawback of the latter option you should know about: the inner Lucene term dictionary will be duplicated, so if the values in cars and motorcycles are nearly identical, the list of indexed terms roughly doubles.
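As a rough illustration of that duplication, here is a toy model that treats a term dictionary as the set of distinct values in a field. It is not how Lucene actually stores terms, and the sample values are made up; it only shows why overlapping values cost more when split across two indexes.

```python
def term_dictionary(values):
    """Toy stand-in for a per-index term dictionary: the set of distinct terms."""
    return set(values)


# Hypothetical field values; the two vehicle kinds share most of their terms.
car_terms = term_dictionary(["red", "blue", "black", "white"])
motorcycle_terms = term_dictionary(["red", "blue", "black", "green"])

# One index: shared terms are stored once.
single_index_size = len(car_terms | motorcycle_terms)
# Two indexes: each keeps its own dictionary, so shared terms are stored twice.
two_indexes_size = len(car_terms) + len(motorcycle_terms)

print(single_index_size, two_indexes_size)
```

The more the two sets overlap, the closer the split layout gets to twice the size of the single dictionary.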
