
In Elasticsearch, should I create a single index with multiple types, or multiple indexes with a single type each?

I am new to Elasticsearch and am using it for big data.

There are no join queries in my application, so which structure is best for it?

I have been working with Elasticsearch for the past few days, and I would like to share my experience and what I have learned.

1) If you are moving from a relational database such as MySQL (or another SQL database) to ES, you need to maintain the relations among your data yourself. Declare a primary key in each type or index; you can then build your Query DSL queries on that key.

2) If you are dealing with millions of records every day, you need to design for that. Some people prefer a time-based structure, with one index per day, week, or month. It depends entirely on your use case. For a large data set (~1 TB) you need to distribute your data across multiple indexes and shards.
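As a rough illustration of the day/week/month structure, here is a minimal Python sketch that derives an index name from a date. The function name `index_name`, the `events` prefix, and the naming pattern are all assumptions made for this example, not an Elasticsearch convention you are required to follow.

```python
from datetime import date


def index_name(prefix: str, day: date, granularity: str = "day") -> str:
    """Build a hypothetical time-based index name for the given date."""
    if granularity == "day":
        return f"{prefix}-{day:%Y.%m.%d}"
    if granularity == "week":
        # ISO week numbering: weeks start on Monday.
        iso_year, iso_week, _ = day.isocalendar()
        return f"{prefix}-{iso_year}.w{iso_week:02d}"
    if granularity == "month":
        return f"{prefix}-{day:%Y.%m}"
    raise ValueError(f"unknown granularity: {granularity}")


if __name__ == "__main__":
    d = date(2024, 3, 15)
    for g in ("day", "week", "month"):
        print(index_name("events", d, g))
```

Each document would then be written to the index whose name matches its timestamp, so old data can be dropped by deleting whole indexes.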

3) If you have a small data set, it will work with the default settings too (5 shards, 1 replica; this was the default before Elasticsearch 7.0). You will get better performance when the data set in each shard is small.
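If you want to set the shard and replica counts explicitly rather than rely on defaults, the index settings body can be sketched as below. `number_of_shards` and `number_of_replicas` are the Elasticsearch index settings; the helper name `index_settings` is an assumption for this example, and the body is only built and printed here, not sent to a cluster.

```python
import json


def index_settings(shards: int = 5, replicas: int = 1) -> dict:
    """Build the settings body for a create-index request (hypothetical helper)."""
    if shards < 1 or replicas < 0:
        raise ValueError("shards must be >= 1 and replicas >= 0")
    return {
        "settings": {
            "index": {
                "number_of_shards": shards,      # fixed once the index is created
                "number_of_replicas": replicas,  # can be changed later
            }
        }
    }


if __name__ == "__main__":
    # This JSON would be the body of a create-index request.
    print(json.dumps(index_settings(), indent=2))
```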

4) A JOIN query can be expensive in Elasticsearch, and if you run one frequently it can put pressure on your heap. So I would suggest preparing your data set as pre-cooked data (the result you would get from a join query in a relational database), with each document having a unique ID. You can refer to this. See here for how a JOIN can be performed.
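A minimal sketch of what "pre-cooked" data could look like: the join is done once, before indexing, and each resulting document carries a unique ID. The `users` and `orders` records, the field names, and the `order-<id>` ID scheme are all made up for illustration.

```python
def denormalize(orders: list[dict], users: list[dict]) -> list[dict]:
    """Join orders to users in application code and emit one flat document per order."""
    users_by_id = {u["user_id"]: u for u in users}
    docs = []
    for order in orders:
        user = users_by_id.get(order["user_id"])
        if user is None:
            continue  # behaves like an inner join: skip orders with no matching user
        docs.append({
            "_id": f"order-{order['order_id']}",  # hypothetical unique document ID
            "order_id": order["order_id"],
            "total": order["total"],
            "user": {"id": user["user_id"], "name": user["name"]},
        })
    return docs


if __name__ == "__main__":
    users = [{"user_id": 1, "name": "Alice"}, {"user_id": 2, "name": "Bob"}]
    orders = [
        {"order_id": 100, "user_id": 1, "total": 25.0},
        {"order_id": 101, "user_id": 2, "total": 40.0},
        {"order_id": 102, "user_id": 9, "total": 5.0},  # no matching user
    ]
    for doc in denormalize(orders, users):
        print(doc)
```

Documents shaped like this can be searched directly, at the cost of re-indexing the affected documents when the embedded user data changes.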

5) Here are some points to take care of while designing your index:

  1. Don't treat Elasticsearch like a database
  2. Know your use case BEFORE you jump in
  3. Organize your data wisely
  4. Make smart use of replicas
  5. Base your capacity plans on experiment

6) A wrong architecture can force you to reindex, which is costly and can involve downtime. Check out this article to learn about index design and best practices.


 