Elasticsearch vs RDBMS for Aggregations/Reporting Data

Does anyone have experience switching between Elasticsearch and a relational DB like MySQL or Postgres? What are the pros and cons of each?

Background: I'm looking to build a dashboard UI that shows store/item-related metrics, and I need the right tool on the backend to provide flexibility in queries. (Imagine the UI has selectors for date ranges and then shows top items sold, total sales, etc. in different time-based charts.) One other note: we would only be using aggregations/nested aggregations around stores or items, and wouldn't be taking advantage of text search.

I know you could use both, but which one is preferable in terms of:

  1. performance? I imagine that they would be largely similar.
  2. durability? I imagine Elasticsearch, since it automatically replicates data.
  3. maintenance? I imagine Elasticsearch would be worse (maintaining a cluster vs maintaining a single node).
  4. cost? I imagine an Elasticsearch cluster storing the same amount of data would cost more because of replication.
  5. development work? I imagine Elasticsearch would make development take longer, since it means using Elasticsearch's custom queries instead of writing APIs around SQL queries.

Are these assumptions correct? Are there other DBs/data stores that I should consider over these two options?

Based on my experience, Elasticsearch is a superb tool for:

  1. Search
  2. Real-time data aggregation
  3. Real-time reporting with extensive filtering support

We also use Elasticsearch to power our real-time reports, which have extensive filter options (date range, status, etc.).

We compared the aggregation performance of ES and MongoDB on similar sets of machines: aggregating 5 million records took MongoDB around 12 seconds, while ES took under 1 second.

performance? I imagine that they would be largely similar

If you have a pure aggregation use case over large volumes of data that requires extensive filtering, searching, etc., then the performance of ES would be hard to beat.
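To make the "aggregation with filtering" case concrete, here is a minimal sketch of the kind of search body a dashboard like the one in the question might send. The field names (`sold_at`, `item_id`, `amount`) and the function name are assumptions for illustration, not something from the question, and `item_id` is assumed to be mapped as a keyword field. The code only builds the request body; sending it with a client is left out.

```python
import json


def build_sales_dashboard_query(start_date: str, end_date: str, top_n: int = 10) -> dict:
    """Build an Elasticsearch search body for a date-filtered sales dashboard.

    All field names here are hypothetical.
    """
    return {
        "size": 0,  # only aggregation results are needed, not individual hits
        "query": {
            "bool": {
                "filter": [
                    # Date-range selector from the UI
                    {"range": {"sold_at": {"gte": start_date, "lte": end_date}}}
                ]
            }
        },
        "aggs": {
            # Total sales over the selected range
            "total_sales": {"sum": {"field": "amount"}},
            # Top items by summed sales (a nested aggregation)
            "top_items": {
                "terms": {
                    "field": "item_id",
                    "size": top_n,
                    "order": {"item_sales": "desc"},
                },
                "aggs": {"item_sales": {"sum": {"field": "amount"}}},
            },
            # Per-day buckets for a time-based chart
            "sales_over_time": {
                "date_histogram": {"field": "sold_at", "calendar_interval": "day"},
                "aggs": {"daily_sales": {"sum": {"field": "amount"}}},
            },
        },
    }


body = build_sales_dashboard_query("2024-01-01", "2024-01-31", top_n=5)
print(json.dumps(body, indent=2))
```

All three metrics come back from a single request, which is a large part of why this style of query suits filter-heavy dashboards.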

durability? I imagine Elasticsearch, since it automatically replicates data

Yes, ES has built-in replication support, as it is a distributed system.

maintenance? I imagine elasticsearch would be worse (maintaining a cluster vs maintaining a single node)

Distributed systems definitely demand more maintenance, but you can also use a hosted version of ES (e.g. the managed offering on AWS, Amazon OpenSearch Service, formerly Amazon Elasticsearch Service).

cost? I imagine an elasticsearch cluster storing the same amount of data would cost more because of replication

Since a cluster with replication is required as well, the infrastructure cost will be higher.
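As a rough back-of-the-envelope illustration of the replication cost: each replica is a full copy of the primary data, so raw storage scales with `1 + number_of_replicas` (the index setting that controls replica count). The function below is a hypothetical helper, and it deliberately ignores compression, index overhead and node headroom, so treat the result as a lower-bound sketch rather than a sizing guide.

```python
def estimated_storage_gb(primary_data_gb: float, number_of_replicas: int) -> float:
    """Estimate raw cluster storage: primaries plus one full copy per replica."""
    if number_of_replicas < 0:
        raise ValueError("number_of_replicas must be >= 0")
    return primary_data_gb * (1 + number_of_replicas)


# 100 GB of primary data with one replica needs roughly twice the disk.
print(estimated_storage_gb(100, 1))
```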

development work? I imagine elasticsearch would cause development to take longer using elasticsearch's custom queries vs writing APIs around sql queries

It depends on your experience with ES. Since MySQL has been around for a long time, most developers are skilled with it. Any new technology has its learning curve.
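To make the SQL side concrete, the "top items sold in a date range" metric is a plain GROUP BY query. The sketch below uses Python's built-in `sqlite3` as a stand-in for MySQL/Postgres so it runs anywhere; the `sales` table, its columns and the sample rows are assumptions for illustration.

```python
import sqlite3

# In-memory database standing in for a real MySQL/Postgres instance.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (item_id TEXT, store_id TEXT, amount REAL, sold_at TEXT)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("apple", "s1", 3.0, "2024-01-05"),
        ("apple", "s2", 2.0, "2024-01-06"),
        ("pear", "s1", 4.0, "2024-01-06"),
        ("pear", "s1", 9.0, "2024-02-01"),  # falls outside the January range
    ],
)


def top_items(conn, start_date, end_date, limit=10):
    """Return (item_id, summed sales) pairs for the range, highest first."""
    cursor = conn.execute(
        """
        SELECT item_id, SUM(amount) AS item_sales
        FROM sales
        WHERE sold_at BETWEEN ? AND ?
        GROUP BY item_id
        ORDER BY item_sales DESC
        LIMIT ?
        """,
        (start_date, end_date, limit),
    )
    return cursor.fetchall()


result = top_items(conn, "2024-01-01", "2024-01-31")
print(result)
```

Dates are stored as ISO-8601 strings here so that `BETWEEN` compares them correctly; a real schema would more likely use a native date or timestamp column.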

Keep in mind:

  1. ES is not an ACID-compliant datastore.
  2. There is no transaction support. If your system is purely transactional, you may need a relational DB as the read/write store and ES for powering the aggregation use cases.
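One way to picture the "relational DB for reads/writes, ES for aggregations" split: rows written to the relational store are periodically copied into an ES index. The sketch below only shows the formatting step, turning rows into the newline-delimited JSON body that the `_bulk` endpoint expects (an action line followed by the document, with a trailing newline). The row shape, the `sales` index name and the function name are assumptions; how and when the sync runs is out of scope here.

```python
import json


def to_bulk_ndjson(rows, index_name="sales"):
    """Format row dicts as an Elasticsearch _bulk request body (NDJSON).

    Each row is assumed to carry an "id" key, reused as the document _id so
    that re-syncing the same row overwrites it instead of duplicating it.
    """
    lines = []
    for row in rows:
        action = {"index": {"_index": index_name, "_id": str(row["id"])}}
        document = {key: value for key, value in row.items() if key != "id"}
        lines.append(json.dumps(action))
        lines.append(json.dumps(document))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"


# Hypothetical rows, e.g. fetched from the relational store.
rows = [
    {"id": 1, "item_id": "apple", "store_id": "s1", "amount": 3.0, "sold_at": "2024-01-05"},
    {"id": 2, "item_id": "pear", "store_id": "s1", "amount": 4.0, "sold_at": "2024-01-06"},
]
payload = to_bulk_ndjson(rows)
print(payload)
```

Because the relational DB stays the source of truth, the ES index can be rebuilt from it at any time, which sidesteps the lack of transactions on the ES side.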
