Is there a way to optimize this CrateDB relational query?

I am testing CrateDB with a data set of 80 million events sent from a web app, both as a normalized, relational solution and as a denormalized, single database solution.

I imported all 80 million denormalized events into a table, and ran the following aggregation query:

select productName, SUM(elapsed)/60 as total_minutes from denormalized
where country_code = 'NL' AND eventType = 'mediaPlay' 
group by productName
order by total_minutes desc
limit 1000;

and the query took 0.009 seconds. Wowza! CrateDB is blazing fast!

Then I imported the session-wide docs into one table called "sessions", and all the individual event docs in each session into another table called "events", and ran the following query:

select e.productName, SUM(e.elapsed)/60 as total_minutes from sessions s
join events e ON e.sessionGroup = s.sessionGroup
where s.country_code = 'NL' AND e.eventType = 'mediaPlay' 
group by e.productName
order by total_minutes desc
limit 1000;

which took 21 seconds.

My question is, is there any way to get faster relational performance, maybe by creating indexes, or changing the query somehow?

Tangential thought: We have been using Elasticsearch for analytics, obviously denormalizing the data, and it's plenty fast, but CrateDB seems to offer everything Elasticsearch does (fast queries on denormalized data, clustering, dynamic schema, full text search), plus these additional advantages:

  • better SQL support
  • the option to deploy relational solutions on small data sets (wonderful to standardize on one DB, no context-switching or ramp-up for developers who know SQL).

What CrateDB version are you using? If it is < 3.0, then upgrading will probably boost the join query a lot; see https://crate.io/a/lab-notes-how-we-made-joins-23-thousand-times-faster-part-three/ .
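
If you are not sure which version the cluster is on, a query along these lines should report it per node. This is only a sketch: it assumes the sys.nodes system table exposes a version object with a number field, and the crate_version alias is just an illustrative name.

```sql
-- List each node together with the CrateDB version it is running.
-- Assumption: sys.nodes has a "version" object column with a 'number' key.
select name, version['number'] as crate_version
from sys.nodes;
```

If any node reports a version below 3.0, the join improvements described in the linked post would not apply yet.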
