
AMPLab Shark on Apache Spark

As per the documentation:

"Apache Spark is a fast and general engine for large-scale data processing."

"Shark is an open source distributed SQL query engine for Hadoop data."

And Shark uses Spark as a dependency.

My question is: does Shark just parse HiveQL into Spark jobs, or does it do anything more that makes it fast for analytical queries?

Yes, Shark uses the same idea as Hive, but translates HiveQL into Spark jobs instead of MapReduce jobs. Please read pages 13-14 of this document for the architectural differences between the two.
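To make the translation concrete, here is a minimal sketch of what a Shark program looked like. This is illustrative only: the class and method names (`SharkEnv.initWithSharkContext`, `sql`, `sql2rdd`) are recalled from Shark's examples and may differ between versions, and the snippet needs a full Shark/Hive/Spark deployment to actually run.

```scala
// Hypothetical sketch of the Shark query API -- names are assumptions,
// not verified against a specific Shark release.
import shark.{SharkContext, SharkEnv}

object SharkSketch {
  def main(args: Array[String]): Unit = {
    // Shark wraps a SparkContext. HiveQL strings go through the Hive
    // parser and metastore, but the physical plan runs as Spark stages
    // rather than MapReduce jobs.
    val sc: SharkContext = SharkEnv.initWithSharkContext("shark-example")

    // Plain HiveQL -- the same statement Hive itself would accept.
    sc.sql("CREATE TABLE IF NOT EXISTS logs (status INT, url STRING)")

    // sql2rdd exposes the query result as a Spark RDD, so it can be
    // cached in cluster memory and reused by later queries. This
    // in-memory reuse, plus avoiding per-job MapReduce launch overhead,
    // is the main source of Shark's speedup on analytical workloads.
    val errors = sc.sql2rdd("SELECT url FROM logs WHERE status >= 500")
    errors.cache()
    println(errors.count())
  }
}
```

So the answer to the question is "both": the HiveQL-to-Spark-job translation is real, but the practical win comes from keeping intermediate results as cached RDDs between queries.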
