
Can someone explain this: "Spark SQL supports a different use case than Hive."

I am referring to the following link: Hive Support for Spark

It says:

"Spark SQL supports a different use case than Hive."

I am not sure why that would be the case. Does this mean that, as a Hive user, I cannot use the Spark execution engine through Spark SQL?

Some Questions:

  • Spark SQL uses the Hive query parser, so ideally it should support all of Hive's functionality. Is that correct?
  • Will it use the Hive metastore?
  • Will Hive use the Spark optimizer, or will it build its own?
  • Will Hive translate MR jobs into Spark jobs, or use some other paradigm?

Spark SQL is intended to allow the use of SQL expressions on top of Spark's machine learning libraries. It lets you use SQL as one tool among others for building advanced analytic (e.g., ML) applications. It is not a drop-in replacement for Hive, which is really best at batch processing/ETL.
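
To make that concrete, here is a minimal sketch (not from the original answer) of the kind of workflow described: SQL handles the relational part of a pipeline and MLlib handles the modeling, all inside one Spark application. The table name `events`, its columns, and the `local[*]` master are hypothetical, and this uses the modern `SparkSession` entry point rather than the older `HiveContext`:

```scala
import org.apache.spark.ml.feature.VectorAssembler
import org.apache.spark.ml.regression.LinearRegression
import org.apache.spark.sql.SparkSession

object SqlPlusMl {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("sql-plus-ml")
      .master("local[*]") // hypothetical; set to your cluster manager
      .getOrCreate()
    import spark.implicits._

    // Build a small DataFrame and expose it to SQL under a temp view name.
    val events = Seq((1.0, 2.0, 5.0), (2.0, 1.0, 4.0), (3.0, 3.0, 9.0))
      .toDF("x1", "x2", "label")
    events.createOrReplaceTempView("events")

    // Use SQL for the relational slice of the pipeline...
    val training = spark.sql("SELECT x1, x2, label FROM events WHERE label > 0")

    // ...then hand the result to MLlib for the modeling slice.
    val features = new VectorAssembler()
      .setInputCols(Array("x1", "x2"))
      .setOutputCol("features")
      .transform(training)

    val model = new LinearRegression().fit(features)
    println(s"coefficients: ${model.coefficients}")

    spark.stop()
  }
}
```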

However, there is also work ongoing upstream to allow Spark to serve as a general data processing backend for Hive. That work is what would allow you to take full advantage of Spark for Hive use cases specifically.
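
As a side note on the metastore question above: Spark SQL can already read tables registered in an existing Hive metastore. A hedged sketch, assuming Hive support is on the classpath and a pre-existing (hypothetical) Hive table named `web_logs`:

```scala
import org.apache.spark.sql.SparkSession

object HiveMetastoreDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("hive-metastore-demo")
      .enableHiveSupport() // resolve table metadata via the Hive metastore
      .getOrCreate()

    // The query executes on Spark's engine, not MapReduce.
    spark.sql("SELECT COUNT(*) FROM web_logs").show()

    spark.stop()
  }
}
```

This is distinct from "Hive on Spark" (the upstream work mentioned above), which is configured on the Hive side, e.g. by setting hive.execution.engine=spark, so that Hive itself submits Spark jobs instead of MapReduce jobs.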
