
Hive version compatible with Spark

Every day I get more and more confused. I am learning to use Spark with Hive, and every tutorial I find on the internet explains the relationship only vaguely. First of all, what does it mean when people say Hive is compatible with Spark? I downloaded a prebuilt Spark, whose version is 2.1.1, and I downloaded Hive 2.1.1. My goal is to access the Hive metastore from Spark, but every time I run a Spark query I get:

Caused by: java.lang.reflect.InvocationTargetException
Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient

Which, according to this website, means:

If you have a metastore version mismatch, either or both of the last two SQL statements will result in this error message: Error: java.lang.reflect.InvocationTargetException (state=,code=0)

Where I am confused is this: when people talk about Hive/Spark compatibility, do they mean the Spark version and the Hive version (which in my case are both 2.1.1, yet I am still getting this error), or do they mean the metastore database schema version and the version of the hive-metastore jar inside the spark/jars folder?
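
For what it's worth, those two versions can each be checked directly. A minimal sketch, assuming a MySQL-backed metastore (the -dbType value and the paths are assumptions; adjust them to your installation):

# Schema version recorded in the metastore database
$HIVE_HOME/bin/schematool -dbType mysql -info

# Hive client jars bundled with this Spark build
ls $SPARK_HOME/jars | grep hive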

Right now my Hive metastore schema version is 2.1.0, and I have hive-metastore-1.2.1.spark2.jar, so do I need to change the metastore schema version to 1.2.1? According to this website:

For handling Spark 2.1.0, which is currently shipped with Hive 1.2 jar, users need to use a Hive remote metastore service (hive.metastore.uris), where metastore service is started with hive.metastore.schema.verification as TRUE for any Spark SQL context. This will force the Spark Client to talk to a higher version of the Hive metastore (like Hive 2.1.0), using lower Hive jars (like Hive 1.2), without modifying or altering the existing Hive schema of the metastore database.
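
For reference, that arrangement corresponds to roughly the following configuration; this is a sketch only, with the host and port as placeholders. On the machine running the metastore service, hive-site.xml enables verification:

<property>
  <name>hive.metastore.schema.verification</name>
  <value>true</value>
</property>

and the service is started with hive --service metastore. On the Spark side, hive-site.xml (copied into spark/conf) points at that running service:

<property>
  <name>hive.metastore.uris</name>
  <value>thrift://metastore-host:9083</value>
</property>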

I do have hive.metastore.schema.verification set to true and still get the same error. Also, please take your time to check the Spark website, where they say:

spark.sql.hive.metastore.version 1.2.1 (Version of the Hive metastore. Available options are 0.12.0 through 1.2.1.)
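
To illustrate how those settings are meant to be used together with a remote metastore, here is a minimal Scala sketch; the host is a placeholder, and whether hive.metastore.uris can be passed through the builder rather than through hive-site.xml is an assumption to verify against your deployment:

import org.apache.spark.sql.SparkSession

// Spark 2.1.1 ships Hive 1.2 client jars, so the client-side metastore
// version stays at 1.2.1 even when the remote service runs a newer schema.
val spark = SparkSession.builder()
  .appName("HiveMetastoreCheck")
  .config("spark.sql.hive.metastore.version", "1.2.1") // highest value Spark 2.1.1 accepts
  .config("spark.sql.hive.metastore.jars", "builtin")  // use the bundled Hive 1.2 jars
  .config("hive.metastore.uris", "thrift://metastore-host:9083") // placeholder host
  .enableHiveSupport()
  .getOrCreate()

spark.sql("SHOW DATABASES").show() // fails with the error above on a version mismatch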

Wrapping up, my goals are to 1) understand the meaning behind "Hive compatible with Spark" and 2) connect to the Hive metastore using Spark. Please try to elaborate in your answer, or be kind enough to provide a link where I can find my answers. I am genuinely confused.

Hive with Spark: If you run into an error related to the metastore version, you should set the following metastore jars and version in spark-defaults.conf, or pass them at submit time, with each conf as a separate argument: --conf spark.sql.hive.metastore.jars=/home/hadoopuser/hivemetastorejars/* --conf spark.sql.hive.metastore.version=2.3.0
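
As a full submit command, that advice looks roughly like this (the class name and application jar are hypothetical, and the 2.3.0 value requires a Spark release that supports Hive 2.3 metastore clients):

spark-submit \
  --conf spark.sql.hive.metastore.jars=/home/hadoopuser/hivemetastorejars/* \
  --conf spark.sql.hive.metastore.version=2.3.0 \
  --class com.example.MyApp \
  my-app.jar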
