
Spark submit throws error while using Hive tables

I have a strange error. I am trying to write data to Hive; it works well in spark-shell, but when I use spark-submit, it throws a "database/table not found in default" error.

Following is the code I am trying to run with spark-submit. I am using a custom build of Spark 2.0.0.

val sqlContext = new org.apache.spark.sql.SQLContext(sc)
sqlContext.table("spark_schema.iris_ori")

Following is the command I am using:

/home/ec2-user/Spark_Source_Code/spark/bin/spark-submit --class TreeClassifiersModels --master local[*] /home/ec2-user/Spark_Snapshots/Spark_2.6/TreeClassifiersModels/target/scala-2.11/treeclassifiersmodels_2.11-1.0.3.jar /user/ec2-user/Input_Files/defPath/iris_spark SPECIES~LBL+PETAL_LENGTH+PETAL_WIDTH RAN_FOREST 0.7 123 12

Following is the error:

16/05/20 09:05:18 INFO SparkSqlParser: Parsing command: spark_schema.measures_20160520090502
Exception in thread "main" org.apache.spark.sql.AnalysisException: Database 'spark_schema' does not exist;
    at org.apache.spark.sql.catalyst.catalog.ExternalCatalog.requireDbExists(ExternalCatalog.scala:37)
    at org.apache.spark.sql.catalyst.catalog.InMemoryCatalog.tableExists(InMemoryCatalog.scala:195)
    at org.apache.spark.sql.catalyst.catalog.SessionCatalog.tableExists(SessionCatalog.scala:360)
    at org.apache.spark.sql.DataFrameWriter.saveAsTable(DataFrameWriter.scala:464)
    at org.apache.spark.sql.DataFrameWriter.saveAsTable(DataFrameWriter.scala:458)
    at TreeClassifiersModels$.main(TreeClassifiersModels.scala:71)
    at TreeClassifiersModels.main(TreeClassifiersModels.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:726)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:183)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:208)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:122)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

The issue was caused by a deprecation in Spark 2.0.0: HiveContext was deprecated there. To read and write Hive tables on Spark 2.0.0, we need to use a SparkSession as follows.

val sparkSession = SparkSession.withHiveSupport(sc)
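
Note that withHiveSupport appears only in pre-release 2.0.0 snapshot builds; in the released Spark 2.0.0 API, the equivalent is the builder pattern with enableHiveSupport(), which replaces the InMemoryCatalog seen in the stack trace with the Hive metastore catalog. A minimal sketch, assuming your Spark build includes Hive support and reusing the table names from the question:

import org.apache.spark.sql.SparkSession

// Build a session backed by the Hive metastore rather than the default
// in-memory catalog (which is why 'spark_schema' was not found above).
val spark = SparkSession.builder()
  .appName("TreeClassifiersModels")
  .enableHiveSupport() // requires Hive classes on the classpath
  .getOrCreate()

// Reads and writes now resolve table names against the Hive metastore.
val iris = spark.table("spark_schema.iris_ori")
iris.write.mode("overwrite").saveAsTable("spark_schema.measures_20160520090502")

This also explains why the code worked in spark-shell: the shell creates its session with Hive support already enabled, whereas the submitted job constructed a plain SQLContext backed by the in-memory catalog.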
