Table not found error while loading DataFrame into a Hive partition

Question

I am trying to insert data into a Hive table like this:

val partfile = sc.textFile("partfile")
val partdata = partfile.map(p => p.split(","))
val partSchema = StructType(Array(StructField("id",IntegerType,true),StructField("name",StringType,true),StructField("salary",IntegerType,true),StructField("dept",StringType,true),StructField("location",StringType,true)))
val partRDD = partdata.map(p => Row(p(0).toInt,p(1),p(2).toInt,p(3),p(4)))
val partDF = sqlContext.createDataFrame(partRDD, partSchema)

Packages I imported:

import org.apache.spark.sql.Row
import org.apache.spark.sql.types.{StructType,StructField,StringType,IntegerType}
import org.apache.spark.sql.types._

This is how I tried to insert the dataframe into Hive partition:

partDF.write.mode(SaveMode.Append).partitionBy("location").insertInto("parttab")

I'm getting the error below even though the Hive table exists:

org.apache.spark.sql.AnalysisException: Table not found: parttab;

Could anyone tell me what mistake I am making here and how I can correct it?

Answer 1

To write data to the Hive warehouse, you need to initialize a HiveContext instance.

When you do that, it reads the configuration from hive-site.xml (on the classpath) and connects to the underlying Hive warehouse.

HiveContext is an extension of SQLContext that adds support for connecting to Hive.

To do so, try this:

val hc = new HiveContext(sc)

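Then run your append query on this instance. Create partDF with hc.createDataFrame rather than sqlContext.createDataFrame, so that the temporary table is registered in the same context that runs the query.

partDF.registerTempTable("temp")

hc.sql(".... <normal SQL query to pick data from table `temp` and insert it into the Hive table> ....")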
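To make the placeholder query concrete, here is a minimal sketch of the whole flow against the Spark 1.x API. It assumes that parttab already exists in Hive with columns id, name, salary, dept and is partitioned by location (the question does not show the table definition), and that dynamic partitioning has to be switched on for the session. The object name HivePartitionInsert and the run helper are made up for illustration.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.sql.Row
import org.apache.spark.sql.hive.HiveContext
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

// Hypothetical wrapper object; only the calls inside it are Spark API.
object HivePartitionInsert {
  // Same schema as in the question.
  val partSchema: StructType = StructType(Array(
    StructField("id", IntegerType, true),
    StructField("name", StringType, true),
    StructField("salary", IntegerType, true),
    StructField("dept", StringType, true),
    StructField("location", StringType, true)))

  // Dynamic-partition insert: the partition column comes last in the SELECT.
  // Assumes parttab is partitioned by `location` (not shown in the question).
  val insertSql: String =
    "INSERT INTO TABLE parttab PARTITION (location) " +
      "SELECT id, name, salary, dept, location FROM temp"

  def run(sc: SparkContext, inputPath: String): Unit = {
    val hc = new HiveContext(sc)
    // Allow Hive to derive partition values from the data itself.
    hc.setConf("hive.exec.dynamic.partition", "true")
    hc.setConf("hive.exec.dynamic.partition.mode", "nonstrict")

    val partRDD = sc.textFile(inputPath)
      .map(_.split(","))
      .map(p => Row(p(0).toInt, p(1), p(2).toInt, p(3), p(4)))

    // Build the DataFrame from hc so the temp table lives in the Hive-aware context.
    val partDF = hc.createDataFrame(partRDD, partSchema)
    partDF.registerTempTable("temp")
    hc.sql(insertSql)
  }
}
```

From spark-shell this could be called as HivePartitionInsert.run(sc, "partfile"). If parttab lives outside the default database, the table name inside insertSql would need the database prefix.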

Please make sure that the table parttab is in the default database.

If the table is in another database, specify the table name as <db-name>.parttab.
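If you are not sure which database holds the table, a small lookup like the sketch below may help. TableLookup and qualify are hypothetical names, not part of Spark. The function only takes a way to list tables per database, so it can be tried without a cluster. With a HiveContext, that listing could come from hc.tableNames(db), as shown in the comment.

```scala
// Hypothetical helper, not part of Spark.
object TableLookup {
  // Returns "<db>.<table>" for the first database whose table list contains
  // `table` (compared case-insensitively), or None if no database has it.
  def qualify(table: String,
              databases: Seq[String],
              tablesIn: String => Seq[String]): Option[String] =
    databases
      .find(db => tablesIn(db).exists(_.equalsIgnoreCase(table)))
      .map(db => s"$db.$table")
}

// Possible use with a HiveContext named hc ("mydb" is a made-up database name):
//   TableLookup.qualify("parttab", Seq("default", "mydb"), db => hc.tableNames(db).toSeq)
```

If the result is None, the context you are querying does not see the table at all, which usually points back to the hive-site.xml / HiveContext setup rather than to the table name.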

If you need to save the DataFrame directly into Hive, use this:

df.write.saveAsTable("<db-name>.parttab")
