
How do I create a Hive external table from Avro files written using Databricks?

The code below shows how the data was written to HDFS using Scala. What is the HQL syntax to create a Hive table to query this data?

import com.databricks.spark.avro._
// write the DataFrame out as Avro files under the given HDFS path
val path = "/user/myself/avrodata"
dataFrame.write.avro(path)

The examples I find require providing an avro.schema.literal describing the schema, or an avro.schema.url pointing to an actual Avro schema file.
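For context, the pattern those examples use looks roughly like the sketch below (the table name, column names, and schema are hypothetical); the schema literal has to be kept in sync with the files by hand, which is the maintenance burden I want to avoid:

CREATE EXTERNAL TABLE my_avro_table
STORED AS AVRO
LOCATION '/user/myself/avrodata'
TBLPROPERTIES ('avro.schema.literal'='{
  "type": "record",
  "name": "MyRecord",
  "fields": [
    {"name": "id",   "type": "long"},
    {"name": "name", "type": "string"}
  ]
}');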

In the spark-shell, all I need to do to read this data is:

scala> import com.databricks.spark.avro._
scala> val df = sqlContext.read.avro("/user/myself/avrodata")
scala> df.show()

So I cheated to get this to work: I registered the data frame as a temporary table, then used an HQL CREATE TABLE ... AS SELECT to create and populate the Avro target table from it. This approach reuses the schema metadata of the temporary table. If the data frame can create a temporary table from its schema, why can it not save the table as Avro directly?

dataFrame.registerTempTable("my_tmp_table")
sqlContext.sql(s"CREATE TABLE ${schema}.${tableName} STORED AS AVRO AS SELECT * FROM my_tmp_table")
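For what it's worth, since Hive 0.14 the AvroSerDe can also derive the Avro schema from the column definitions, so an external table declared directly over the existing path seems like it should work without any schema literal. A minimal sketch, assuming the column names and types match the DataFrame's schema:

CREATE EXTERNAL TABLE my_schema.my_avro_table (
  id   BIGINT,
  name STRING
)
STORED AS AVRO
LOCATION '/user/myself/avrodata';

Because the table is EXTERNAL, dropping it would leave the Avro files written by Spark in place.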
