
How can we load a Hive table created over JSON data into a Spark DataFrame using spark.sql?

I tried querying a Hive table backed by JSON data (using spark.sql) into a PySpark DataFrame, and the following error occurred:

   ERROR log: error in initSerDe: java.lang.ClassNotFoundException Class org.apache.hive.hcatalog.data.JsonSerDe not found

Try adding the JSON SerDe jar before running the query; see the code below.

If it is a managed table, you can use the code below. Note that spark.sql executes one statement per call, so the ADD JAR and the SELECT must be issued separately:

spark.sql("ADD JAR /path-to/hive-json-serde.jar")
df = spark.sql("SELECT * FROM TABLE")
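An alternative to ADD JAR is to put the SerDe jar on the classpath when the session is created, so every query in the session can deserialize the table. A minimal sketch, assuming a Hive-enabled Spark setup and a hypothetical jar path (use whatever path your hive-hcatalog-core / JSON SerDe jar actually lives at):

```python
from pyspark.sql import SparkSession

# Hypothetical jar location -- replace with the real path on your cluster.
serde_jar = "/path-to/hive-json-serde.jar"

# spark.jars distributes the jar to the driver and executors at startup;
# enableHiveSupport() lets spark.sql() see the Hive metastore tables.
spark = (
    SparkSession.builder
    .appName("hive-json-table")
    .config("spark.jars", serde_jar)
    .enableHiveSupport()
    .getOrCreate()
)

# With the SerDe on the classpath, the query no longer needs ADD JAR.
df = spark.sql("SELECT * FROM TABLE")
```

The same effect can be had from the command line with spark-submit --jars /path-to/hive-json-serde.jar.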

If it is an external table, you can load the data directly by passing the HDFS path of the JSON files, bypassing the Hive SerDe entirely.
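Reading the files directly can be sketched as below; the HDFS path is hypothetical and should point at the external table's location (visible via DESCRIBE FORMATTED in Hive):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("read-json-direct").getOrCreate()

# spark.read.json infers the schema from the JSON records themselves,
# so no Hive SerDe jar is needed. Path below is a placeholder.
df = spark.read.json("hdfs:///path/to/external/table/location")
df.show()
```

This avoids the ClassNotFoundException because Spark's native JSON reader does the parsing instead of org.apache.hive.hcatalog.data.JsonSerDe.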

