
Adding Hive SerDe jar on SparkSQL Thrift Server

I have Hive tables whose contents are JSON files, and these tables need a JSON SerDe jar (the openx JsonSerDe) in order to be queried. On the machine (or VM) hosting my Hadoop distro, I can simply execute in the Hive or Beeline CLI:

ADD JAR /<local-path>/json-serde-1.0.jar;

and then I am able to perform SELECT queries on my Hive tables.
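For context, such a table is typically declared with the SerDe in its `ROW FORMAT` clause. The table name, columns, and location below are hypothetical; only the SerDe class name is taken from the actual error:

```sql
-- Example only: a JSON-backed Hive table using the openx JsonSerDe
CREATE EXTERNAL TABLE events (
  id      STRING,
  payload STRING
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
LOCATION '/data/events';
```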

I need to use these Hive tables as data sources for my Tableau (installed in Windows, my host machine), so I start the Thrift server in Spark.
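I start the Thrift server with the script bundled with Spark, roughly like this (the master setting is an example from my setup):

```shell
# Start the Spark SQL Thrift server (JDBC/ODBC endpoint for Tableau).
# --master yarn is an example; adjust to your cluster manager.
$SPARK_HOME/sbin/start-thriftserver.sh --master yarn
```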

For Hive tables that do not contain JSON (and do not require the SerDe), Tableau can connect and read the tables easily.

When it comes to the Hive tables that contain JSON data, however, it looks like Tableau cannot find the Hive JSON SerDe jar, and I get the following error:

'java.lang.RuntimeException: MetaException(message:java.lang.ClassNotFoundException Class org.openx.data.jsonserde.JsonSerDe not found)'.

How do I add the Hive JSON SerDe jar so that Tableau can read the Hive JSON tables?

I am guessing you're using JDBC to connect Tableau to Hive.

When using the Hive shell, Hive bundles all the needed libraries (including the SerDe) from the Hive client and builds a jar that is distributed and executed on the cluster. Unfortunately, the JDBC server does not do that, so you'll have to manually install and configure the SerDe on all the nodes and put it on the classpath of all the map/reduce nodes as well (copy the jar to all the nodes and add something like `HADOOP_CLASSPATH=$HADOOP_CLASSPATH:/location/of/your/serde.jar`). It may be necessary to restart YARN as well after that. It is quite inconvenient, but that's how the JDBC driver works.
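Concretely, on each node that could look something like the following (the jar location and environment file are examples; your distro may keep `hadoop-env.sh` elsewhere):

```shell
# Run on every node in the cluster (paths are examples):
sudo cp json-serde-1.0.jar /opt/serde/json-serde-1.0.jar

# Append to the Hadoop environment script (e.g. hadoop-env.sh) so that
# map/reduce tasks can load the SerDe class:
echo 'export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:/opt/serde/json-serde-1.0.jar' \
  | sudo tee -a /etc/hadoop/conf/hadoop-env.sh
```

After that, restart YARN so the new classpath takes effect.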

See https://issues.apache.org/jira/browse/HIVE-5275
