
Connecting R to Hive on a Remote Server

I am trying to connect RStudio to data in Hive that I am accessing through Hue on a remote server.

This is my current code:

 options( java.parameters = "-Xmx20g" )
 library("DBI")
 Sys.setenv(JAVA_HOME='C:\\Program Files\\Java\\jre1.8.0_131')
 library("rJava")
 library("RJDBC")
 drv <- JDBC("org.apache.hadoop.hive.jdbc.HiveDriver",
            c(list.files("C:/Users/xxx/Desktop",pattern="jar$",full.names=T),
              list.files("C:/Users/xxx/Desktop",pattern="jar$",full.names=T)))

I downloaded the Hive JAR files to my desktop, and the path is correct (xxx stands in for the real value). Everything up to here runs fine in RStudio.

However, when I run the following line:

 conn <- dbConnect(drv, "jdbc:hive2://IP ADDRESS", "usrnm", "password")

The IP address, username, and password are all correct, but I get the following error:

 Error in .jcall(drv@jdrv, "Ljava/sql/Connection;", "connect", as.character(url)[1],  : java.lang.NoClassDefFoundError: org/apache/thrift/TBase

Any help would be much appreciated. Thank you so much.

It looks like a required JAR is missing from the classpath you pass to JDBC(), which is why the JVM cannot find the org.apache.thrift.TBase class.

Add the hive-exec-xxx.jar to that classpath and see if the connection works.

Check this link http://snacktrace.com/artifacts/org.apache.hive/hive-exec/1.1.1/org.apache.thrift.TBase
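
To confirm which of the downloaded JARs (if any) actually contains the missing class before retrying, something like the following rough sketch may help. It uses only base R; the helper names class_to_entry and find_class_jars are made up for illustration, and the Desktop path is simply the one from the question.

```r
# Hypothetical helper: turn a Java class name into its path inside a JAR.
class_to_entry <- function(class_name) {
  paste0(gsub(".", "/", class_name, fixed = TRUE), ".class")
}

# Hypothetical helper: return the JARs in `dir` that contain `class_name`.
# A JAR is a zip archive, so utils::unzip(list = TRUE) can list its entries.
find_class_jars <- function(dir, class_name) {
  entry <- class_to_entry(class_name)
  jars <- list.files(dir, pattern = "\\.jar$", full.names = TRUE)
  has_class <- vapply(jars, function(jar) {
    tryCatch(
      entry %in% utils::unzip(jar, list = TRUE)$Name,
      error = function(e) FALSE,    # unreadable or corrupt archive
      warning = function(w) FALSE
    )
  }, logical(1))
  unname(jars[has_class])
}

# Usage (path taken from the question; adjust as needed):
# find_class_jars("C:/Users/xxx/Desktop", "org.apache.thrift.TBase")
# An empty result means no JAR in that folder provides the class, so another
# JAR (for example hive-exec, as suggested above) has to be added to classPath.
```

If the result is empty, the folder passed to list.files() does not provide the Thrift classes at all. Separately, jdbc:hive2:// URLs are normally handled by the driver class org.apache.hive.jdbc.HiveDriver rather than org.apache.hadoop.hive.jdbc.HiveDriver, so that may be worth checking once the classpath issue is resolved.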

Hope it helps!!
