Connecting R to Hive using JDBC

I am trying to connect R to a Hive cluster using the RJDBC package.

The code I have written is:

drv <- JDBC(driverClass = "org.apache.hive.jdbc.HiveDriver", 
        classPath = list.files("C:/hive-jdbc/hive-jdbc-0.10.0.jar",
                               pattern="jar$",full.names=T),
        identifier.quote="'")

I have also added "C:/hive-jdbc" to my system PATH variable.

But I am getting the following error:

Error in path.expand(unlist(strsplit(classPath, .Platform$path.sep))) : 
  invalid 'path' argument

Can someone help me with this?

In

classPath = list.files("C:/hive-jdbc/hive-jdbc-0.10.0.jar",
                               pattern="jar$",full.names=T)

you use list.files. The first argument to list.files should be a folder, but you seem to have given it a jar file. What is the output of just that list.files call on your system? It's probably character(0). That screws up the classPath, because JDBC() then has no path left to expand. Fix that, although it's not clear what you want the value of the classPath parameter to be here. If you want it to be all the .jar files in a folder, then

list.files("C:/wherever/", pattern="\\.jar$", full.names=TRUE)

should do it. If it's just the one jar file, just put it in:

classPath="C:/hive-jdbc/hive-blahlah-999.jar"

in the call. I.e., keep it simple!
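To make the diagnosis above concrete, here is a minimal sketch in base R. It assumes nothing about your machine: it creates a throwaway folder under tempdir() with an empty file named example.jar, and find_jars is a hypothetical helper name, not part of RJDBC. It shows that pointing list.files at a file returns an empty vector, that the empty vector is what trips path.expand, and that pointing at the folder works.

```r
# Hypothetical helper: collect every .jar in a folder and fail loudly if none is found.
find_jars <- function(dir) {
  if (!dir.exists(dir)) stop("not a directory: ", dir)
  jars <- list.files(dir, pattern = "\\.jar$", full.names = TRUE)
  if (length(jars) == 0) stop("no .jar files found in: ", dir)
  jars
}

# Throwaway folder and empty placeholder jar so the sketch runs anywhere.
demo_dir <- file.path(tempdir(), "hive-jdbc-demo")
dir.create(demo_dir, showWarnings = FALSE)
jar_path <- file.path(demo_dir, "example.jar")
file.create(jar_path)

# The mistake from the question: the first argument is a file, not a folder.
from_file <- list.files(jar_path, pattern = "jar$", full.names = TRUE)
print(from_file)

# This mirrors the expression in the error message: an empty classPath
# collapses to NULL, which path.expand rejects.
cp <- unlist(strsplit(from_file, .Platform$path.sep))
err <- tryCatch(path.expand(cp), error = function(e) conditionMessage(e))
print(err)

# Pointing at the folder finds the jar.
jars <- find_jars(demo_dir)
print(basename(jars))
```

The resulting character vector can then be passed as classPath in the JDBC() call; checking its length first gives a clearer message than the path.expand error.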

In answer to Prateek: "Class not found" appears because the class is not in the jar file, so you need more jar files on your class path. For me this was:

/usr/lib/hive/lib/hive-jdbc.jar
/usr/lib/hive/lib/libthrift-0.9.2.jar
/usr/lib/hive/lib/hive-service.jar
/usr/lib/hive/lib/httpclient-4.2.5.jar
/usr/lib/hive/lib/httpcore-4.2.5.jar
/usr/lib/hive/lib/hive-jdbc-standalone.jar
/usr/lib/hadoop/client/hadoop-common.jar

(Some of these file references are symbolic links to the real file. Take the real file!) I also wrote a basic blow-by-blow article on getting this working: https://pygot.wordpress.com/2016/10/13/connecting-r-studio-to-hadoop-via-hive/
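As a rough sketch of how the jar list above might be used, the code below resolves symbolic links with normalizePath, reports any jar that is missing, and only then attempts a connection. The jar paths are the ones from this answer and will differ between installations. The host name, port, database, user and password are placeholders I made up for illustration, and using DBI::dbConnect with the RJDBC driver object is my assumption about a typical setup rather than something stated in the answer.

```r
# Jar list taken from the answer above; adjust to your own installation.
jars <- c(
  "/usr/lib/hive/lib/hive-jdbc.jar",
  "/usr/lib/hive/lib/libthrift-0.9.2.jar",
  "/usr/lib/hive/lib/hive-service.jar",
  "/usr/lib/hive/lib/httpclient-4.2.5.jar",
  "/usr/lib/hive/lib/httpcore-4.2.5.jar",
  "/usr/lib/hive/lib/hive-jdbc-standalone.jar",
  "/usr/lib/hadoop/client/hadoop-common.jar"
)

# Resolve symbolic links to the real files; mustWork = FALSE avoids an
# error or warning when a path does not exist.
real_jars <- normalizePath(jars, mustWork = FALSE)

# Report anything that is not present before involving Java at all.
missing_jars <- real_jars[!file.exists(real_jars)]
if (length(missing_jars) > 0) {
  message("Missing jars:\n", paste(missing_jars, collapse = "\n"))
}

# Only attempt the connection when every jar exists and RJDBC is installed.
if (length(missing_jars) == 0 && requireNamespace("RJDBC", quietly = TRUE)) {
  drv <- RJDBC::JDBC(driverClass = "org.apache.hive.jdbc.HiveDriver",
                     classPath = real_jars)
  # Placeholder URL and credentials (assumptions): replace with your own.
  conn <- DBI::dbConnect(drv,
                         "jdbc:hive2://hive-host.example.com:10000/default",
                         "user", "password")
  print(DBI::dbGetQuery(conn, "SHOW TABLES"))
  DBI::dbDisconnect(conn)
}
```

Checking the files up front means a missing jar shows up as a plain message instead of a "Class not found" error from the Java side.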
