
Connecting to SQL Server using R on Linux CentOS 6.6

Hello, I am having difficulty connecting to a SQL Server database that a work colleague set up for me. I mostly run R/RStudio on a Linux CentOS 6.6 machine, but sometimes on AWS EC2 instances and my local Windows PC. In the past I connected to an AWS Redshift database that someone else had set up, and I was able to establish connections to it using the dplyr function “src_postgres”. I would like to make a dplyr-style connection here too if possible, so I can reuse some of the code I developed to work with those tables. The person who created the SQL Server database gave me a username, password, and host name (***.net). My colleague, who uses Windows and SAS, was able to connect with the Windows username/password that we use to log into our PCs at work. Can I use that username/password too, since it looks like that is an option on Linux as well, or do I have to use the specific one that was created for me just for SQL Server?

I did some research on the best way to do this; below is how I think I should proceed, along with some findings. It looks like the best option is to use RSQLServer ( https://github.com/imanuelcostigan/RSQLServer ), but I am open to other suggestions (such as using RODBC via https://support.rstudio.com/hc/en-us/articles/214510788-Setting-up-R-to-connect-to-SQL-Server- ). It looks like I have to download/install some items (e.g., drivers, sql.yaml) before I am able to do this. First, I think I need to install the correct SQL Server driver for my CentOS system (and later for EC2 instances). For the CentOS system I use most of the time, can/should I use the “Red Hat” drivers, since I can't seem to find one for CentOS? I also wonder whether I need to install an “authentication driver” if I want to use the Windows login credentials that I use for work (do I use this http://jtds.sourceforge.net/ or this https://msdn.microsoft.com/en-us/library/hh568450(v=sql.110).aspx ?). Furthermore, is there a “unixODBC” driver that I need to install as well ( https://msdn.microsoft.com/en-us/library/hh568449(v=sql.110).aspx )?

Once I get these drivers installed (any others?), I need to create a “sql.yaml” file to hold my server details. However, I am not sure how to create this file or where it should be located (e.g., create it with Notepad++ or another editor and just place it inside the working directory?). It looks like I would create a separate entry within that file for the SQL Server instance that I am using. I created a “sql.yml” file that I copied directly from here ( https://github.com/imanuelcostigan/RSQLServer ) and placed it within the working directory. When I tried to run the example in RStudio I got the error below.

aw <- RSQLServer::src_sqlserver("AW", database = "AdventureWorks2012")
Error in rJava::.jcall(drv@jdrv, "Ljava/sql/Connection;", "connect", url,  : 
java.sql.SQLException: Unknown server host name 'AW'.

I also tried the “odbcDriverConnect” R function after trying to install the ODBC connection on that server, but got the following error.

dbConnect(RSQLServer::SQLServer(), server = "****", username = "****", password = "****", database = "****")
[RODBC] ERROR: state IM002, code 0, message [unixODBC][Driver Manager]Data source name not found, and no default driver specified
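
For context, state IM002 is raised by the unixODBC driver manager itself: it was given neither a data source name it knows about nor a `Driver=` entry it can resolve, so no SQL Server driver was ever loaded. One way around defining a DSN is a DSN-less connection string. The sketch below only assembles such a string in base R. It assumes a FreeTDS ODBC driver registered under the name `FreeTDS` in `odbcinst.ini`; the helper name `build_odbc_string`, the TDS version, and the host, database, and credentials are all hypothetical placeholders.

```r
# Build a DSN-less ODBC connection string (sketch; every value is a placeholder).
# Assumption: "FreeTDS" matches a driver section name in odbcinst.ini.
build_odbc_string <- function(server, database, uid, pwd,
                              driver = "FreeTDS", port = 1433,
                              tds_version = "7.2") {
  parts <- c(
    Driver = driver,
    Server = server,
    Port = as.character(port),
    Database = database,
    UID = uid,
    PWD = pwd,
    TDS_Version = tds_version
  )
  # Join the pieces as key=value pairs separated by semicolons
  paste0(names(parts), "=", parts, collapse = ";")
}

conn_string <- build_odbc_string("myserver.example.net", "mydb",
                                 "myuser", "mypassword")
print(conn_string)

# With RODBC installed and the driver registered, the string would be used as:
# ch <- RODBC::odbcDriverConnect(connection = conn_string)
```

If the same IM002 error still appears with a string like this, the driver name most likely does not match any section in `odbcinst.ini`.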

I am not sure if the sql.yaml file is correct or if the drivers are the problem, and I am not sure what to try next. My Linux IT skills are limited, but I can follow instructions... :) I was wondering if anyone could provide details on what I need to install and set up to get this working (e.g., what to type into the command line). I suspect that I don't have the appropriate drivers installed, and I am not sure which to try or what the appropriate commands are (e.g., those from jTDS, from Microsoft, etc.?). Thanks in advance for your assistance!

UPDATE

Thanks Valentin! I am able to connect that way on my local Windows PC using an ODBC connection, but was not able to get it working with the RSQLServer R library on Windows. I confirmed that I can connect from R on Windows using both the trusted user option and the username and password set up on SQL Server. I am also able to connect to the database using a JDBC connection from RStudio Server (see below for what worked).

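library(RJDBC)  # provides JDBC() and the dbConnect() method used below

# Register the jTDS driver from the jar in the RSQLServer directory
drv <- JDBC(
  driverClass = "net.sourceforge.jtds.jdbc.Driver",
  classPath = "/**** /RSQLServer/java/jtds-1.2.8.jar",
  identifier.quote = "`")

# Connect with the SQL Server username and password
conn <- dbConnect(drv,
                  "jdbc:jtds:sqlserver://****.net/DBTable",
                  "userid",
                  "password")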

My problem is that I can't connect from the RStudio Linux server using either an ODBC connection (maybe with the FreeTDS driver?) or the RSQLServer R library (maybe I need to use jTDS and register it?). I would most like to figure out how to use the RSQLServer library, so that I can use its dplyr backend connection.
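
On the earlier question about Windows credentials: jTDS URLs take the form `jdbc:jtds:sqlserver://host[:port][/database][;property=value]`, and when the `domain` property is supplied together with a user name and password, jTDS attempts Windows (NTLM) authentication instead of a SQL Server login. The following is a minimal base-R sketch of building both kinds of URL. The helper `build_jtds_url`, the host, the database, and the domain name are hypothetical, and whether `useNTLMv2` is needed depends on the server's settings.

```r
# Assemble a jTDS JDBC URL from its parts (sketch; all values are placeholders).
build_jtds_url <- function(host, database, port = 1433, properties = list()) {
  url <- sprintf("jdbc:jtds:sqlserver://%s:%d/%s",
                 host, as.integer(port), database)
  if (length(properties) > 0) {
    # Optional connection properties are appended as ;name=value pairs
    props <- paste0(names(properties), "=", unlist(properties), collapse = ";")
    url <- paste0(url, ";", props)
  }
  url
}

# SQL Server login: no extra properties
sql_login_url <- build_jtds_url("myserver.example.net", "mydb")

# Windows (NTLM) login: add the domain; MYDOMAIN is an assumed name
windows_login_url <- build_jtds_url(
  "myserver.example.net", "mydb",
  properties = list(domain = "MYDOMAIN", useNTLMv2 = "true")
)

print(sql_login_url)
print(windows_login_url)
```

Either URL would be passed as the second argument of `dbConnect(drv, url, user, password)`, with the Windows user name and password in the second case.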

I created the suggested sql.yaml file with the information from https://github.com/imanuelcostigan/RSQLServer and placed it within my R working directory. However, when I try to follow the example on the RSQLServer GitHub site, which connects to this SQL Server dataset ( http://sqlblog.com/blogs/jamie_thomson/archive/2012/03/27/adventureworks2012-now-available-to-all-on-sql-azure.aspx ), I get the following errors:

#using driver specified above
aw <- dbConnect(drv, "AW", database = 'AdventureWorks2012')
Error in .verify.JDBC.result(jc, "Unable to connect JDBC to ", url) : 
Unable to connect JDBC to AW

#trying to use the connection specified in the sql.yaml file
aw <- RSQLServer::src_sqlserver("AW", database = "AdventureWorks2012")
Error in rJava::.jcall(drv@jdrv, "Ljava/sql/Connection;", "connect", url,  : 
java.sql.SQLException: Unknown server host name 'AW'.

I think the jTDS driver is not set up correctly, or I am doing something wrong when creating the sql.yaml file within the R working directory (should it be placed somewhere else?). Thanks again for any suggestions!

Kevin! I am trying to do the same: connect CentOS 6.5 to MS SQL Server with the RSQLServer library. It works fine with RODBC and the FreeTDS driver, but I had no success with RSQLServer. It looks like the connection succeeds (it sees my tables), but the "SELECT FROM..." fails:

>res <- RSQLServer::src_sqlserver("printDB", database = "printlog")
>res
src:  SQLServer 10.50.1600 [sa@10.87.1.170:1433/printlog]
tbls: log, TEMPlog

> tbl(res, sql("SELECT * FROM TEMPlog"))
Error in rJava::.jnew("com/github/RSQLServer/MSSQLResultPull", rJava::.jcast(res@jr,  :
java.lang.ClassNotFoundException

I do not know what that error means. Here is what I have understood so far, which may help you:

  1. You should put the sql.yaml file in your user home directory, not in the R home directory.
  2. It looks like RSQLServer does not work on CentOS. The installation section of the library page ( https://github.com/imanuelcostigan/RSQLServer#installation ) says it was only tested on Windows and OS X, so I have decided to forget about it for a while.
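
A quick way to check the first point is sketched below. It assumes, as stated above, that the file is looked up in the user's home directory under the name sql.yaml. If the file is missing there, that would be consistent with the "Unknown server host name 'AW'" error, since the entry name may then have been treated as a literal host name (this is an interpretation, not something confirmed by the package documentation).

```r
# Compare the home directory with the working directory and report where
# sql.yaml is actually present (base R only).
home_dir <- path.expand("~")
yaml_path <- file.path(home_dir, "sql.yaml")

cat("Home directory:               ", home_dir, "\n")
cat("Working directory:            ", getwd(), "\n")
cat("sql.yaml in home directory:   ", file.exists(yaml_path), "\n")
cat("sql.yaml in working directory:",
    file.exists(file.path(getwd(), "sql.yaml")), "\n")

# If the file only exists in the working directory, it could be copied over:
# file.copy(file.path(getwd(), "sql.yaml"), yaml_path)
```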

If you find a way to make it work, please post it here. If you need help configuring RODBC, I can show my config and share some links. P.S. Sorry for my English.
