简体   繁体   中英

Connect R to Teradata using DBI for use with knitr

I'm trying to use the SQL engine within knitr: http://rmarkdown.rstudio.com/authoring_knitr_engines.html#sql

The documentation states that "to use the knitr SQL engine you first need to establish a DBI connection to a database"

And it provides the following example:

library(DBI)
db <- dbConnect(RSQLite::SQLite(), dbname = "sql.sqlite")

Now I'm trying to connect to Teradata. I've successfully accomplished this in the past using the RODBC and RJDBC packages, and there's quite a bit of useful documentation out there. I have not connected directly via the DBI package, however, nor can I find any examples.

I'm guessing that the syntax is similar to an RJDBC connection (referenced here ):

drv = JDBC("com.teradata.jdbc.TeraDriver","ClasspathForTeradataJDBCDriverFiles") 
conn = dbConnect(drv,"jdbc:teradata://DatabaseServerName/ParameterName=Value","User","Password") 
dbGetQuery(conn,"SQLquery")

But are there any exmaples of a DBI connnection to Teradata?

The solution I posted here works for me with Teradata. I use the odbc package and the Windows ODBC Data Source Administrator. An RMarkdown example below:

```{r}
# Unfortunately, odbc is not on CRAN yet
# So we will need devtools
# install.packages(devtools)
library(devtools)
devtools::install_github("rstats-db/odbc")

# Get connection info from the Windows ODBC Data Source Administrator using the name you set manually. 
# If you don't know what this is, just search in the windows start menu for     "ODBC Data Source Administrator"
con <- dbConnect(odbc::odbc(), 'MyDataWarehouse')
```

```{sql connection = con, output.var = result}
-- This is sql code, comments need to be marked accordingly
SELECT * FROM SOMETABLE LIMIT 200;
```

```{R}
# And the result is available in the next chunk!
result
```

You cannot connect with the DBI package - that just defines an interface (think template) that DBI compliant packages have to implement. What you should be doing is using the RJDBC package - which implements the DBI methods. Don't use RODBC since it does not implement the DBI methods - and I am guessing you need DBI based on your question. To create a connection to Teradata using RJDBC you need to follow a few steps.

You need to the JDBC driver from Teradata here . You will need to be a customer of Teradata to get this.

If you download and look inside the zip or tar file you will find two jar files:

  • terajdbc4.jar
  • tdgssconfig.jar

Extract these files to a known location on your system.

Now you need to install install.packages('RJDBC') package and dependencies.

Since JDBC is Java based make sure you have the latest Java Runtime installed. You can pick this up at java.com - just watch out that you get the same version of Java (32 or 64 bit) as you have for R. Last time I checked the java site did not try and figure this out for you. If you have to install Java make sure you restart your R session.

Now you should be able to load the RJDBC library and create your connection - you will need to know the ip/url of the teradata system you want to connect to, as well as your username and password. You will also need to specify the database you want to connect to:

library(RJDBC)

jars <- c("[path to jar]/terajdbc4.jar", "[path to jar]/tdgssconfig.jar")
drv <- JDBC("com.teradata.jdbc.TeraDriver",jars)
db <- "jdbc:teradata://[ip of teradata system]/TMODE=TERADATA,charset=UTF8"
conn <- dbConnect(drv, db, username, password, database )

With this connection you can now use the implemented DBI methods to query the database - for example dbGetQuery(conn, sql) or dbSendQuery().

Check out the RJDBC docs for more information.

Also if you want to use the SQL code chunk engine in knitr the current build of knitr does not work with the current build of RJDBC. RJDBC does not implement the DBI method dbGetRowCount. As a quickfix I removed the check on the row count / for now you could install the package from github at mattwg/knitr. It will probably fail for some use cases. I will try and get the method implemented by the maintainers of RJDBC.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM