
How to join tables from different SQL databases using R and dplyr?

I'm using dplyr (0.7.0), dbplyr (1.0.0), DBI (0.6-1), and odbc (1.0.1.9000). I would like to do something like the following:

db1 <- DBI::dbConnect(
  odbc::odbc(),
  Driver = "SQL Server",
  Server = "MyServer",
  Database = "DB1"
)
db2 <- DBI::dbConnect(
  odbc::odbc(),
  Driver = "SQL Server",
  Server = "MyServer",
  Database = "DB2"
)
x <- tbl(db1, "Table1") %>%
  dplyr::left_join(tbl(db2, "Table2"), by = "JoinColumn") 

but I keep getting an error message that carries no useful detail. When I use show_query, it looks like the code is trying to build a single SQL query that joins the two tables without taking the separate databases into account. Per the documentation for dplyr::left_join, I've also tried:

x <- tbl(db1, "Table1") %>%
  dplyr::left_join(tbl(db2, "Table2"), by = "JoinColumn", copy = TRUE)

But there is no change in the output or the error message. Is there a different way to join tables from separate databases on the same server?

I'm assuming from the code you provided that (a) you want to join the two tbl objects with dplyr's syntax before you run collect() and pull the results into local memory, and (b) you want to refer directly to the database objects in the call to tbl().

These choices matter if you want to use dplyr to build your query logic programmatically while letting the database server INNER JOIN large volumes of data down to the set you're interested in. (Or at least that's why I ended up here.)

The solution I found uses a single connection that does not specify a database, and spells out the database and schema with in_schema() (I couldn't find this documented or vignetted anywhere):

conn <- DBI::dbConnect(
  odbc::odbc(),
  Driver = "SQL Server",
  Server = "MyServer"
)

# Table1 lives in DB1 and Table2 in DB2; both are reached through one connection
x <- tbl(src_dbi(conn),
         in_schema("DB1.dbo", "Table1")) %>%
  dplyr::left_join(tbl(src_dbi(conn),
                       in_schema("DB2.dbo", "Table2")),
                   by = "JoinColumn")
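A note for newer setups: as far as I know, dbplyr 2.0.0 and later quote each argument of in_schema() as one identifier, so "DB1.dbo" may be rendered as a single quoted name instead of database plus schema. Below is a hedged sketch of the same idea using in_catalog(), which I believe is available from dbplyr 2.2.0 onward. cross_db_join() is a hypothetical helper name, and the database, schema, and table names are the placeholders from the question.

```r
# Sketch only: assumes dbplyr >= 2.2.0 (for in_catalog()) and a login that
# can read both databases on the same server.
cross_db_join <- function(conn) {
  # Three-part names: database (catalog), schema, table
  t1 <- dplyr::tbl(conn, dbplyr::in_catalog("DB1", "dbo", "Table1"))
  t2 <- dplyr::tbl(conn, dbplyr::in_catalog("DB2", "dbo", "Table2"))

  # Both lazy tables share one connection, so the join is translated to SQL
  # and runs on the server rather than in R
  dplyr::left_join(t1, t2, by = "JoinColumn")
}

# Usage (needs a live SQL Server connection):
# conn <- DBI::dbConnect(odbc::odbc(), Driver = "SQL Server", Server = "MyServer")
# x <- cross_db_join(conn)
# dplyr::show_query(x)  # inspect the generated SQL before calling collect()
```

Checking show_query() first is a cheap way to confirm that the table names come out as DB1.dbo.Table1 and DB2.dbo.Table2 before any data is pulled.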

I would use the merge() function to perform the left join on the tables. It would be something like x <- merge(df1, df2, by = "JoinColumn", all.x = TRUE).
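Keep in mind that merge() works on local data frames, so each table has to be pulled into R first (for example with collect() on each connection), and the join then runs in R instead of on the server. A minimal sketch with made-up data; df1, df2, Value1, and Value2 are hypothetical stand-ins for the collected tables and their columns:

```r
# Hypothetical stand-ins for the two tables after they have been collected,
# e.g. df1 <- dplyr::collect(dplyr::tbl(db1, "Table1"))
df1 <- data.frame(JoinColumn = c(1, 2, 3), Value1 = c("a", "b", "c"))
df2 <- data.frame(JoinColumn = c(1, 3), Value2 = c("x", "z"))

# all.x = TRUE keeps every row of df1 (a left join); rows without a match
# in df2 get NA in the columns that come from df2
x <- merge(df1, df2, by = "JoinColumn", all.x = TRUE)
print(x)
```

This is fine for small tables, but it transfers every row of both tables to the client before joining.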

I faced the same problem and wasn't able to solve it with dplyr::left_join.

I was at least able to get the job done with the following workaround: I connected to SQL Server without declaring a default database, then ran the query with sql().

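con <- DBI::dbConnect(odbc::odbc(), dsn = "DWH", uid = "", pwd = "")

# Reference each table by its database name ("DB..Table" uses the default schema)
data_db <- dplyr::tbl(con, dplyr::sql("SELECT *
                    FROM DB1..Table1 AS a
                    LEFT JOIN DB2..Table2 AS b ON a.JoinColumn = b.JoinColumn"))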

data_db %>% ...

Hope it helps.
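One caveat with the sql() workaround: tbl() wraps the query text in a subquery, and because SELECT * returns JoinColumn from both tables, SQL Server may reject the duplicated column name. Here is a hedged sketch that lists the columns explicitly. Value1 and Value2 are hypothetical column names that do not come from the question, and query_cross_db() is a hypothetical helper.

```r
# Hypothetical columns Value1 / Value2: replace them with the real ones.
# Naming each column once avoids two JoinColumn columns in the result.
cross_db_sql <- "
  SELECT a.JoinColumn, a.Value1, b.Value2
  FROM DB1..Table1 AS a
  LEFT JOIN DB2..Table2 AS b
    ON a.JoinColumn = b.JoinColumn
"

# Wrap the query as a lazy table so further dplyr verbs still run on the server
query_cross_db <- function(con) {
  dplyr::tbl(con, dplyr::sql(cross_db_sql))
}

# Usage (needs a live connection, e.g. the DSN used above):
# con <- DBI::dbConnect(odbc::odbc(), dsn = "DWH", uid = "", pwd = "")
# data_db <- query_cross_db(con)
```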

