Which DBI function should be used for statements like `create table <tabX> as select * from <tabY>` in R?

Question

I am using DBI / ROracle.

drv <- dbDriver("Oracle")
conn <- dbConnect(drv, ...)

I need to create a table from a select query on another table (i.e. a statement like create table <tabX> as select * from <tabY>).

There seem to be several functions that can perform this task, e.g.:

dbSendQuery(conn, "create table tab1 as select * from bigtable")
# Statement:            create table tab1 as select * from bigtable 
# Rows affected:        28196 
# Row count:            0 
# Select statement:     FALSE 
# Statement completed:  TRUE 
# OCI prefetch:         FALSE 
# Bulk read:            1000 
# Bulk write:           1000

Or:

dbExecute(conn, "create table tab2 as select * from bigtable")
# [1] 28196

Or even:

tab3 <- dbGetQuery(conn, "select * from bigtable")
dbWriteTable(conn = conn, "TAB3", tab3)
# [1] TRUE

Each method seems to work, but I guess there are differences in performance/best practice. What is the best/most efficient way to run statements like create table <tabX> as select * from <tabY>?

I did not find any hint in the DBI and ROracle help pages.

Answer 1

Up front: use dbExecute for this; don't use dbSendQuery, since that function implies that data is expected in return (though it still works).

dbSendQuery should only be used when you expect data in return; most connections will do just fine even if you misuse it, but returning data is what it is designed for. Instead, use dbSendStatement / dbClearResult or, better yet, just dbExecute.

The following are sets of equivalent pathways:

To retrieve data:
- dat <- dbGetQuery(con, qry)
- res <- dbSendQuery(con, qry); dat <- dbFetch(res); dbClearResult(res)
To send a statement (that does not return data, e.g. UPDATE or INSERT):
- dbExecute(con, stmt)
- res <- dbSendStatement(con, stmt); dbClearResult(res)
- (sloppy) res <- dbSendQuery(con, stmt); dbClearResult(res) (I think some DBs complain about this method)
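Applied to the statement from the question, the two recommended pathways might look like the sketch below. It assumes an in-memory SQLite database (via the RSQLite package) as a stand-in for the Oracle connection, with the built-in mtcars data playing the role of bigtable; neither is part of the original setup.

```r
library(DBI)

# Assumption: in-memory SQLite stands in for the ROracle connection.
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "bigtable", mtcars)  # stand-in data for bigtable

# One call: sends the statement and releases the result for you.
dbExecute(con, "CREATE TABLE tab1 AS SELECT * FROM bigtable")

# Two-step equivalent: the result has to be cleared explicitly.
res <- dbSendStatement(con, "CREATE TABLE tab2 AS SELECT * FROM bigtable")
dbClearResult(res)

# Both new tables should hold as many rows as the source table.
n_src  <- dbGetQuery(con, "SELECT COUNT(*) AS n FROM bigtable")$n
n_tab1 <- dbGetQuery(con, "SELECT COUNT(*) AS n FROM tab1")$n
n_tab2 <- dbGetQuery(con, "SELECT COUNT(*) AS n FROM tab2")$n

dbDisconnect(con)
```

Either way the copy happens entirely on the server, which is the main difference from the dbGetQuery + dbWriteTable route, where the data travels to R and back.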

If you choose dbSend*, you should always call dbClearResult when done with the statement/fetch. (R will often clean up after you, but if something goes wrong here, and I have hit this a few times over the last few years, the connection locks up and you must recreate it. This can leave orphan connections on the database as well.)
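One way to make sure dbClearResult always runs, even when the fetch fails, is to wrap the dbSend* steps in a function and register the cleanup with on.exit. This is a minimal sketch: fetch_and_clear is a hypothetical helper name, and the in-memory SQLite connection (RSQLite) is an assumption used only to make the example runnable.

```r
library(DBI)

# Hypothetical helper: run a query and always release the result set,
# even if dbFetch() raises an error.
fetch_and_clear <- function(con, qry) {
  res <- dbSendQuery(con, qry)
  on.exit(dbClearResult(res), add = TRUE)
  dbFetch(res)
}

# Assumption: in-memory SQLite stands in for the real connection.
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "mtcars", mtcars)

dat <- fetch_and_clear(con, "SELECT * FROM mtcars WHERE cyl = 6")

dbDisconnect(con)
```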

I think most use-cases are single-query-and-out, meaning dbGetQuery and dbExecute are the easiest to use. However, there are times when you may want to repeat a query. An example from ?dbSendQuery:

     # Pass multiple sets of values with dbBind():
     rs <- dbSendQuery(con, "SELECT * FROM mtcars WHERE cyl = ?")
     dbBind(rs, list(6L))
     dbFetch(rs)
     dbBind(rs, list(8L))
     dbFetch(rs)
     dbClearResult(rs)

(I think it's a little hasty of that documentation to call dbFetch without capturing the data. I would expect dat <- dbFetch(..); discarding the return value here seems counter-productive.)
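A variant of the documentation example that keeps each fetched result could look like the sketch below. It assumes an in-memory SQLite database (RSQLite) holding mtcars; the `?` placeholder is what SQLite accepts, and other backends may use a different placeholder syntax.

```r
library(DBI)

# Assumption: in-memory SQLite stands in for the real connection.
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "mtcars", mtcars)

# Prepare the query once, then bind and fetch once per parameter value.
rs <- dbSendQuery(con, "SELECT * FROM mtcars WHERE cyl = ?")
results <- lapply(c(6L, 8L), function(cyl) {
  dbBind(rs, list(cyl))
  dbFetch(rs)  # returned to lapply(), so the data is kept
})
names(results) <- c("cyl6", "cyl8")
dbClearResult(rs)

dbDisconnect(con)
```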

One advantage to doing this multi-step (requiring dbClearResult) shows up with more complex queries: database servers in general tend to "compile" or optimize a query for their execution engine. This is not always a very expensive step for the server to execute, and it can pay huge dividends on data retrieval. The server often caches this optimized query, and when it sees the same query again it uses the already-optimized version. This is one case where parameter-binding can really help, as the query text is identical in repeated use and therefore does not need to be re-optimized.

FYI, parameter-binding can be done repeatedly as shown above using dbBind, or it can be done with dbGetQuery using the params= argument. For instance, this equivalent set of expressions will return the same results as above:

qry <- "SELECT * FROM mtcars WHERE cyl = ?"
dat6 <- dbGetQuery(con, qry, params = list(6L))
dat8 <- dbGetQuery(con, qry, params = list(8L))

As for dbWriteTable, for me it's mostly a matter of convenience for quick work. There are times when the DBI/ODBC connection uses the wrong datatype on the server (e.g., SQL Server's DATETIME instead of DATETIMEOFFSET, or NVARCHAR(32) versus varchar(max)), so if I need something quickly I'll use dbWriteTable; otherwise I formally define the table with the server datatypes that I know I want, as in dbExecute(con, "create table quux (...)"). This is by no means a "best practice"; it is heavily rooted in preference and convenience. For data that is easy (float/integer/string) and where the server default datatypes are acceptable, dbWriteTable is perfectly fine. One can also use dbCreateTable (which creates the table without uploading data), which lets you specify the fields with a bit more control.
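A rough sketch of the dbCreateTable route, followed by dbAppendTable to upload rows, is shown below. The column names and types are made up for illustration, SQL type names differ between backends, and the in-memory SQLite connection (RSQLite) is an assumption.

```r
library(DBI)

# Assumption: in-memory SQLite stands in for the real connection.
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Named character vector: column name -> SQL type (backend-specific names).
dbCreateTable(
  con, "quux",
  fields = c(id = "INTEGER", label = "TEXT", seen_at = "TEXT")
)

# Illustrative rows; the table exists already, so only data is appended.
new_rows <- data.frame(
  id = 1:2,
  label = c("a", "b"),
  seen_at = c("2023-01-24", "2023-01-25"),
  stringsAsFactors = FALSE
)
dbAppendTable(con, "quux", new_rows)

quux <- dbReadTable(con, "quux")

dbDisconnect(con)
```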

Answer 1 (accepted), 2023-01-24 13:16:15
