Using tbl and src_monetdblite to access data

Sorry if this question has been asked elsewhere; I couldn't find it. I'm working through some basic examples in MonetDBLite.

> dbGetQuery(dbcon, "SELECT MAX(mpg) FROM mtcars WHERE cyl = 8")
L3
1 19.2

works, but

> ms <- MonetDBLite::src_monetdblite("./DB")
> t <- tbl(ms, "mtcars")
Error in UseMethod("tbl") : 
no applicable method for 'tbl' applied to an object of class
"c('src_monetdb', 'src_sql', 'src')"

It seems to be trying to assign the database to t rather than the table.

Any suggestions would be greatly appreciated.

I've been looking through resources, found a useR2016 presentation, and noticed a difference here:

> ms
src:  MonetDBEmbeddedConnection
tbls: mtcars

Curious...

I'm a huge fan of using MonetDBLite together with dplyr. My addition to Hannes Mühleisen's answer (thanks for the package!) is that the order in which you load the packages appears to matter. Loading MonetDBLite after dplyr and dbplyr seems to be the key for me. Loading MonetDBLite first causes errors similar to the one nzgwynn noted.

Sometimes I could connect to the database with no problems. Other times I would get error messages like:

Error in UseMethod("db_query_fields") : no applicable method for 'db_query_fields' applied to an object of class "MonetDBEmbeddedConnection"

Like nzgwynn, I was puzzled about why it would work sometimes but not at other times. Restarting and reinstalling wouldn't necessarily fix it for me.

This clue, from an issue filed about sparklyr, led me to explore the package loading order:

https://github.com/rstudio/sparklyr/issues/38

As noted there for sparklyr, and as I've noticed with other R database packages, MonetDBLite will load and attach automatically if the Global Environment already contains a connection object. My problem was that I had an src_monetdb object in my workspace, which caused MonetDBLite to load when RStudio started. So while I thought I was loading it after dplyr and dbplyr, it was really loading first. If I clear the workspace and then restart, I can load the packages in the preferred order. So far, this method has worked.
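The workspace cleanup described above could be sketched roughly as follows. This is only an illustration in base R: drop_stale_connections is a hypothetical helper name, not something from MonetDBLite or dplyr, and the classes it checks ("src" and "DBIConnection") are an assumption based on the class names shown in the error messages in this thread.

```r
# Hypothetical helper: remove objects that look like database connections
# or dplyr sources from an environment (the global one by default).
drop_stale_connections <- function(env = globalenv()) {
  is_conn <- function(nm) {
    obj <- get(nm, envir = env, inherits = FALSE)
    # Assumed class names: "src" for dplyr sources, "DBIConnection" for DBI.
    inherits(obj, c("src", "DBIConnection"))
  }
  stale <- Filter(is_conn, ls(envir = env))
  rm(list = stale, envir = env)
  # Return the names that were removed so the caller can see what changed.
  stale
}
```

Calling drop_stale_connections() only removes the objects. As described above, the session still has to be restarted (with the cleaned workspace saved, or with workspace restoring turned off) before the packages can be loaded in the preferred order.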

I've seen starting with a clean workspace advised as good practice generally, e.g.: https://twitter.com/hadleywickham/status/561146907519500288. Starting with a fresh workspace costs you no time either, given MonetDBLite's speedy query ability.

Lastly, I would put in an enthusiastic pitch for using MonetDBLite. I saw it mentioned on RStudio's database page and was immediately impressed by how easy it was to set up and how fast it is. It's the best way I've found for working with a ~2 GB dataset in R. When exploring the data interactively, the dplyr queries run so quickly that it feels like I'm working with the data in memory. And if all I want to do is load the whole dataset into memory, MonetDBLite is as fast as or faster than other methods I've tried, like read.fst() from the fst package.

You need to call library("dplyr") before using tbl and friends. Also make sure you have dbplyr installed.

Update: Also, please make sure there is no connection object (src) in a stored workspace loaded at startup. Loading connections from .RData files does not work! Instead, create the connection/src from scratch every time you run a script.
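Putting the advice from the answers together, a script might start along these lines. This is a sketch under assumptions rather than a verified recipe: it assumes dplyr, dbplyr and MonetDBLite are installed, that the session started with an empty workspace, and that the "./DB" directory from the question already contains an mtcars table. The name mtcars_tbl is chosen here only for illustration.

```r
# Load order suggested in the answers: dplyr and dbplyr first, MonetDBLite last.
library(dplyr)
library(dbplyr)
library(MonetDBLite)

# Create the src fresh in every session instead of restoring it from .RData.
# "./DB" is the database directory used in the question (assumed to exist
# and to hold an mtcars table).
ms <- MonetDBLite::src_monetdblite("./DB")
mtcars_tbl <- tbl(ms, "mtcars")

# The same query as the dbGetQuery() call in the question, written with dplyr.
mtcars_tbl %>%
  filter(cyl == 8) %>%
  summarise(max_mpg = max(mpg, na.rm = TRUE)) %>%
  collect()
```

Using a name other than t for the table also avoids masking base R's t() function.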

I closed R and opened it again, and the same code worked fine...

