简体   繁体   中英

How to manage schemas in R & MonetDB?

I've asked a similar question recently, more geared toward dplyr.

See dplyr & monetdb - appropriate syntax for querying schema.table?

As dplyr & MonetDB (according to @Hannes Mühleisen reply above) don't have a proper way to manage schemas, I resolved to use MonetDB.R / DBI native functions.

Also in this case the results have been somewhat problematic: it seems that at this moment MonetDB.R is capable to manage properly only one schema per database.

Let's see some code. I create two schemas, and I create the same table name in each (standard sql practice). I then try to write data into each.

> conn <– dbConnect(MonetDB.R(), "db.url", user= "monetdb", password="monetdb")
> df <- data.frame(i=10,j=20)
> q <- dbGetQuery(conn, "CREATE SCHEMA foo;")
> q <- dbGetQuery(conn, "SET SCHEMA foo;")
> q <- dbGetQuery(conn, "CREATE TABLE mytable (i int, j int);")
> q <- dbWriteTable(conn,  "mytable",df,overwrite=TRUE)
> q <- dbGetQuery(conn, "CREATE SCHEMA bar;")
> q <- dbGetQuery(conn, "SET SCHEMA bar;")
> q <- dbGetQuery(conn, "CREATE TABLE mytable (i int, j int);")

At this stage I should have two tables with identical names but in two distinct schemas. This is confirmed by dbListTables (it would be nice to see the schemas but I can live without it):

> dbListTables(conn)
[1] "mytable" "mytable"

But as soon as I try to write to mytable with an identical query to the one used before, disaster!

> q <- dbWriteTable(conn,  "mytable",df,overwrite=TRUE)
Error in .local(conn, statement, ...) : 
  Unable to execute statement 'INSERT INTO mytable VALUES (10, 20)'.
Server says 'INSERT INTO: no such table 'mytable'' [#42S02].

At this point the DB is left in a messy state and is asking to roll back the transaction.

Let's try with overwrite=FALSE, just to see if there is anything different:

> df <- data.frame(i=10,j=20)
> q <- dbGetQuery(conn, "CREATE SCHEMA foo;")
> q <- dbGetQuery(conn, "SET SCHEMA foo;")
> q <- dbGetQuery(conn, "CREATE TABLE mytable (i int, j int);")
> q <- dbWriteTable(conn,  "mytable",df,overwrite=TRUE)
> q <- dbGetQuery(conn, "CREATE SCHEMA bar;")
> q <- dbGetQuery(conn, "SET SCHEMA bar;")
> q <- dbGetQuery(conn, "CREATE TABLE mytable (i int, j int);")
> dbListTables(conn)
[1] "mytable" "mytable"
> q <- dbWriteTable(conn,  "mytable",df,overwrite=FALSE)
Error in .local(conn, name, value, ...) : 
  Table mytable already exists. Set overwrite=TRUE if you want 
      to remove the existing table. Set append=TRUE if you would like to add the new data to the 
      existing table.

I've hit the mother lode with append=true

> df <- data.frame(i=10,j=20)
> q <- dbGetQuery(conn, "CREATE SCHEMA foo;")
> q <- dbGetQuery(conn, "SET SCHEMA foo;")
> q <- dbGetQuery(conn, "CREATE TABLE mytable (i int, j int);")
> q <- dbWriteTable(conn,  "mytable",df,overwrite=TRUE)
> q <- dbGetQuery(conn, "CREATE SCHEMA bar;")
> q <- dbGetQuery(conn, "SET SCHEMA bar;")
> q <- dbGetQuery(conn, "CREATE TABLE mytable (i int, j int);")
> dbListTables(conn)
[1] "mytable" "mytable"
> q <- dbWriteTable(conn,  "mytable",df,append=TRUE)

The latter sequence seems to work and load the data into the bar schema without problems. But is you use (as I normally do) csvdump=TRUE you get another error:

Error in .local(conn, statement, ...) : 
Unable to execute statement 'COPY 15 RECORDS INTO mytable FROM '/var/folders/m4/yfyjwcpj6rv5nq730bl9vr1h000...'.
Server says 'COPY INTO: no such table 'mytable'' [#42S02].

I could always avoid to use csvdump but I need to write big tables...

I've tried also with copy_to & dplyr , but I get errors as well.

Not surprisingly, seen the problems with csvdump I got similar errors also using monetdb.read.csv : you get error if you try to load a csv when the same name table exists in another schema.

Am I doing something wrong? Are there better way to do things? Is this a bug?

Any help is welcome.

As I needed to find a solution, I ended up re-writing / refactoring all the main standard functions in MonetDB.R.

As a simple example this is the code that substitutes dbListTables :

fdblisttables <- function(conn, schema){ 
  q <- dbGetQuery(conn, paste0("SELECT s.name AS name FROM sys.tables AS s
                               JOIN sys.schemas AS t ON s.schema_id=t.id WHERE t.name='", schema,"';" ))
  if(is.null(q$name)) q <- character() else q <- q$name    
}

To be fair the real issue is that DBI doesn't have a syntax geared for schemas (but it would be good if it had...), so the programming hands of MonetDB.R library developers are somewhat tied.

I am conscious that schemas could be implemented in at least two ways:

  • implicitly, but it would require a way to get from the db what is the current schema , presently not available in monetdb (and possibly neither with other dbs)
  • explicitly, adding the keyword schema to the function call. In the end this is what I did as it was the simplest way. On the other hand it risks to create problems of compatibility with existing code etc.

As usual any suggestion and help is welcome. If there is interest I'll post the functions I created on a gist in github.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM