
Reading/accessing/importing a database file into R

I hope everyone is doing well. I am trying to read/import a database file with a .db extension into R, but I have not been able to do so. I searched for related material but could not find an answer. The file holds Wikipedia article contents and is very large, so any help would be appreciated. I also tried the method from "Import .db file into R", but got the same error, and the proposed answer was difficult for me to understand as I am new to R.

library(ProjectTemplate)
x <- db.reader("wiki.db", "H:\\wiki.db", "wiki.db")
Error: could not find function "db.reader"

So, as suggested in the post above, I defined the function myself:

db.reader <- function(data.file, filename, variable.name)
{
  require.package('RSQLite')
  sqlite.driver <- dbDriver("SQLite")

  connection <- dbConnect(sqlite.driver,
                          dbname = filename)

  tables <- dbListTables(connection)
  for (table in tables)
  {
    message(paste('  Loading table:', table))
    data.parcel <- dbReadTable(connection,
                               table,
                               row.names = NULL)
    assign(clean.variable.name(table), data.parcel, envir = .TargetEnv)
  }
  disconnect.success <- dbDisconnect(connection)

  if (! disconnect.success)
  {
    warning(paste('Unable to disconnect from database:', filename))
  }
}

But now I get this error:

Loading table: FArevisionContentPlain

Error in  assign(clean.variable.name(table), data.parcel, envir = .TargetEnv)  
  could not find function "clean.variable.name"

Any help would be highly appreciated.

I don't have your .db file to test this with, but if you are having problems with the library, its source code is available here . Again, I cannot test this, but based on your comments above you could do the following:

my.db.reader <- function(data.file, filename, variable.name)
{
  # Load the packages directly; require.package() is a helper from the
  # original package and may not be available in a standalone script.
  library(DBI)
  library(RSQLite)

  connection <- dbConnect(RSQLite::SQLite(),
                          dbname = filename)

  tables <- dbListTables(connection)
  for (table in tables)
  {
    message(paste('  Loading table:', table))

    data.parcel <- dbReadTable(connection,
                               table,
                               row.names = NULL)

    # .TargetEnv is internal to the original package, so assign into the
    # global environment explicitly instead.
    assign(clean.variable.name(table),
           data.parcel,
           envir = .GlobalEnv)
  }

  disconnect.success <- dbDisconnect(connection)
  if (! disconnect.success)
  {
    warning(paste('Unable to disconnect from database:', filename))
  }
}
clean.variable.name <- function(variable.name)
{
  variable.name <- gsub('^[^a-zA-Z0-9]+', '', variable.name, perl = TRUE)
  variable.name <- gsub('[^a-zA-Z0-9]+$', '', variable.name, perl = TRUE)
  variable.name <- gsub('_+', '.', variable.name, perl = TRUE)
  variable.name <- gsub('-+', '.', variable.name, perl = TRUE)
  variable.name <- gsub('\\s+', '.', variable.name, perl = TRUE)
  variable.name <- gsub('\\.+', '.', variable.name, perl = TRUE)
  variable.name <- gsub('[\\\\/]+', '.', variable.name, perl = TRUE)
  variable.name <- make.names(variable.name)
  return(variable.name)
}

x <- my.db.reader("wiki.db", "H:\\wiki.db", "wiki.db")

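This is simply the source code that defines the two functions. The last line calls the function you just created with the arguments from your original question. Note that the function assigns each table to a variable named after the table rather than returning the data, so x itself will not contain the tables.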

clean.variable.name is a function in the ProjectTemplate package that cleans up the name of each table in your database so that it is a valid R variable name. For example, it turns a database table named data_table into data.table .
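To make that concrete, here is a small trace that walks one made-up table name through the same substitutions used in clean.variable.name. The sample name is purely hypothetical, chosen to hit several of the rules at once; only base R is needed.

```r
# Hypothetical table name with a leading underscore, a hyphen,
# an underscore and a trailing space.
name <- "_my-data_table "

name <- gsub('^[^a-zA-Z0-9]+', '', name, perl = TRUE)  # drop leading non-alphanumerics
name <- gsub('[^a-zA-Z0-9]+$', '', name, perl = TRUE)  # drop trailing non-alphanumerics
name <- gsub('_+', '.', name, perl = TRUE)             # underscores become dots
name <- gsub('-+', '.', name, perl = TRUE)             # hyphens become dots
name <- gsub('\\s+', '.', name, perl = TRUE)           # whitespace becomes dots
name <- gsub('\\.+', '.', name, perl = TRUE)           # collapse runs of dots
name <- make.names(name)                               # ensure a syntactically valid R name

print(name)  # "my.data.table"
```

The practical consequence is that the variable you end up with may be spelled differently from the table in the database, which is worth remembering when you go looking for it after loading.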

.TargetEnv is a variable within the ProjectTemplate package that points to .GlobalEnv (see here ). So if you paste db.reader into your own code, this variable will not be found.
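As a rough illustration of what the envir argument of assign does, the sketch below stores a data frame in a separate environment and reads it back. The names target and data_table are made up for the example, and the data frame stands in for a table read from the database.

```r
# An environment is a container for named objects; new.env() creates an empty one.
target <- new.env()

# assign() creates a variable with the given name inside the chosen environment.
assign("data_table", data.frame(id = 1:3), envir = target)

# The object lives in `target`, not in whatever environment called assign().
exists("data_table", envir = target, inherits = FALSE)  # TRUE
ls(target)                                              # "data_table"

# Retrieve it either with get() or with the $ shortcut.
loaded <- get("data_table", envir = target)
same   <- identical(loaded, target$data_table)

# Passing envir = .GlobalEnv instead would create the variable in the
# workspace, which is the effect the package gets through .TargetEnv.
```

Using a dedicated environment keeps the loaded tables from overwriting variables that already exist in your workspace.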

To prevent these errors, you can skip clean.variable.name and assign the tables into an environment of your own:

e <- new.env()

custom.db.reader <- function(data.file, filename, variable.name) {
  # Load the packages directly; require.package() is a helper from the
  # original package and may not be available in a standalone script.
  library(DBI)
  library(RSQLite)
  connection <- dbConnect(RSQLite::SQLite(), dbname = filename)
  tables <- dbListTables(connection)
  for (table in tables) {
    message(paste('  Loading table:', table))
    data.parcel <- dbReadTable(connection, table, row.names = NULL)
    # Store each table in e under its original database name.
    assign(table, data.parcel, envir = e)
  }
  disconnect.success <- dbDisconnect(connection)
  if (! disconnect.success) {
    warning(paste('Unable to disconnect from database:', filename))
  }
}

You can then access an imported table through that environment: if the database table is named data_table , it is available as e$data_table .
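Since the file in the question is large, it may help to inspect the database before loading everything. The sketch below is not tested against the real wiki.db. It builds a tiny throwaway SQLite file, with a made-up data_table, so that it runs on its own, and it assumes the DBI and RSQLite packages are installed. With the real file, db.path would instead point at something like "H:\\wiki.db".

```r
library(DBI)  # generic database interface; RSQLite supplies the SQLite driver

# Create a small throwaway database so the example is self-contained.
db.path <- tempfile(fileext = ".db")
con <- dbConnect(RSQLite::SQLite(), dbname = db.path)
dbWriteTable(con, "data_table", data.frame(id = 1:5, title = letters[1:5]))
dbDisconnect(con)

# Reconnect and look around before pulling any full table into memory.
con <- dbConnect(RSQLite::SQLite(), dbname = db.path)
table.names <- dbListTables(con)
row.count <- dbGetQuery(con, "SELECT COUNT(*) AS n FROM data_table")$n
preview <- dbGetQuery(con, "SELECT * FROM data_table LIMIT 2")

# Load every table into a dedicated environment, keyed by table name.
e <- new.env()
for (table in table.names) {
  assign(table, dbReadTable(con, table), envir = e)
}
dbDisconnect(con)

# The table is now an ordinary data frame.
str(e$data_table)
```

If a table turns out to be too big to read in one go, a query with LIMIT or a WHERE clause passed to dbGetQuery lets you pull only the rows you need.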
