I'm running sqldf in R on Ubuntu to select certain IDs from a big table with gigabytes of data, and the process creates temporary etilqs files under /var/tmp (confirmed by watching file changes with inotifywait). However, my /var/tmp is on a small disk, which occasionally causes R to error out. I found a thread on how to change the temp folder location for SQLite on Windows, but I could not figure out how to make it work under Linux.
library(sqldf)
customer_extr <- sqldf("select b.*, a.year, a.name from product as b left join customer as a on a.ID = b.ID", dbname = "/home/userName/customer.db")
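It seems to me that sqlite picks the location for its temporary files (NOT R's tempfile(), where I can choose the location with tmpdir=) by searching directories in the following order: (1) the directory set by PRAGMA temp_store_directory or the sqlite3_temp_directory global variable, (2) the SQLITE_TMPDIR environment variable, (3) the TMPDIR environment variable, (4) /var/tmp, (5) /usr/tmp, (6) /tmp, (7) the current working directory.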
I tried a few options but none of them seemed to work:
set temp_store_directory:
con <- dbConnect(dbDriver("SQLite"), dbname = "/home/userName/customer.db")
dbGetQuery(con, "PRAGMA temp_store_directory = '/mnt/tmp'")
But this errors out:
Error in rsqlite_send_query(conn@ptr, statement) : basic_string::resize
temp_store_directory is currently not set, judging by
Sys.getenv('temp_store_directory')
Before running R, I set the environmental variables to the desired temp folder: /mnt/tmp:
export SQLITE_TMPDIR=/mnt/tmp
export TMPDIR=/mnt/tmp
I verified this has been successfully set up by
echo $SQLITE_TMPDIR
echo $TMPDIR
under Linux,
Sys.getenv('SQLITE_TMPDIR')
Sys.getenv('TMPDIR')
in R.
However, my sqldf step still writes etilqs files to /var/tmp.
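One thing that may be worth ruling out when the variables are set but apparently ignored: as far as I know, SQLite on Unix only accepts a candidate temp directory if it exists and is writable and searchable by the current user, and otherwise moves on to the next candidate, which could explain a fallback to /var/tmp. The following is a minimal shell sketch of that kind of check, not SQLite's own logic; check_tmp_dir is a hypothetical helper name, and /mnt/tmp is the target directory from above.

```shell
#!/bin/sh
# Hypothetical helper: report whether a directory looks usable as a temp
# location, i.e. it exists, is a directory, and is writable and searchable
# by the current user.
check_tmp_dir() {
    dir="$1"
    if [ -d "$dir" ] && [ -w "$dir" ] && [ -x "$dir" ]; then
        echo "usable: $dir"
        return 0
    fi
    echo "not usable: $dir"
    return 1
}

# /mnt/tmp is the directory from the question; run this as the same user
# that runs R, since permissions are evaluated per user.
check_tmp_dir /mnt/tmp || echo "fix permissions or pick another directory"
```

If this reports the directory as not usable, the environment variables may be set correctly and still have no effect.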
I tried to run
dbGetQuery(con, "PRAGMA temp_store = 2")
to instruct sqlite to save temporary files in memory. However, it's still writing etilqs files to /var/tmp.
I thought about creating a symbolic link so that /var/tmp points to /mnt/tmp, but to do that I think I would have to delete the folder /var/tmp first. This is not ideal since it's a shared Linux server and the disk for /mnt/tmp sometimes gets unmounted, and I am not sure whether this would cause trouble for other applications and users.
I don't know how to check/change the sqlite3_temp_directory global variable in R.
This is my session info:
> sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.5 LTS
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] sqldf_0.4-10 RSQLite_1.1 gsubfn_0.6-6 proto_1.0.0
loaded via a namespace (and not attached):
[1] DBI_0.5-1 memoise_1.0.0 Rcpp_0.12.8 digest_0.6.10 chron_2.3-47
I can try upgrading my OS disk to a larger drive but isn't there a way to tell sqlite in R under Linux to write temporary files somewhere else? Any suggestions would be highly appreciated!
You can get R to use a different temporary directory; it respects several environment variables:
edd@max:~$ Rscript -e 'print(tempdir())' # default
[1] "/tmp/RtmpUdPCFL"
edd@max:~$ TMPDIR="." Rscript -e 'print(tempdir())' # overridden
[1] "./RtmpsJk2lP"
edd@max:~$
We would have to look at the sources of the RSQLite and/or sqldf packages to see whether they use their own settings or take the location from R. If it is the latter, as I suspect for at least sqldf, then you have a way.
But do remember to set TMPDIR (or similar) before you start R.
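A minimal sketch of one way to do that: set the variables only for the R process instead of exporting them for the whole login session. run_with_tmp is a hypothetical helper, /mnt/tmp is the directory from the question, and whether RSQLite/sqldf actually honour these variables remains the open point mentioned above.

```shell
#!/bin/sh
# Hypothetical helper: run a command with SQLITE_TMPDIR and TMPDIR
# pointing at the given directory, for that one process only.
run_with_tmp() {
    tmp_dir="$1"
    shift
    env SQLITE_TMPDIR="$tmp_dir" TMPDIR="$tmp_dir" "$@"
}

# Show what a child process sees.
run_with_tmp /mnt/tmp sh -c 'echo "SQLITE_TMPDIR=$SQLITE_TMPDIR TMPDIR=$TMPDIR"'

# Start R the same way (skipped if Rscript is not on the PATH).
if command -v Rscript >/dev/null 2>&1; then
    run_with_tmp /mnt/tmp Rscript -e 'print(tempdir())'
fi
```

Scoping the variables to one command keeps other users' and applications' temp files where they were, which matters on a shared server.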