I have a question regarding the use of a (Postgre)SQL database in R. Much of the documentation on this topic stresses that it only makes sense to use an SQL database in R if you are dealing with big data that doesn't fit in your RAM (e.g. see here and here). My situation is different, and I want to find out whether using a (Postgre)SQL database would be a reasonable decision. Here's my situation:
I'm well into an ecological research study in which I analyse roe deer GPS data at different sampling intervals (5 min and 3 h) over a span of about 2 years. In addition, I integrate two-axis acceleration data at a sampling interval of 4 minutes.
To evaluate the behaviour of the roe deer in regard to humans, I analyse this multidimensional data by comparing it to GPS data of human beings taken at a sampling interval of 5 seconds.
To date, I've been doing this analysis using dataframe/datatable with dplyr. When merging all the data into one dataset, the resulting datatable becomes really wide. The columns include: Timestamp, ID, X/Y Positions, DOP and so forth of both humans and roe deer, plus all the resulting calculated values like distance, speed, elevation, proximity and lots more.
Also, the data is immensely long: the positions of multiple roe deer and multiple humans are recorded simultaneously (a many-to-many relationship), which leads to many repetitions in the dataframe. On top of that, the different sampling intervals of humans and roe deer lead to further repetition (of the roe deer positions).
I'm hoping that with a database solution, I can
Would you recommend using a database in my case? Would using a database solution help achieve the goals as described above?
PostgreSQL offers all the protections of an ACID-compliant database.
I use both R and PostgreSQL for work. To be honest, I prefer most things to be in the database.
For your many-to-many data join, database normalization may help: each position is stored once in its own table, and the tables are joined only when a query needs them.
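As a rough illustration of the normalization idea, here is a minimal sketch in which each fix is stored once and the deer/human pairing is built only at query time. The table and column names (`deer_fix`, `human_fix`, `ts`, ...) and the 300-second matching window are hypothetical, and an in-memory SQLite database (via the RSQLite package) stands in for PostgreSQL so that the sketch runs without a server.

```r
library(DBI)

# Assumption: in-memory SQLite stands in for PostgreSQL so this runs
# without a server. All table and column names are hypothetical.
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# One table per entity; every fix is stored exactly once.
dbWriteTable(con, "deer_fix", data.frame(
  deer_id = c("d1", "d2"), ts = c(0, 0),
  x = c(0, 100), y = c(0, 100)
))
dbWriteTable(con, "human_fix", data.frame(
  human_id = c("h1", "h1", "h2"), ts = c(0, 5, 0),
  x = c(3, 6, 50), y = c(4, 8, 50)
))

# The many-to-many pairing exists only in the query result: each human
# fix is matched to the deer fixes from the preceding 300 seconds.
pairs <- dbGetQuery(con, "
  SELECT d.deer_id, h.human_id, h.ts,
         (d.x - h.x) * (d.x - h.x) + (d.y - h.y) * (d.y - h.y) AS dist_sq
  FROM deer_fix d
  JOIN human_fix h ON h.ts BETWEEN d.ts AND d.ts + 299
  ORDER BY d.deer_id, h.human_id, h.ts")

# The square root is taken in R, since the SQLite stand-in may lack sqrt().
pairs$distance <- sqrt(pairs$dist_sq)

dbDisconnect(con)
```

With this layout, the repeated roe deer positions never have to be stored; they appear only in the rows a query returns.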
Selecting only the relevant columns from PostgreSQL and filtering the rows in the query may also help, because only the data you need is loaded into R. More information on select queries can be found here: Ref PostgreSQL select tutorial
For example, run a query such as
SELECT column1, column3 FROM example_table WHERE x = y
and read the result into a data set.
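A minimal sketch of such a column-and-row selection from R might look like the following. The table `deer_fix`, its columns and the DOP threshold are hypothetical, and an in-memory SQLite database (RSQLite) is used as a stand-in for PostgreSQL so the example runs without a server. The placeholder syntax depends on the driver; RSQLite accepts `?`.

```r
library(DBI)

# Assumption: in-memory SQLite stands in for PostgreSQL here.
# Table and column names are hypothetical.
con <- dbConnect(RSQLite::SQLite(), ":memory:")

dbWriteTable(con, "deer_fix", data.frame(
  deer_id = c("d1", "d1", "d2"),
  ts      = c(0, 300, 0),
  x       = c(1, 2, 3),
  y       = c(4, 5, 6),
  dop     = c(1.2, 8.5, 2.0)
))

# Only the needed columns and rows are transferred into R;
# the filter values are passed as parameters, not pasted into the SQL.
good <- dbGetQuery(
  con,
  "SELECT deer_id, ts, x, y FROM deer_fix WHERE deer_id = ? AND dop < ?",
  params = list("d1", 5)
)

dbDisconnect(con)
```

Here `good` is an ordinary data frame, so it can be passed straight on to dplyr or data.table.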
A database is better suited to storing and handling data, while R is better suited to analysing it.
If you want to take a look at the commands for calling PostgreSQL from R, you could look at this article from Google.
Ref RPostgreSQL
Example
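```
library(RPostgreSQL)

# Load the PostgreSQL driver
drv <- dbDriver("PostgreSQL")

# Open a connection
con <- dbConnect(drv, dbname = "R_Project")

# Submit a statement
rs <- dbSendQuery(con, "select * from R_Users")

# Fetch all rows into a data frame, then release the result and the connection
users <- fetch(rs, n = -1)
dbClearResult(rs)
dbDisconnect(con)
```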
All the best