
Using for loops to match pairs of data frames in R

I want to merge pairs of data frames using a particular function, and to do this for many pairs of files in one directory. I am trying to write a 'for loop' that will do this job for me, and while related questions such as "Merge several data.frames into one data.frame with a loop" are helpful, I am struggling to adapt the example loops to this particular use.

My data frame files end with either "_df1.csv" or "_df2.csv". Each pair that I wish to merge into an output data frame has an identical number at the beginning of the file name (i.e. 543_df1.csv and 543_df2.csv).

I have created a character vector of file paths for each of the two types of file in my directory using the list.files command, as below:

# pattern is a regular expression, so anchor it to the end of the file name
df1files <- list.files(path="~/Desktop/combined files", pattern="_df1\\.csv$", full.names=TRUE, recursive=FALSE)
df2files <- list.files(path="~/Desktop/combined files", pattern="_df2\\.csv$", full.names=TRUE, recursive=FALSE)

The function and commands that I want to apply in order to merge each pair of data frames are as follows:

findRow <- function(dt, df) { min(which(df$datetime > dt )) }
rows <- sapply(df2$datetime, findRow, df=df1)
merged <- cbind(df2, df1[rows,])
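
To make the intent of these commands concrete, here is a small sketch of what they do on two made-up data frames. The column names value1 and value2 and the numeric times are assumptions for illustration only; the real files presumably hold proper datetimes.

```r
# Toy data, invented for this example; numeric times stand in for real datetimes.
df1 <- data.frame(datetime = c(10, 20, 30), value1 = c("a", "b", "c"))
df2 <- data.frame(datetime = c(5, 15, 25), value2 = c("x", "y", "z"))

# For one time dt, return the index of the first row of df whose datetime is strictly later.
findRow <- function(dt, df) { min(which(df$datetime > dt)) }

# One df1 row index per row of df2.
rows <- sapply(df2$datetime, findRow, df = df1)

# Put each df2 row next to the df1 row that follows it in time.
merged <- cbind(df2, df1[rows, ])
```

Note that if a time in df2 is later than every time in df1, which() returns an empty vector and min() returns Inf with a warning, so that case may need handling separately.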

I am now trying to incorporate these commands into a for loop starting with something along the following lines, to prevent me from having to manually merge the pairs:

for(i in 1:length(df2files)){ ……

I am not yet a strong R programmer, and have hit a wall, so any help would be greatly appreciated.

My intuition (which I haven't had a chance to check) is that you should be able to do something like the following:

# read in the data as two lists of dataframes:
dfs1 <- lapply(df1files, read.csv)
dfs2 <- lapply(df2files, read.csv)

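# define your merge commands as a function
merge2 <- function(df1, df2){
    # index of the first row of df whose datetime is later than dt
    findRow <- function(dt, df) { min(which(df$datetime > dt )) }
    rows <- sapply(df2$datetime, findRow, df=df1)
    # return the merged data frame visibly (a bare assignment returns it invisibly)
    cbind(df2, df1[rows,])
}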

# apply that merge function to the two lists of data frames, pair by pair
mergeddfs <- mapply(merge2, dfs1, dfs2, SIMPLIFY=FALSE)

# write results to files (only the file-name suffix is replaced, not other parts of the path)
outfilenames <- sub("_df1\\.csv$", "_merged.csv", df1files)
invisible(mapply(function(x,y) write.csv(x,y), mergeddfs, outfilenames))
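
Since the question asks for a for loop, the same idea can also be sketched as an explicit loop that finds each partner file from its name rather than relying on the two file lists being in the same order. This is a minimal sketch, not a tested solution for the real data: the temporary directory, the toy files, the "_merged.csv" output name and the merged_list variable are all assumptions made so the example runs on its own.

```r
# Set up a throwaway directory with two toy file pairs (invented data).
dir <- file.path(tempdir(), "combined_files_demo")
dir.create(dir, showWarnings = FALSE)
for (id in c("543", "77")) {
  write.csv(data.frame(datetime = c(10, 20, 30), a = 1:3),
            file.path(dir, paste0(id, "_df1.csv")), row.names = FALSE)
  write.csv(data.frame(datetime = c(5, 15), b = c("x", "y")),
            file.path(dir, paste0(id, "_df2.csv")), row.names = FALSE)
}

df2files <- list.files(dir, pattern = "_df2\\.csv$", full.names = TRUE)

# Index of the first row of df whose datetime is strictly later than dt.
findRow <- function(dt, df) min(which(df$datetime > dt))

merged_list <- list()
for (f2 in df2files) {
  # The partner file shares the same prefix and ends in _df1.csv.
  f1 <- sub("_df2\\.csv$", "_df1.csv", f2)
  if (!file.exists(f1)) next  # skip files that have no partner

  df1 <- read.csv(f1)
  df2 <- read.csv(f2)

  rows <- sapply(df2$datetime, findRow, df = df1)
  merged <- cbind(df2, df1[rows, ])

  # Name the output after the shared prefix, e.g. 543_merged.csv.
  id <- sub("_df2\\.csv$", "", basename(f2))
  write.csv(merged, file.path(dir, paste0(id, "_merged.csv")), row.names = FALSE)
  merged_list[[id]] <- merged
}
```

Deriving the df1 file name from the df2 file name means an unpaired file is skipped instead of silently shifting every later pair out of alignment.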
