I have several .csv files of data stored in a directory, and I need to import all of them into R.
Each .csv has two columns when imported into R. However, the 1001st row needs to be stored as a separate variable for each of the .csv files (it corresponds to an expected value which was stored here during the simulation; I want it to be outside of the main data).
So far I have the following code to import my .csv files as matrices.
#Load all .csv in directory into list
dataFiles <- list.files(pattern="*.csv")
for(i in dataFiles) {
#read all of the csv files
name <- gsub("-",".",i)
name <- gsub(".csv","",name)
i <- paste(".\\",i,sep="")
assign(name,read.csv(i, header=T))
}
This produces several matrices with the naming convention "sim_data_L_mu" where L and mu are parameters from the simulation. How can I remove the 1001st row (which has a number in the first column, and the second column is null) from each matrix and store it as a variable named "sim_data_L_mu_EV"? The main problem I have is that I do not know how to call all of the newly created matrices in my for loop.
Couldn't post long code in comments so am writing here:
# Use dialog to select folder
# Full names are required to access files that are not in the current working directory
file_list <- list.files(path = choose.dir(), pattern = "*.csv", full.names = T)
big_list <- lapply(file_list, function(z){
df <- read.csv(z)
scalar <- df[1000,1]
return(list(df, scalar))
})
To access the scalar value from the third file, you can use
big_list[[3]][2]
The elements in big_list
follow the order of file_list
so you always know which file the data comes from.
If you use data.table::fread()
instead of read.csv
, you can play around with assigning column names, selecting which rows/columns to read etc. It's also considerably faster for large datafiles.
Hope this helps!
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.