How to call several variables in a for loop in R?
I have several .csv files of data stored in a directory, and I need to import all of them into R.
Each .csv has two columns when imported into R. However, the 1001st row needs to be stored as a separate variable for each of the .csv files (it corresponds to an expected value that was stored there during the simulation, and I want it kept outside the main data).
So far I have the following code to import my .csv files as matrices.
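# Load every .csv file in the working directory into its own variable
# (pattern is a regular expression, so the dot is escaped and anchored)
dataFiles <- list.files(pattern = "\\.csv$")
for (i in dataFiles) {
  # Build the variable name: replace "-" with "." and drop the extension
  name <- gsub("-", ".", i, fixed = TRUE)
  name <- sub("\\.csv$", "", name)
  # Read the file and bind the result to that name
  assign(name, read.csv(i, header = TRUE))
}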
This produces several matrices with the naming convention "sim_data_L_mu", where L and mu are parameters from the simulation. How can I remove the 1001st row (which has a number in the first column and a null second column) from each matrix and store it as a variable named "sim_data_L_mu_EV"? My main problem is that I do not know how to refer to all of the newly created matrices inside my for loop.
I couldn't post long code in a comment, so I'm writing it here:
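# Use a dialog to select the folder (choose.dir() is available on Windows only)
# Full names are required to access files that are not in the current working directory
file_list <- list.files(path = choose.dir(), pattern = "\\.csv$", full.names = TRUE)
big_list <- lapply(file_list, function(z) {
  df <- read.csv(z)
  # Row 1001 holds the expected value in its first column
  scalar <- df[1001, 1]
  # Drop that row so only the simulation data remains
  df <- df[-1001, ]
  return(list(df, scalar))
})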
To access the scalar value from the third file, you can use
big_list[[3]][[2]]
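Single and double brackets behave differently on a list, which matters when pulling the scalar out. The sketch below uses a made-up element (`element` is a hypothetical name, shaped like one entry of `big_list`) to show the difference:

```r
# A toy element shaped like one entry of big_list: a data frame and a scalar
element <- list(data.frame(x = 1:3), 0.25)

# Single bracket: returns a list of length 1 that contains the scalar
wrapped <- element[2]

# Double bracket: returns the scalar itself, ready for arithmetic
value <- element[[2]]

is.list(wrapped)   # TRUE
is.numeric(value)  # TRUE
```

Use the double-bracket form whenever the value is going to be used in a calculation.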
The elements in big_list follow the order of file_list, so you always know which file the data comes from.
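Because the order matches, the list can also be named after the files, which makes lookups less error-prone than counting positions. The following is only a sketch under assumptions: the folder, the two file names, and the file contents are invented stand-ins for the real simulation output, and the element names `data` and `EV` are my own choice rather than part of the code above.

```r
# Hypothetical stand-ins for the simulation output: two files in a temporary
# folder, each with 1000 data rows and the expected value in row 1001
demo_dir <- file.path(tempdir(), "sim_demo")
dir.create(demo_dir, showWarnings = FALSE)
for (f in c("sim_data_10_0-5.csv", "sim_data_20_0-1.csv")) {
  lines <- c("x,y", paste(1:1000, (1:1000) * 2, sep = ","), "0.25,")
  writeLines(lines, file.path(demo_dir, f))
}

file_list <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)

# Split each file into the main data and the expected value from row 1001
big_list <- lapply(file_list, function(z) {
  df <- read.csv(z)
  list(data = df[-1001, ], EV = df[1001, 1])
})

# Name each element after its file: strip folder and extension,
# then replace "-" with "." as the question does
base_names <- tools::file_path_sans_ext(basename(file_list))
names(big_list) <- gsub("-", ".", base_names, fixed = TRUE)

# Look up by name instead of by position
ev_first <- big_list[["sim_data_10_0.5"]]$EV

# Optionally create the standalone variables the question asks for
for (nm in names(big_list)) {
  assign(nm, big_list[[nm]]$data)
  assign(paste0(nm, "_EV"), big_list[[nm]]$EV)
}
```

Keeping everything in the named list is usually easier to loop over; the `assign()` loop at the end is only there for the case where separate variables such as `sim_data_10_0.5_EV` are really needed.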
If you use data.table::fread() instead of read.csv, you can play around with assigning column names, selecting which rows/columns to read, etc. It's also considerably faster for large data files.
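As a rough illustration of those fread() options, the sketch below reads the 1000 data rows and the expected-value row in two separate calls. It assumes the data.table package is installed; the temporary file, its contents, and the column names `step` and `value` are invented for the example, and the `skip` count assumes one header line followed by exactly 1000 data rows.

```r
library(data.table)

# Hypothetical stand-in for one simulation file: a header, 1000 data rows,
# then row 1001 with the expected value and an empty second column
tmp <- tempfile(fileext = ".csv")
writeLines(c("x,y", paste(1:1000, (1:1000) * 2, sep = ","), "0.5,"), tmp)

# Read only the 1000 data rows and assign column names while reading
sim <- fread(tmp, nrows = 1000, col.names = c("step", "value"))

# Read the expected-value line alone: skip the header plus 1000 data rows
ev_row <- fread(tmp, skip = 1001, nrows = 1, header = FALSE)
expected_value <- ev_row[[1]]
```

Reading the file twice costs a little extra I/O, but it keeps the expected value from ever entering the main table.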
Hope this helps!