
R: Import data from multiple file paths and store in a list

I have some data that I would like to read in from separate .txt files based on individual paths. A sample folder structure with txt files can be downloaded here.

The (sample) data.frame I have looks like this:

data <-   structure(list(Name = structure(c(1L, 3L, 4L, 2L), .Label = c("Test1", "Test10", "Test2", "Test6"), class = "factor"), Metadata = structure(c(3L, 4L, 1L, 2L), .Label = c("asdajl7", "asfhas", "sgash", "uashas8"), class = "factor"), Filepath = structure(c(2L, 3L, 4L, 1L), .Label = c("", "Folder1/File8.txt", "Folder7/file2.txt", "Folder9/File19.txt"), class = "factor")), .Names = c("Name", "Metadata", "Filepath"), class = "data.frame", row.names = c(NA, -4L))

data
    Name Metadata           Filepath
1  Test1    sgash  Folder1/File8.txt
2  Test2  uashas8  Folder7/file2.txt
3  Test6  asdajl7 Folder9/File19.txt
4 Test10   asfhas

To make the example reproducible, I tried to implement the following function, which adjusts the file path to the location where you saved the folder structure from the download above.

# Choose path to unzipped Data directory (macOS only: relies on osascript)
choose.dir <- function() {
  system("osascript -e 'tell app \"R\" to POSIX path of (choose folder with prompt \"Choose Data Folder:\")' > /tmp/R_folder",
         intern = FALSE, ignore.stderr = TRUE)
  p <- system("cat /tmp/R_folder && rm -f /tmp/R_folder", intern = TRUE)
  # Return NA if no folder was chosen
  if (length(p)) p[1] else NA
}
a <- choose.dir()
if (is.na(a)) stop("No folder", call. = FALSE)
# Paste the ready-to-use path together
data$completepath <- paste0(a, "/", data$Filepath)
# Collapse any doubled slashes
data$completepath <- gsub("//", "/", data$completepath)

The data.frame now looks like this (I unzipped the folder structure to my Desktop):

data
    Name Metadata           Filepath                               completepath
1  Test1    sgash  Folder1/File8.txt  /Users/XYZ/Desktop/Data/Folder1/File8.txt
2  Test2  uashas8  Folder7/file2.txt  /Users/XYZ/Desktop/Data/Folder7/file2.txt
3  Test6  asdajl7 Folder9/File19.txt /Users/XYZ/Desktop/Data/Folder9/File19.txt
4 Test10   asfhas                                      /Users/XYZ/Desktop/Data/

How could I read in the data from the different .txt files using a loop so that I get the following list structure?

   List with 3 elements
        1.1 (5 observations and 5 Variables)
        $Name chr[1:5] Test1 Test1 Test1 Test1 Test1
        $Year num[1:5] 1783 1784 1785 1786 1787
        $data1 num[1:5] 12 53 13.1 12.9 16
        $data2 num[1:5] 56 5 532 27 9
        $data3 num[1:5] 0.1 9 42 2 13
        1.2 (4 observations and 3 variables)
        $Name chr[1:4] Test2 Test2 Test2 Test2
        $Year num[1:4] 1387 1388 1389 1390
        $data num[1:4] 78.9 27 12.3 0.9
        1.3 (3 observations and 3 variables)
        $Name chr[1:3] Test6 Test6 Test6
        $Test1 chr[1:3] hajshf asfhah ashsa
        $Year num[1:3] 2001 2002 2003

What I tried is the following, but it doesn't work because the empty file path of Test10 causes problems. Can someone help me?

# read in the data
f <- file.path(data$completepath)
d <- lapply(f, read.table)

You don't have to do it this way; you could write the path out instead:

setwd("/home/christie/Downloads/Data/")

This gives the paths, relative to the working directory, of every file in it and its subfolders:

files <- list.files(getwd(), recursive = TRUE)
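If you would rather not change the working directory, a variant along these lines might work. It is only a sketch: `root` and the two small files written into it are hypothetical stand-ins for the unzipped Data folder, created in a temporary directory so the snippet runs on its own.

```r
# Hypothetical stand-in for the unzipped Data folder (not the real download).
root <- tempfile("Data")
dir.create(file.path(root, "Folder1"), recursive = TRUE)
dir.create(file.path(root, "Folder7"), recursive = TRUE)
writeLines(c("year data1", "1783 12", "1784 53"),
           file.path(root, "Folder1", "File8.txt"))
writeLines(c("year data", "1387 78.9"),
           file.path(root, "Folder7", "file2.txt"))

# pattern keeps only names ending in .txt; full.names = TRUE prepends `root`
# to each result, so no setwd() call is needed before reading the files.
files <- list.files(root, pattern = "\\.txt$", recursive = TRUE,
                    full.names = TRUE)
basename(files)
```

Because the returned paths include the folder itself, they can be passed to `read.table` from any working directory.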

This reads them all into a list:

d <- lapply(files, function(x) read.table(x, header = TRUE))

str(d)
List of 3
 $ :'data.frame':   5 obs. of  4 variables:
  ..$ data1: num [1:5] 12 53 13.1 12.9 16
  ..$ data2: int [1:5] 56 5 532 27 9
  ..$ data3: num [1:5] 0.1 9 42 2 13
  ..$ year : int [1:5] 1783 1784 1785 1786 1787
 $ :'data.frame':   4 obs. of  2 variables:
  ..$ data: num [1:4] 78.9 27 12.3 0.9
  ..$ year: int [1:4] 1387 1388 1389 1390
 $ :'data.frame':   3 obs. of  3 variables:
  ..$ data1: int [1:3] 18 39 371
  ..$ Test1: Factor w/ 3 levels "asfhah","ashsa",..: 3 1 2
  ..$ Year : int [1:3] 2001 2002 2003
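Note that reading every file in the folder does not carry over the Name column the question asks for. If you want to keep using the paths stored in the data.frame, one possible approach is to drop the rows with an empty Filepath before reading, then add the matching Name to each table. The following is a minimal sketch under assumptions: `root` and the file contents are hypothetical stand-ins for the downloaded folder, and the data.frame is a shortened version of the one in the question.

```r
# Hypothetical stand-in for the downloaded folder structure.
root <- tempfile("Data")
dir.create(file.path(root, "Folder1"), recursive = TRUE)
dir.create(file.path(root, "Folder7"), recursive = TRUE)
writeLines(c("Year data1", "1783 12", "1784 53"),
           file.path(root, "Folder1", "File8.txt"))
writeLines(c("Year data", "1387 78.9"),
           file.path(root, "Folder7", "file2.txt"))

# Shortened version of the question's data.frame, with one empty path.
data <- data.frame(
  Name = c("Test1", "Test2", "Test10"),
  Filepath = c("Folder1/File8.txt", "Folder7/file2.txt", ""),
  stringsAsFactors = FALSE
)

# Keep only the rows that actually point at a file.
# as.character() also covers the factor columns used in the question.
has_file <- nzchar(as.character(data$Filepath))
paths <- file.path(root, as.character(data$Filepath)[has_file])
test_names <- as.character(data$Name)[has_file]

# Read each file and put the matching Name in front as its first column.
d <- Map(function(path, name) {
  tab <- read.table(path, header = TRUE)
  data.frame(Name = name, tab, stringsAsFactors = FALSE)
}, paths, test_names)
names(d) <- test_names

str(d)
```

Filtering with `nzchar` avoids passing a bare directory path to `read.table`, which is what the empty Filepath of Test10 turns into after the folder is pasted in front of it.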
