Using read.table()

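I am trying to open a file with read.table() inside a for loop. When I pass the path variable to read.table(), the path changes: the directory part is dropped. I searched for similar questions and did not find a related case.
Code: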

directories <- list.dirs('directory_path', recursive = T)
for (directory in 1:length(directories)){
  list <- list("File_0", "File_1")
  for(file in 1:length(list)){
    directory = directories[directory]
    file = paste(directory, list[file], sep = '/')
    read.table(file, colClasses = c(rep("character", 2), rep("NULL", 1)),
               header = T)
    output_path <- paste(directory, file, sep = '/')
    write.table(data, output_path, sep = '\t', quote = FALSE)
  }
}

If I remove the read.table() call and type print(file) instead, all the paths are printed correctly.

Contents of the file I want to open:

name    column_1    column_2
BME_RS00005 878 878
BME_RS00010 257 257
BME_RS00020 2511    2511
BME_RS00025 2611    2611
BME_RS00030 3886    3886
BME_RS17490 1494    1494
BME_RS00035 5922    5922
BME_RS00040 265 265
BME_RS00045 220 220

What should I change?

I infer from your code that your directory structure looks like this:

├── directory_1
│   ├── File_0
│   └── File_1
├── directory_2
│   ├── File_0
│   └── File_1
├── directory_3
│   ├── File_0
│   └── File_1

The best approach is to put all the file paths into a vector before iterating over them:

directories <- list.dirs(directory_path, recursive = TRUE)
files  <- c("File_0", "File_1")

full_paths  <- as.character(
    sapply(files, function(x) paste0(directories, "/", x))
)
full_paths

# [1] "directory_1/File_0" "directory_2/File_0" "directory_3/File_0" "directory_1/File_1"
# [5] "directory_2/File_1" "directory_3/File_1"
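
One caveat with the snippet above: with recursive = TRUE, list.dirs() also returns the top-level directory itself, so some of the pasted paths may not exist. A minimal sketch of one way to handle that, assuming a throwaway tree under tempdir() (the root location and the sample file contents are hypothetical stand-ins for your data):

```r
# Build a small throwaway tree to demonstrate (hypothetical location and names)
root <- file.path(tempdir(), "read_table_demo")
unlink(root, recursive = TRUE)
for (d in c("directory_1", "directory_2")) {
  dir.create(file.path(root, d), recursive = TRUE)
  for (f in c("File_0", "File_1")) {
    writeLines(c("name\tcolumn_1\tcolumn_2", "BME_RS00005\t878\t878"),
               file.path(root, d, f))
  }
}

directories <- list.dirs(root, recursive = TRUE)  # includes root itself
files <- c("File_0", "File_1")

# Every directory/file combination, then keep only the paths that exist
candidates <- as.vector(outer(directories, files, file.path))
full_paths <- candidates[file.exists(candidates)]
```

Filtering with file.exists() drops the combinations for the top-level directory, which contains no File_0 or File_1 in this demo.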

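Now that you have a vector of files, you can read them in.

You could probably do the next part with lapply, but I was not sure what you were doing in the loop. Now that you have updated the question to say you want to drop a column, just do the following: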


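for (infile in full_paths) {
    df <- read.table(
        infile,
        # read the first two columns as character and skip the third ("NULL")
        colClasses = c(rep("character", 2), rep("NULL", 1)),
        header = TRUE
    )
    # ... do stuff here
    # column_2 is already skipped by colClasses above, so this is only a safeguard
    df[["column_2"]] <- NULL
    outfile <- paste0(infile, "_new")
    write.table(df, outfile, sep = '\t', quote = FALSE)
}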
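
As for why the directory disappeared in the original loop, a likely cause is that the loop variable directory is overwritten inside the inner loop: it starts as a numeric index, becomes a path string on the first pass, and is then used as an index again on the second pass. A minimal sketch of that effect, with a hypothetical two-element vector standing in for the list.dirs() output:

```r
directories <- c("directory_1", "directory_2")  # stand-in for list.dirs() output

directory <- 1                                  # loop index, as in the question
directory <- directories[directory]             # now the string "directory_1"
first_path <- paste(directory, "File_0", sep = "/")

# Second inner iteration: indexing an unnamed vector by a string yields NA
directory <- directories[directory]
second_path <- paste(directory, "File_1", sep = "/")

print(first_path)
print(second_path)
```

Keeping the index and the path in separate variables (or iterating over the paths directly, as in the loop above) avoids this. Note also that the original code never assigns the result of read.table(), so data in write.table() does not refer to the table that was read.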

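You could consider a different approach without any loops. This solution should pick up all the files you want from every directory inside the "main" directory: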

# first, get all the directories in the main dir
list_dir <- list.dirs("...\\directory", recursive = TRUE)
# and the files you need
list_files <- c('File_0.txt', 'File_1.txt')

# then you create ALL the combinations of files and directories, in a vector
files_dir <- expand.grid(list_dir, list_files)
files_dir <- paste(files_dir$Var1, files_dir$Var2,sep = '/')

# lapply a function that reads the file if it exists in the directory; if not,
# it leaves an empty (NULL) element in the list
list_of_file <- lapply(files_dir, function(x) if (file.exists(x)) read.table(x, header = TRUE))

# remove the empty elements (keep only the entries that are not NULL)
list_of_file <- list_of_file[!vapply(list_of_file, is.null, logical(1))]

# finally you can do whatever you need, for example remove one specific column
# from each data.frame
lapply(list_of_file, function(x) { x["column_1"] <- NULL; x })
# or, if you need to use an index
lapply(list_of_file, function(x) { x[[3]] <- NULL; x })
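
To see the filter-then-trim steps in isolation, here is a minimal sketch on in-memory data. The three-element list is a hypothetical stand-in for what lapply() returns when one of the paths does not exist, and Filter() with Negate(is.null) is just one of several ways to drop the NULL entries:

```r
# Hypothetical stand-in for the lapply() result: two tables and one NULL
list_of_file <- list(
  data.frame(name = "BME_RS00005", column_1 = 878, column_2 = 878),
  NULL,
  data.frame(name = "BME_RS00010", column_1 = 257, column_2 = 257)
)

# Drop the NULL entries, then remove column_2 from every remaining table
list_of_file <- Filter(Negate(is.null), list_of_file)
trimmed <- lapply(list_of_file, function(x) { x[["column_2"]] <- NULL; x })
```

The result of lapply() has to be assigned (here to trimmed); otherwise the modified tables are only printed and then discarded.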

If you need to save them:

# first you have to give them some names, in this case a number
names(list_of_file) <- seq_along(list_of_file)

# then you can save them all with mapply
# to avoid printing anything to the console, wrap it in invisible()
invisible(
    mapply(
        write.table,
        x = list_of_file,
        file = paste0("...\\directory\\", names(list_of_file), ".txt")
    )
)
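
If you also want tab-separated output without quotes or row names, the extra arguments can be passed through a small wrapper function. A minimal sketch, assuming a hypothetical output folder under tempdir() and two made-up tables named a and b:

```r
# Hypothetical output location and example tables
out_dir <- file.path(tempdir(), "read_table_out")
dir.create(out_dir, showWarnings = FALSE)
tables <- list(
  a = data.frame(name = "BME_RS00005", column_1 = 878),
  b = data.frame(name = "BME_RS00010", column_1 = 257)
)

# One output file per table, named after the list element
out_files <- file.path(out_dir, paste0(names(tables), ".txt"))
invisible(Map(function(df, path) {
  write.table(df, path, sep = "\t", quote = FALSE, row.names = FALSE)
}, tables, out_files))

# Read one file back to check that it round-trips
round_trip <- read.table(out_files[1], header = TRUE, sep = "\t")
```

Map() is used here instead of mapply() only because it always returns a list; either works for this purpose.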
        
