Lists and matrix using sapply

I have a perhaps basic question, and I have searched the web. I have a problem reading files. I did manage to read my files by following @Konrad's suggestions, which I appreciate: How to get R to read in files from multiple subdirectories under one large directory?

It is a similar problem; however, I have not resolved it.

My problem:

I have a large number of files with the same name ("tempo.out") in different folders. Each tempo.out has 5 columns with headers, and all files share the same format, with 1048 lines and 5 columns:

id X Y time temp

setwd("~/Documents/ewat")
dat.files  <- list.files(path="./ress",
                 recursive=T,
                 pattern="tempo.out"
                 ,full.names=T)
readDatFile <- function(f) {
dat.fl <- read.table(f)  
 }

data.filesf <- sapply(dat.files, readDatFile)                         

# I might not have the right syntax in subs5:
subs5 <- sapply(data.filesf,`[`,5) 
matr5 <- do.call(rbind, subs5)   

probs <- c(0.05,0.1,0.16,0.25,0.5,0.75,0.84,0.90,0.95,0.99)
q <- rowQuantiles(matr5, probs=probs)
print(q)

I want to extract the fifth column (temp) from each of those thousands of files and make calculations such as quantiles.

I first tried to read all the files under "ress".

That gave no error, but my main problem is that "data.filesf" is a list rather than a matrix, and the 5th column is not what I expected. Then the following:

matr5 <- do.call(rbind, subs5)

also does not give the required values/results.

What would be the best way to get the columns into what will become a huge matrix?

Try lapply(data.filesf, `[`, , 5). Hope this helps.
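To see why the sapply result in the question is not a plain list of data frames, here is a minimal sketch that uses two made-up data frames in place of the real files (the values and the helper make_df are invented for illustration). sapply tries to simplify its result, and because each data frame is itself a list of 5 columns, the outcome is a 5 x 2 matrix of mode list. lapply keeps one data frame per element, so `[` with an empty row index can pick column 5.

```r
# Two toy data frames standing in for the contents of two tempo.out files
# (column names follow the question; the values are invented).
make_df <- function(offset) {
  data.frame(id = 1:3, X = 0, Y = 0, time = 1:3,
             temp = offset + c(0.5, 1.5, 2.5))
}
inputs <- c(a = 10, b = 20)

# sapply simplifies: each data frame is a list of 5 columns, so the result
# becomes a 5 x 2 matrix whose cells are whole columns (mode "list").
simplified <- sapply(inputs, make_df)
dim(simplified)
is.list(simplified)

# lapply does not simplify: one data frame per element.
frames <- lapply(inputs, make_df)

# Empty row index, column 5: returns the temp column of each data frame.
temps <- lapply(frames, `[`, , 5)

# Bind the columns side by side: one row per data line, one column per file.
temp_matrix <- do.call(cbind, temps)
```

Using cbind rather than rbind here keeps each file as a column, which is the layout a per-row quantile calculation expects.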

Consider extending your defined function, readDatFile, to extract the fifth column, temp, and build the matrix directly with sapply or vapply (vapply fits because you know the needed structure ahead of time: each file yields a numeric vector whose length equals the number of rows, 1048). Then run the needed rowQuantiles:

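library(matrixStats)  # provides rowQuantiles()

setwd("~/Documents/ewat")

# Anchor and escape the pattern so only files named exactly tempo.out match
dat.files  <- list.files(path="./ress",
                         recursive=TRUE,
                         pattern="^tempo\\.out$",
                         full.names=TRUE)

# header=TRUE assumes the first line of each file holds the column names
readDatFile <- function(f) read.table(f, header=TRUE)$temp  # OR USE read.table(f, header=TRUE)[[5]]

# One column per file, one row per data line
matr5 <- sapply(dat.files, readDatFile, USE.NAMES=FALSE)
# matr5 <- vapply(dat.files, readDatFile, numeric(1048), USE.NAMES=FALSE)

probs <- c(0.05,0.1,0.16,0.25,0.5,0.75,0.84,0.90,0.95,0.99)
q <- rowQuantiles(matr5, probs=probs)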
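To try this approach without the real data, the sketch below writes three small made-up files into a temporary directory, reads the temp column of each with vapply, and computes per-row quantiles with base R's apply and quantile in place of rowQuantiles, so it runs without the matrixStats package. The directory layout, the 4-row files, and the helper name read_temp are assumptions for illustration only.

```r
# Build a throwaway directory tree with one tempo.out per subfolder.
root <- file.path(tempdir(), "ress_demo")
n_rows <- 4
for (i in 1:3) {
  sub <- file.path(root, paste0("run", i))
  dir.create(sub, recursive = TRUE, showWarnings = FALSE)
  toy <- data.frame(id = seq_len(n_rows), X = 0, Y = 0,
                    time = seq_len(n_rows),
                    temp = i * 10 + seq_len(n_rows) / 2)
  write.table(toy, file.path(sub, "tempo.out"),
              row.names = FALSE, quote = FALSE)
}

files <- list.files(root, pattern = "^tempo\\.out$",
                    recursive = TRUE, full.names = TRUE)

# Hypothetical helper: read one file and return its temp column.
read_temp <- function(f) read.table(f, header = TRUE)$temp

# vapply checks that every file yields exactly n_rows numeric values.
temp_matrix <- vapply(files, read_temp, numeric(n_rows), USE.NAMES = FALSE)
dim(temp_matrix)  # n_rows rows, one column per file

# Quantiles across files for each row; t() puts one row per data line,
# matching the layout that rowQuantiles returns.
probs <- c(0.05, 0.5, 0.95)
q <- t(apply(temp_matrix, 1, quantile, probs = probs))
```

With vapply, a file that has a different number of rows or a non-numeric temp column stops the run with an error instead of silently turning the result into a list, which is the situation described in the question.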
