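R: mapply over objects of two lists and return a list of data frames
I have two GRanges lists, and I am trying to apply the countOverlaps function to every combination of their elements and return a list of results. The example data: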
library(GenomicRanges)
gr1 <- GRanges(seqnames = c("chr1", "chr2"), ranges = IRanges(c(7,13), width = 3), strand = c("+", "-"))
gr2 <- GRanges(seqnames = c("chr1", "chr3"), ranges = IRanges(c(5,13), width = 3), strand = c("+", "-"))
grlA <- GRangesList("a" = gr1, "b" = gr2)
gr1 <- GRanges(seqnames = c("chr1", "chr2"), ranges = IRanges(c(1,13), width = 3), strand = c("+", "-"))
gr2 <- GRanges(seqnames = c("chr1", "chr3"), ranges = IRanges(c(3,13), width = 3), strand = c("+", "-"))
grlB <- GRangesList("c" = gr1, "d" = gr2)
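I want a list with one entry per element of grlB, where each entry holds the function results for objects "a" and "b" of grlA:
(a list of data frames c and d, each with columns $a and $b)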
$c
  a b
$d
  a b
This gets all combinations of the two lists:
comb_apply <- function(f,..., MoreArgs=list()){
exp <- unname(as.list(expand.grid(...,stringsAsFactors = FALSE)))
do.call(mapply, c(list(FUN=f, SIMPLIFY=FALSE, MoreArgs=MoreArgs), exp))
}
# This function is thanks to Michael Lawrence's help posted in the bioconductor package
t= comb_apply(function(i, j) countOverlaps(grlA[[i]], grlB[[j]]), seq_along(grlA), seq_along(grlB))
names(t)=apply(expand.grid(names(grlA), names(grlB)), 1, paste, collapse="_")
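The order of these names follows from how expand.grid enumerates combinations: its first argument varies fastest. A minimal base-R sketch, using plain character vectors in place of names(grlA) and names(grlB) (the names `combos` and `combo_names` are illustrative and not part of the code above):

```r
# Illustrative only: character vectors stand in for names(grlA) and names(grlB).
combos <- expand.grid(A = c("a", "b"), B = c("c", "d"), stringsAsFactors = FALSE)

# Join each row into one name, as apply(..., 1, paste, collapse = "_") does.
combo_names <- paste(combos$A, combos$B, sep = "_")
print(combo_names)
# [1] "a_c" "b_c" "a_d" "b_d"
```

So all results for "c" come first, followed by all results for "d".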
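However, to get what I actually want (a list of data frames), I then need to use grep to pick out the results that belong to each element of grlB and save them in a separate list, and this is really slow.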
new=list()
for (i in names(grlB)) {
df = as.data.frame(t[grep(i,names(t))])
new[[length(new)+1]] <- df
}
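Is there another way to do this without grep? Thanks!
This data should not be kept in a list structure, because it has a predictable and consistent structure. I would put it into a data frame and then reshape it into roughly the desired format.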
library(dplyr)
library(tidyr)
t %>%
as.data.frame %>%
mutate(ID = 1:n()) %>%
gather(variable, value, -ID) %>%
separate(variable, c("A", "B")) %>%
spread(ID, value) %>%
group_by(B) %>%
do(result = my_function(.))  # my_function is not defined here; it stands for your own per-group processing
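If the goal is only to avoid grep, another option is to build the nested result directly instead of flattening every combination and regrouping by name. The following is a sketch rather than a benchmarked solution. It assumes, as the as.data.frame call in the question already does, that every element of grlA contains the same number of ranges. The names `count_by_b` and `result` are made up for illustration.

```r
library(GenomicRanges)

# Same example data as in the question.
grlA <- GRangesList(
  a = GRanges(seqnames = c("chr1", "chr2"), ranges = IRanges(c(7, 13), width = 3), strand = c("+", "-")),
  b = GRanges(seqnames = c("chr1", "chr3"), ranges = IRanges(c(5, 13), width = 3), strand = c("+", "-"))
)
grlB <- GRangesList(
  c = GRanges(seqnames = c("chr1", "chr2"), ranges = IRanges(c(1, 13), width = 3), strand = c("+", "-")),
  d = GRanges(seqnames = c("chr1", "chr3"), ranges = IRanges(c(3, 13), width = 3), strand = c("+", "-"))
)

# Hypothetical helper: for one element of grlB (by name), count overlaps
# against every element of grlA and bind the counts as data frame columns.
count_by_b <- function(j) {
  counts <- sapply(names(grlA),
                   function(i) countOverlaps(grlA[[i]], grlB[[j]]),
                   simplify = FALSE)
  as.data.frame(counts)
}

# One data frame per element of grlB, named after it; no grep needed.
result <- sapply(names(grlB), count_by_b, simplify = FALSE)
print(result)
```

Because the outer loop runs over names(grlB), each data frame is assembled in place and the combined "a_c"-style names never have to be parsed.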