Apply over two data frames

I'm using R, and I have two data.frames, A and B. They both have 6 rows, but A has 25000 columns (genes), and B has 30 columns. I'd like to apply a function with two arguments, f(x, y), where x is every column of A and y is every column of B. So far it looks like this:

i = 1
for (x in A){
    j = 1
    for (y in B){
        out[i,j] <- f(x,y)
        j = j + 1
    }
    i = i + 1
}

I have two issues with this: from my Python programming I regard keeping track of counters like this as crufty, and from my R programming I am nervous of for loops. However, I can't quite see how to apply apply (or even whether I should apply apply) to this problem, and I was hoping someone might enlighten me. I need to treat f() as atomic (it's actually cor.test()) for now.

Since you are using data frames, it might be faster to use lapply or sapply to do this (especially given the size of your data frames). For example,

x <- data.frame(col1=c(1,2,3,4), col2=c(5,6,7,8), col3=c(9,10,11,12))
y <- data.frame(col1=c(1,2,3,4), col2=c(5,6,7,8))
bl <- lapply(x, function(u){
   lapply(y, function(v){
       f(u,v) # Function with column from x and column from y as inputs
   })
})
out <- matrix(unlist(bl), ncol = ncol(y), byrow = TRUE) # rows: columns of x; columns: columns of y
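
The example above leaves f undefined and only reshapes cleanly when f returns a single number. As a rough sketch, assuming f is a thin wrapper that keeps just the correlation estimate from cor.test() (the wrapper and the sample values below are made up for illustration), nested sapply calls give a named matrix without the unlist step:

```r
# Hypothetical stand-in for f: keep only the estimate from cor.test()
f <- function(u, v) unname(cor.test(u, v)$estimate)

# Made-up sample data, chosen so the columns are not perfectly correlated
x <- data.frame(col1 = c(1, 2, 3, 4), col2 = c(5, 6, 8, 7), col3 = c(9, 12, 10, 11))
y <- data.frame(col1 = c(1, 3, 2, 4), col2 = c(8, 6, 7, 5))

# The inner sapply returns a numeric vector named by y's columns;
# the outer sapply binds those vectors as columns named by x's columns
res <- sapply(x, function(u) sapply(y, function(v) f(u, v)))

# Transpose so rows correspond to columns of x, as in `out` above
out <- t(res)
out
```

If f has to return the whole cor.test() object, keeping the nested list from lapply and extracting the pieces afterwards is probably the safer route.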

Some data

nrows <- 6
A <- data.frame(a = runif(nrows), b = runif(nrows), c = runif(nrows))
B <- data.frame(z = rnorm(nrows), y = rnorm(nrows))

The trick: remember the column pairs with expand.grid

counter <- expand.grid(seq_along(A), seq_along(B))
f <- function(x) 
{
  cor.test(A[, x["Var1"]], B[, x["Var2"]])$estimate
}

Now we only need one call to apply.

stats <- apply(counter, 1, f)
names(stats) <- paste(names(A)[counter$Var1], names(B)[counter$Var2], sep = ",")
stats
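
If a matrix like the `out` in the question is wanted rather than a flat named vector, the result can be reshaped. This is only a sketch: the seed is arbitrary, and it relies on expand.grid varying its first factor fastest, which matches the column-major order in which matrix() fills its cells.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible
nrows <- 6
A <- data.frame(a = runif(nrows), b = runif(nrows), c = runif(nrows))
B <- data.frame(z = rnorm(nrows), y = rnorm(nrows))

# One row per (column of A, column of B) pair
counter <- expand.grid(seq_along(A), seq_along(B))
stats <- apply(counter, 1, function(idx) {
  cor.test(A[, idx["Var1"]], B[, idx["Var2"]])$estimate
})

# Var1 (the index into A) varies fastest, so a column-major fill puts
# the columns of A on the rows and the columns of B on the columns
out <- matrix(stats, nrow = ncol(A), ncol = ncol(B),
              dimnames = list(names(A), names(B)))
out
```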

Nesting the apply calls works, though the syntax isn't the easiest.

x<-data.frame(col1=c(1,2,3,4), col2=c(5,6,7,8), col3=c(9,10,11,12))
y<-data.frame(col1=c(1,2,3,4), col2=c(5,6,7,8))

z<-apply(x,2,function(col,df2)
             {
               apply(df2,2,function(col2,col1)
                           {
                              col2+col1
                           },col)
             },y)

z
     col1 col2 col3
[1,]    2    6   10
[2,]    4    8   12
[3,]    6   10   14
[4,]    8   12   16
[5,]    6   10   14
[6,]    8   12   16
[7,]   10   14   18
[8,]   12   16   20
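
In the output above each inner call returns a whole column, so the results for the two columns of y are stacked into 8 rows. When the inner function returns a single number instead, the same nesting gives one cell per column pair. A minimal sketch, assuming the value of interest is the p-value from cor.test() and using made-up random data with 6 rows as in the question:

```r
set.seed(42)  # arbitrary seed, only so the sketch is reproducible
A <- data.frame(a = runif(6), b = runif(6), c = runif(6))
B <- data.frame(z = rnorm(6), y = rnorm(6))

# Each inner call returns one number, so the inner apply yields a vector
# of length ncol(B), and the outer apply binds those vectors as columns
pvals <- apply(A, 2, function(col1) {
  apply(B, 2, function(col2) cor.test(col1, col2)$p.value)
})
pvals  # rows: columns of B; columns: columns of A
```

Note that apply converts each data frame to a matrix first, which is fine here because all columns are numeric.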
