
In R, how do I apply a function that takes a vector as input, where the vector must be built from multiple columns of a data frame?

I suspect this is an obvious thing to do, but I can't figure out how to do it, or find any information that fits my case in this community or elsewhere.

vs is the data frame containing the variables:

cs <- seq(0,1,0.2)
vs <- expand.grid(cs,cs,cs)

[BTW, already at this stage I have a doubt: can the expand.grid call be written more compactly, given that I pass the same vector cs three times? But OK, not the main point.]

fn is an example of the function taking a vector as input (the actual one is much more complicated, and the vector has length 16):

fn <- function(p) {(p[1]+exp(p[2])+p[3]^2)/(sum(exp(p)))}

Now I want to apply fn to vs, building the vector passed to fn from each row of vs.
For one row this is obvious:

fn(c(vs[1,1],vs[1,2],vs[1,3]))
[1] 0.3333333

But what if I want to do it automatically for all the rows in vs?

After consulting the documentation, do.call seemed the obvious choice, and indeed when I used the example function (paste), it worked:

head(do.call(paste,vs))
[1] "0 0 0"   "0.2 0 0" "0.4 0 0"
[4] "0.6 0 0" "0.8 0 0" "1 0 0" 

Of course it did not work for fn, but I figured this was because paste takes n arguments, whereas my function takes one argument.

That's where I got stuck. I thought I could make a new column containing the vectors built from the 3 columns of vs, and then just run fn on it. But I don't know how to do that. I tried apply, and do.call with c or with list, but to no avail.

Any suggestions?

Thanks!

We can use apply with MARGIN = 1 to loop over the rows, so that each row is passed to fn as a vector:

out1 <- apply(vs, 1, fn)

do.call will not work on fn directly: it passes each column of vs as a separate argument, whereas fn accepts only a single argument.
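For completeness, a do.call-style call can still be made to work if the columns are routed through a small wrapper that collects its arguments back into one vector. The sketch below is only an illustration under that assumption: fn_args, out_mapply and out_apply are hypothetical names that do not appear in the original post, and the setup lines are copied from the question. It may be handy when the real function takes a long vector (16 elements in the question) and rewriting its signature is not attractive.

```r
# Setup copied from the question
cs <- seq(0, 1, 0.2)
vs <- expand.grid(cs, cs, cs)
fn <- function(p) {(p[1]+exp(p[2])+p[3]^2)/(sum(exp(p)))}

# Hypothetical wrapper (not in the original post): gathers all of its
# arguments into a single vector and forwards that vector to fn.
fn_args <- function(...) fn(c(...))

# Hand the columns of vs to mapply as separate, unnamed arguments;
# mapply then calls fn_args once per row.
out_mapply <- do.call(mapply, c(list(FUN = fn_args), unname(as.list(vs))))

# The same computation with apply over rows, for comparison
out_apply <- apply(vs, 1, fn)

# Compare the values only, ignoring any names
all.equal(unname(out_mapply), unname(out_apply))
```

The wrapper works for any number of columns, so nothing changes if vs has 16 columns instead of 3.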


Another option is pmap from purrr, after rewriting the function to take three separate arguments instead of one vector:

library(purrr)
fn1 <- function(p1, p2, p3) {(p1+exp(p2)+p3^2)/(sum(exp(c(p1, p2, p3))))}
out2 <- pmap_dbl(setNames(vs, c('p1', 'p2', 'p3')), fn1)

A third option is to write the function so that it takes the whole data frame as input and works on its columns, which makes it vectorised over the rows:

fn2 <- function(dat) {(dat[[1]] + exp(dat[[2]]) + dat[[3]]^2)/rowSums(exp(dat))}
out3 <- fn2(vs)

identical(out1, out2)
#[1] TRUE
identical(out1, out3)
#[1] TRUE

Benchmarks

library(microbenchmark)
microbenchmark(apply = apply(vs, 1, fn), 
       pmap = pmap_dbl(setNames(vs, c('p1', 'p2', 'p3')), fn1), 
       colWise = fn2(vs), 
       unit = 'relative', times = 10L)
# Unit: relative
#   expr      min       lq     mean   median       uq      max neval cld
#   apply 5.148858 5.137562 3.890306 4.237701 2.901886 3.222972    10   c
#    pmap 2.837697 2.752563 2.189115 2.269152 1.963476 1.889337    10  b 
# colWise 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000    10 a  

As for writing the expand.grid call more compactly when the same 'cs' vector is repeated n times, we can pass a list built with rep:

n <- 3
vs <- expand.grid(rep(list(cs), n))
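
As a quick sanity check, the compact form can be compared against the explicit three-argument call from the question. This is a minimal sketch; vs_compact and vs_explicit are hypothetical names introduced here only for the comparison.

```r
cs <- seq(0, 1, 0.2)
n <- 3

# Compact form: a list holding n copies of cs
vs_compact <- expand.grid(rep(list(cs), n))

# Explicit form from the question
vs_explicit <- expand.grid(cs, cs, cs)

dim(vs_compact)                  # length(cs)^n rows, n columns
all(vs_compact == vs_explicit)   # element-wise comparison of the two grids
```

Because the list is built programmatically, changing n (for example to 16) is enough to get a wider grid, although the number of rows grows as length(cs)^n.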

