Using the apply family in R to pass each row of a 2-column dataframe to a function I made?

I have a small data frame with two columns, fp (false positive rate) and fn (false negative rate), like so:

falsepos <- c(.05, .25, .5)
falseneg <- c(.01, .05, .1)
x_name <- "fp"
y_name <- "fn"

df <- data.frame(falsepos,falseneg)
names(df) <- c(x_name, y_name)

I've also written a little adaptation of Bayes's Rule as a function, like so:

bayesrule <- function(baserate = .03, 
                      fp, 
                      fn) {
    output <- (baserate * (1 - fn)) / ((baserate * (1 - fn)) + ((1 - baserate) * (fp)))

    return(output)
}

fp and fn mean the same thing here as they do in df. In bayesrule, I've given baserate a default value of .03. My question is: how can I write some R code (likely using the apply family of functions, I'm guessing, but maybe something else) to pass each row's fp and fn values from df into the corresponding arguments of bayesrule, yielding three Bayes's Rule calculations, each with the same default baserate of .03?

I've looked at similar posts in SX and have gotten quite close, but I'm just shy of the mark on this. I've gotten as close as this:

sapply(df,FUN = bayesrule,fn=df$fn, fp=df$fp)

But no closer.

Generally, if a function is not vectorized and depends on several arguments of length > 1, we can use Map / mapply, which call the function once per set of corresponding elements:

unlist(Map(bayesrule, fn = df$fn, fp = df$fp))
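As a minimal sketch of the mapply variant mentioned above (mapply simplifies the result to a vector, so no unlist is needed), using the df and bayesrule from the question. Passing baserate through MoreArgs is an illustrative choice here, not something the original answer does; the default would be picked up without it.

```r
# Function and data as defined in the question
bayesrule <- function(baserate = .03, fp, fn) {
    (baserate * (1 - fn)) / ((baserate * (1 - fn)) + ((1 - baserate) * fp))
}
df <- data.frame(fp = c(.05, .25, .5), fn = c(.01, .05, .1))

# One call to bayesrule per (fp, fn) pair; MoreArgs holds arguments
# that stay fixed across calls (assumption: we spell out the default)
res_mapply <- mapply(bayesrule, fp = df$fp, fn = df$fn,
                     MoreArgs = list(baserate = .03))

# Same result as the Map version, which returns a list
res_map <- unlist(Map(bayesrule, fp = df$fp, fn = df$fn))
```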

Or wrap the function with Vectorize and pass the columns to it:

Vectorize(bayesrule)(fn = df$fn, fp = df$fp)
#[1] 0.37979540 0.10516605 0.05273438

Here, though, the function is already vectorized, because it uses only arithmetic operators, which are vectorized in R (as @r2evans also points out in the comments). So it can be applied to the columns directly:

with(df, bayesrule(fp=fp, fn = fn))
#[1] 0.37979540 0.10516605 0.05273438

Or with dplyr

library(dplyr)
df %>%
    mutate(new = bayesrule(fp = fp, fn = fn))

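With sapply, the loop runs over each column of df individually, not over each row. Each column is passed to bayesrule as its first unmatched argument, baserate, so the attempt in the question returns a 3 x 2 matrix rather than the three values wanted.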
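To make the column-versus-row point concrete, here is a small sketch contrasting the sapply attempt from the question with a row-wise call via apply. The anonymous function and the name row are illustrative, not from the original post, and the row-wise version is shown only for comparison; it calls bayesrule once per row.

```r
# Function and data as defined in the question
bayesrule <- function(baserate = .03, fp, fn) {
    (baserate * (1 - fn)) / ((baserate * (1 - fn)) + ((1 - baserate) * fp))
}
df <- data.frame(fp = c(.05, .25, .5), fn = c(.01, .05, .1))

# The attempt from the question: sapply visits the columns of df, and
# each column lands in the only unmatched argument, baserate
by_column <- sapply(df, FUN = bayesrule, fn = df$fn, fp = df$fp)
dim(by_column)  # 3 rows, 2 columns (one per column of df)

# Row-wise alternative: apply with MARGIN = 1 hands each row to the
# function as a named numeric vector
by_row <- apply(df, 1, function(row) bayesrule(fp = row[["fp"]], fn = row[["fn"]]))
```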

You don't need to do anything row-wise here:

bayesrule(fp=df$fp, fn=df$fn)
# [1] 0.37979540 0.10516605 0.05273438

Since all of the math inside the function is plain vectorized arithmetic, you can pass whole vectors. That is far more efficient (bayesrule is called once) than calling it once per row.
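A brief sketch of what the vectorized call does with the remaining argument: a length-one baserate is recycled against every element of fp and fn, so changing the base rate for all rows is still a single call. The value .10 is an arbitrary illustration, not from the original post.

```r
# Function and data as defined in the question
bayesrule <- function(baserate = .03, fp, fn) {
    (baserate * (1 - fn)) / ((baserate * (1 - fn)) + ((1 - baserate) * fp))
}
df <- data.frame(fp = c(.05, .25, .5), fn = c(.01, .05, .1))

# Default base rate of .03, applied to all three rows in one call
post_default <- bayesrule(fp = df$fp, fn = df$fn)

# A different scalar base rate (hypothetical value) is recycled the same way
post_10 <- bayesrule(baserate = .10, fp = df$fp, fn = df$fn)
post_10  # approximately 0.6875, 0.2969, 0.1667
```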


 