
R apply a vector of functions to a dataframe

I am currently working on a data frame with raw numeric data in its columns. Each column contains data for one parameter (for example, expression data for gene xyz), and each row is a subject. Some columns are normally distributed, while others are far from it. I ran Shapiro-Wilk tests using apply() with MARGIN = 2 for different transformations and then picked a suitable transformation for each column by comparing shapiro.test()$p.value. I stored my picks as strings in a character vector, giving me a vector of NA, "log10" and "sqrt" entries with length ncol(DataFrame). I now wonder whether it is possible to apply this vector to the data frame via an apply-style function or, if necessary, a for loop. How do I do this, or is there a better way? I guess I could loop over if-else statements, but there has to be a more efficient way because my code is already slow.

Thanks all!

Update: I tried the code below, but it gives me the error "Error in file(filename, "r") : invalid 'description' argument".

exampleDF <- as.data.frame(matrix(c(1,2,3,4,1,10,100,1000,0.1,0.2,0.3,0.4), ncol=3, nrow = 4))

transformationVector <- c(NA, "log10", NA)

TransformedExampleDF <- apply(exampleDF, 2 , function(x) eval(parse(paste(transformationVector , "(" , x , ")" , sep = "" ))))
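A note on the error in the update: the first argument of parse() is file, so the pasted strings are most likely being treated as a file name rather than as code (code would have to be passed as text = ...). The sketch below avoids eval(parse()) altogether and pairs each column with its entry in transformationVector using base R's Map(). The helper name apply_transformations is hypothetical, and treating NA as "leave the column unchanged" is an assumption about what the question intends.

```r
exampleDF <- as.data.frame(matrix(c(1, 2, 3, 4, 1, 10, 100, 1000, 0.1, 0.2, 0.3, 0.4),
                                  ncol = 3, nrow = 4))
transformationVector <- c(NA, "log10", NA)

# Hypothetical helper: apply the i-th named transformation to the i-th column.
# Assumption: an NA entry means "leave this column unchanged".
apply_transformations <- function(df, transformations) {
  stopifnot(length(transformations) == ncol(df))
  # Map() walks over the columns and the names in parallel;
  # df[] <- ... keeps the result a data.frame with the original column names.
  df[] <- Map(function(column, fname) {
    if (is.na(fname)) column else match.fun(fname)(column)
  }, df, transformations)
  df
}

transformedExampleDF <- apply_transformations(exampleDF, transformationVector)
```

Only the second column is changed here; the first and third are returned as they were.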

So you could do something like this. In the example below, I've cooked up four example functions whose names I've then stored in the character vector my_func_list (note: the last function converts the data to NA; that is intentional).

Then I created another function, func_to_df(), that accepts the data.frame and the vector of function names (func_list) as inputs and applies each function (looked up by name with get()) to the corresponding column of the data.frame. The result is returned and, in this example, stored in the data.frame my_df1.

tl;dr: just look at what func_to_df() does. It might also be worthwhile looking into the purrr package (although it hasn't been used here).

#---------------------

#Example function 1
#Constraints such as the is.numeric() check below
#can be added to prevent inappropriate action
myaddtwo <- function(x){
  if(is.numeric(x)){
    x <- x + 2
  } else{
    warning("Input must be numeric!")
  }
  return(x)
}

#Example function 2
mymulttwo <- function(x){
  return(x*2)
}

#Example function 3
mysqrt <- function(x){
  return(sqrt(x))
}

#Example function 4
myna <- function(x){
  return(NA)
}

#---------------------

#Dummy data (sample() is not seeded, so the values differ between runs)
my_df <- data.frame(
  matrix(sample(1:100, 40, replace = TRUE), 
         nrow = 10, ncol = 4), 
  stringsAsFactors = FALSE)

#User somehow ascertains that
#the following order of functions
#is the right one to be applied to the data.frame
my_func_list <- c("myaddtwo", "mymulttwo", "mysqrt", "myna")

#---------------------

#A function which applies
#the functions named in func_list
#to the corresponding columns of df
func_to_df <- function(df, func_list){
  #seq_along() is safe even when func_list is empty
  for(i in seq_along(func_list)){
    df[, i] <- get(func_list[i])(df[, i])
    #Alternative to get()
    #df[, i] <- eval(as.name(func_list[i]))(df[, i])
  }
  return(df)
}

#---------------------

#Execution

my_df1 <- func_to_df(my_df, my_func_list)

#---------------------

#Output
my_df
#    X1 X2 X3 X4
# 1   8 85  6 41
# 2  45  7  8 65
# 3  34 80 16 89
# 4  34 62  9 31
# 5  98 47 51 99
# 6  77 28 40 72
# 7  24  7 41 46
# 8  45 80 75 30
# 9  93 25 39 72
# 10 68 64 87 47

my_df1
#     X1  X2       X3 X4
# 1   10 170 2.449490 NA
# 2   47  14 2.828427 NA
# 3   36 160 4.000000 NA
# 4   36 124 3.000000 NA
# 5  100  94 7.141428 NA
# 6   79  56 6.324555 NA
# 7   26  14 6.403124 NA
# 8   47 160 8.660254 NA
# 9   95  50 6.244998 NA
# 10  70 128 9.327379 NA

#---------------------
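As a possible variation on func_to_df(), the functions themselves can be kept in a list instead of looking them up by name, which removes the need for get(). This is only a minimal sketch: apply_function_list, demo_df and demo_funs are hypothetical names made up for illustration, and identity() stands in for "no transformation".

```r
# Hypothetical helper: funs is a list of functions, one per column of df.
apply_function_list <- function(df, funs) {
  stopifnot(length(funs) == ncol(df))
  # Call the i-th function on the i-th column and keep the data.frame shape.
  df[] <- Map(function(column, f) f(column), df, funs)
  df
}

# Small deterministic example data (made up for illustration).
demo_df <- data.frame(a = c(1, 4, 9), b = c(1, 10, 100), c = c(5, 6, 7))
demo_funs <- list(sqrt, log10, identity)

demo_out <- apply_function_list(demo_df, demo_funs)
```

Because the list holds function objects, a misspelled name fails when the list is built rather than inside the loop.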
