
Automating a normal transformation function in R over multiple columns

I have a data frame m with:

>m

id  w   y   z
1   2   5   8
2   18  5   98
3   1   25  5
4   52  25  8
5   5   5   4
6   3   3   5

Below is a general expression for normally transforming a variable x, which I need to apply to columns w, y, and z.

y<-qnorm((rank(x,na.last="keep")-0.5)/sum(!is.na(x)))

For example, if I wanted to run this function on "column w" to get the output column appended to dataframe "m" then:

m$w_n<-qnorm((rank(m$w,na.last="keep")-0.5)/sum(!is.na(m$w)))

Can someone help me automate this to run on multiple columns in data frame m? Ideally, I would want an output data frame with the following columns:

id  w   y   z   w_n  y_n  z_n

Note that this is a sample data frame; the one I have is much larger, and it has more letter columns to run this function on than just w, y, and z. Thanks!

There is probably a way to do it in a single step, but what about:

df <- data.frame(id = 1:6, w = c(39, 20, 43, 4, 36, 27), z = c(40, 26, 11, 37, 24, 14))

df
  id  w  z
1  1 39 40
2  2 20 26
3  3 43 11
4  4  4 37
5  5 36 24
6  6 27 14

transCols <- function(x) qnorm((rank(x,na.last="keep")-0.5)/sum(!is.na(x)))
tmpdf <- lapply(df[, -1], transCols)
names(tmpdf) <- paste0(names(tmpdf), "_n")
df_final <- cbind(df, tmpdf)
df_final
  id  w  z        w_n        z_n
1  1 39 40  0.6744898  1.3829941
2  2 20 26 -0.6744898  0.2104284
3  3 43 11  1.3829941 -1.3829941
4  4  4 37 -1.3829941  0.6744898
5  5 36 24  0.2104284 -0.2104284
6  6 27 14 -0.2104284 -0.6744898
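
For what it's worth, the same idea can be written as a single assignment in base R. The following is only a sketch against the sample data frame m from the question: the cols vector and the choice to transform every column except id are assumptions, so adjust the selection to whichever columns you actually need.

```r
# Sample data frame from the question
m <- data.frame(
  id = 1:6,
  w  = c(2, 18, 1, 52, 5, 3),
  y  = c(5, 5, 25, 25, 5, 3),
  z  = c(8, 98, 5, 8, 4, 5)
)

# Rank-based normal transformation, same formula as in the question
transCols <- function(x) {
  qnorm((rank(x, na.last = "keep") - 0.5) / sum(!is.na(x)))
}

# Assumption: every column except id should be transformed
cols <- setdiff(names(m), "id")

# Transform the selected columns and append them with an "_n" suffix
m[paste0(cols, "_n")] <- lapply(m[cols], transCols)

names(m)
# "id" "w" "y" "z" "w_n" "y_n" "z_n"
```

Because rank() uses ties.method = "average" by default, tied values (such as the repeated 5s and 25s in column y) receive the same transformed score, and with na.last = "keep" any NA in a column stays NA in the corresponding _n column.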
