
R: efficient way to apply a function according to the columns of a dataframe

I feel extremely stupid now but I can't come up with more than a for loop...

I have a data frame with numeric and factor columns. I simply want the numeric columns to be scaled and the factor columns to be kept as they are. For example:

> set.seed(160)
> df1 <- data.frame(as.data.frame(matrix(rnorm(8), ncol=2)), 
                    V3=factor(c("A", "A", "B", "B")))
> df1
          V1         V2 V3
1  0.6185496 -0.6410203  A
2 -0.8722777  2.6520986  A
3  0.8529240 -1.4156009  B
4  0.3678875 -1.1615607  B

I'd like to get

> df1
          V1         V2 V3
1  0.4901808 -0.2642698  A
2 -1.4493527  1.4780179  A
3  0.7950968 -0.6740765  B
4  0.1640750 -0.5396717  B

with a more efficient command than

for(i in 1:ncol(df1)) {
  if(is.factor(df1[,i])) {df1[,i] <- df1[,i]}
  else{df1[,i] <- scale(df1[,i])}
}

I tried various combinations of lapply(), sapply(), if(), ifelse() but nothing seemed to work (apply() doesn't work because the data frame gets converted to a matrix and I lose the factor/numeric structure). Any suggestions?

NB: I am not trying to apply a function based on the values in the columns but based on the type of column.
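The matrix coercion mentioned above can be seen directly. This is a minimal sketch on the example data; the variable name m is only illustrative:

```r
set.seed(160)
df1 <- data.frame(as.data.frame(matrix(rnorm(8), ncol = 2)),
                  V3 = factor(c("A", "A", "B", "B")))

# apply() converts a data frame with as.matrix() before looping.
# A matrix can hold only one type, so mixing numeric and factor
# columns produces a character matrix.
m <- as.matrix(df1)
typeof(m)

# Every column therefore reaches the applied function as character
apply(df1, 2, class)
```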

You can try the following, which is similar to a suggestion in the comments:

df1[sapply(df1, is.numeric)] <- scale(df1[sapply(df1, is.numeric)])
#> df1
#          V1         V2 V3
#1  0.4901808 -0.2642698  A
#2 -1.4493527  1.4780179  A
#3  0.7950968 -0.6740765  B
#4  0.1640750 -0.5396717  B
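A slightly expanded sketch of the same idea, computing the column index once and then checking the result. The name num_cols is only illustrative, and vapply() is swapped in for sapply() so that the index is guaranteed to be a logical vector:

```r
set.seed(160)
df1 <- data.frame(as.data.frame(matrix(rnorm(8), ncol = 2)),
                  V3 = factor(c("A", "A", "B", "B")))

# Logical vector marking which columns are numeric
num_cols <- vapply(df1, is.numeric, logical(1))

# scale() centers each column to mean 0 and rescales it to sd 1.
# It returns a matrix, which is assigned back to the selected columns.
df1[num_cols] <- scale(df1[num_cols])

# Inspect the column types: V3 should still be a factor
str(df1)
```

Computing the index once avoids running sapply() twice and makes it easy to reuse the same selection later.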

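This should work too. Use lapply() rather than sapply() here: sapply() would simplify the result to a single matrix, and the factor column would not survive the assignment. Wrapping scale() in as.numeric() drops the matrix attributes it returns.

df1[] <- lapply(df1, function(i) if(is.numeric(i)) as.numeric(scale(i)) else i)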
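One way to check whether the column types survive is to compare what sapply() and lapply() return on the same data. This is only a sketch; scale_if_numeric is a hypothetical helper name, not something from the posts above:

```r
set.seed(160)
df1 <- data.frame(as.data.frame(matrix(rnorm(8), ncol = 2)),
                  V3 = factor(c("A", "A", "B", "B")))

# Hypothetical helper: scale numeric vectors, return anything else unchanged
scale_if_numeric <- function(x) if (is.numeric(x)) as.numeric(scale(x)) else x

# sapply() simplifies equal-length results into one matrix,
# so the columns can no longer keep separate types
m <- sapply(df1, scale_if_numeric)
is.matrix(m)

# lapply() returns a list with one element per column, each keeping
# its own type; df2[] <- ... preserves the data frame structure
df2 <- df1
df2[] <- lapply(df1, scale_if_numeric)
sapply(df2, class)
```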
