
How to dynamically “mutate” a data frame in R?

Suppose I want to log-transform several columns in a data frame (say, the iris data) and dynamically create a new column with the suffix _log for each of them.

What I am trying to achieve is:

df$Sepal.Length_log <- log (df$Sepal.Length)
df$Sepal.Width_log <- log (df$Sepal.Width)
df$Petal.Length_log <- log (df$Petal.Length)
df$Petal.Width_log <- log (df$Petal.Width)

This becomes tedious when the data have many columns to transform, so I want to do it dynamically using a loop and the mutate function of the dplyr package. My unsuccessful, naive attempt was:

library (dplyr)
data(iris)
varLabel <- c('Sepal.Length','Sepal.Width','Petal.Length','Petal.Width')
for (i in 1:length (varLabel)) {
  varNew <- paste (varLabel[i],'log',sep='_')
  iris <- dplyr::mutate (iris,varNew=log (varLabel[i])) # problem arises here
}

I get this error: Error: non-numeric argument to mathematical function

I searched for a solution and the most relevant one seems to be this tutorial on standard and non-standard evaluation, this post and that one also, but I couldn't figure out how to borrow a solution from there. Any help would be much appreciated.

Note:
I want to have both old and new columns in the data set.

A solution with data.table:

library(data.table)
data(iris)
DT <- as.data.table(iris)
varLabel <- c('Sepal.Length','Sepal.Width','Petal.Length','Petal.Width')

NewColumn <- paste0(varLabel, "_log")

DT[, (NewColumn) := lapply(.SD, log), .SDcols = varLabel]

DT
#>      Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
#>   1:          5.1         3.5          1.4         0.2    setosa
#>   2:          4.9         3.0          1.4         0.2    setosa
#>   3:          4.7         3.2          1.3         0.2    setosa
#>   4:          4.6         3.1          1.5         0.2    setosa
#>   5:          5.0         3.6          1.4         0.2    setosa
#>  ---                                                            
#> 146:          6.7         3.0          5.2         2.3 virginica
#> 147:          6.3         2.5          5.0         1.9 virginica
#> 148:          6.5         3.0          5.2         2.0 virginica
#> 149:          6.2         3.4          5.4         2.3 virginica
#> 150:          5.9         3.0          5.1         1.8 virginica
#>      Sepal.Length_log Sepal.Width_log Petal.Length_log Petal.Width_log
#>   1:         1.629241       1.2527630        0.3364722      -1.6094379
#>   2:         1.589235       1.0986123        0.3364722      -1.6094379
#>   3:         1.547563       1.1631508        0.2623643      -1.6094379
#>   4:         1.526056       1.1314021        0.4054651      -1.6094379
#>   5:         1.609438       1.2809338        0.3364722      -1.6094379
#>  ---                                                                  
#> 146:         1.902108       1.0986123        1.6486586       0.8329091
#> 147:         1.840550       0.9162907        1.6094379       0.6418539
#> 148:         1.871802       1.0986123        1.6486586       0.6931472
#> 149:         1.824549       1.2237754        1.6863990       0.8329091
#> 150:         1.774952       1.0986123        1.6292405       0.5877867

A short solution with dplyr and mutate_each_. Give the vector of variables the new column names as its names, so that the original columns are kept alongside the transformed ones:

library(dplyr)
data(iris)
varLabel <- c('Sepal.Length','Sepal.Width','Petal.Length','Petal.Width')
names(varLabel) <- paste0(varLabel,'_log')

res <- iris %>% mutate_each_(funs(log(.)), vars = varLabel)
head(res)
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> 1          5.1         3.5          1.4         0.2  setosa
#> 2          4.9         3.0          1.4         0.2  setosa
#> 3          4.7         3.2          1.3         0.2  setosa
#> 4          4.6         3.1          1.5         0.2  setosa
#> 5          5.0         3.6          1.4         0.2  setosa
#> 6          5.4         3.9          1.7         0.4  setosa
#>   Sepal.Length_log Sepal.Width_log Petal.Length_log Petal.Width_log
#> 1         1.629241        1.252763        0.3364722      -1.6094379
#> 2         1.589235        1.098612        0.3364722      -1.6094379
#> 3         1.547563        1.163151        0.2623643      -1.6094379
#> 4         1.526056        1.131402        0.4054651      -1.6094379
#> 5         1.609438        1.280934        0.3364722      -1.6094379
#> 6         1.686399        1.360977        0.5306283      -0.9162907
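
In later dplyr releases mutate_each_() and funs() are deprecated, so the call above may warn or fail depending on the installed version. As a rough sketch of the same idea, assuming dplyr 1.0.0 or newer is available, across() with a .names template can create the suffixed columns while keeping the originals. The object name res is only illustrative.

```r
library(dplyr)

varLabel <- c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")

# all_of() selects the columns named in the character vector;
# .names builds each new column name from the original one,
# so the untransformed columns stay in place.
res <- datasets::iris %>%
  mutate(across(all_of(varLabel), log, .names = "{.col}_log"))

head(res)
```

The new columns are appended after the existing ones, in the order given by varLabel.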

Try this

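# varLabel as defined in the question
varLabel <- c('Sepal.Length','Sepal.Width','Petal.Length','Petal.Width')
# log each selected column, then name the results with the requested _log suffix
logiris <- data.frame(lapply(varLabel, function(x) log(iris[, x])))
names(logiris) <- paste0(varLabel, "_log")
iris <- cbind(iris, logiris)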

We can use mutate_each

nm1 <- paste0("varNew_", varLabel)
res <- iris %>%
  mutate_each_(funs(log(.)), varLabel) %>%
  setNames(., c(nm1, setdiff(names(.), varLabel))) %>%
  bind_cols(iris[intersect(names(iris), varLabel)], .)

head(res,2)
#Source: local data frame [2 x 9]

#  Sepal.Length Sepal.Width Petal.Length Petal.Width varNew_Sepal.Length varNew_Sepal.Width varNew_Petal.Length varNew_Petal.Width Species
#         (dbl)       (dbl)        (dbl)       (dbl)               (dbl)              (dbl)               (dbl)              (dbl)  (fctr)
#1          5.1         3.5          1.4         0.2            1.629241           1.252763           0.3364722          -1.609438  setosa
#2          4.9         3.0          1.4         0.2            1.589235           1.098612           0.3364722          -1.609438  setosa

If the OP is looking for a base R solution, this also works:

iris[nm1] <- log(iris[varLabel])
head(iris,2)
# Sepal.Length Sepal.Width Petal.Length Petal.Width Species varNew_Sepal.Length
#1          5.1         3.5          1.4         0.2  setosa            1.629241
#2          4.9         3.0          1.4         0.2  setosa            1.589235
#  varNew_Sepal.Width varNew_Petal.Length varNew_Petal.Width
#1           1.252763           0.3364722          -1.609438
#2           1.098612           0.3364722          -1.609438
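
The loop from the question fails because log() receives the column name as a string, and varNew on the left of = is taken literally as the new column's name. As a minimal sketch of how the loop itself could be repaired, the version below looks columns up by name with [[ ]]. The second part, guarded by requireNamespace(), assumes a dplyr version with tidy evaluation support (0.7.0 or newer). The names df and df2 are illustrative and not from the original post.

```r
varLabel <- c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")

# Base R: [[ ]] accepts a string, for both reading and creating a column.
df <- datasets::iris
for (v in varLabel) {
  varNew <- paste(v, "log", sep = "_")
  df[[varNew]] <- log(df[[v]])
}

# dplyr variant: !! with := uses the string in varNew as the new column name,
# and .data[[v]] fetches the existing column whose name is stored in v.
if (requireNamespace("dplyr", quietly = TRUE)) {
  library(dplyr)
  df2 <- datasets::iris
  for (v in varLabel) {
    varNew <- paste(v, "log", sep = "_")
    df2 <- mutate(df2, !!varNew := log(.data[[v]]))
  }
}

head(df, 2)
```

Both loops keep the original columns and append one _log column per entry in varLabel.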

