I am trying to replicate a result of dplyr::mutate_at() using base R. I am fairly new to writing functions myself, and I have two questions: (a) is the function I came up with reasonable, and (b) how can I put the cbind() call inside the function and still keep all variables from the diamonds dataset?
First, the dplyr::mutate_at() call:
require(tidyverse)
diamonds %>%
mutate_at(.funs = funs(relative = ./price), .vars = c("x", "y", "z"))
# A tibble: 53,940 x 13
#carat cut color clarity depth table price x y z x_relative y_relative z_relative
#<dbl> <ord> <ord> <ord> <dbl> <dbl> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#1 0.23 Ideal E SI2 61.5 55 326 3.95 3.98 2.43 0.0121 0.0122 0.00745
#2 0.21 Premium E SI1 59.8 61 326 3.89 3.84 2.31 0.0119 0.0118 0.00709
#3 0.23 Good E VS1 56.9 65 327 4.05 4.07 2.31 0.0124 0.0124 0.00706
#4 0.290 Premium I VS2 62.4 58 334 4.2 4.23 2.63 0.0126 0.0127 0.00787
#5 0.31 Good J SI2 63.3 58 335 4.34 4.35 2.75 0.0130 0.0130 0.00821
#6 0.24 Very Good J VVS2 62.8 57 336 3.94 3.96 2.48 0.0117 0.0118 0.00738
#7 0.24 Very Good I VVS1 62.3 57 336 3.95 3.98 2.47 0.0118 0.0118 0.00735
#8 0.26 Very Good H SI1 61.9 55 337 4.07 4.11 2.53 0.0121 0.0122 0.00751
#9 0.22 Fair E VS2 65.1 61 337 3.87 3.78 2.49 0.0115 0.0112 0.00739
#10 0.23 Very Good H VS1 59.4 61 338 4 4.05 2.39 0.0118 0.0120 0.00707
# ... with 53,930 more rows
This is the function I came up with to replicate the result in base R:
rel_fun <- function(x, y){
  out <- x / y
  colnames(out) <- paste(colnames(x), "relative", sep = "_")
  out
}
And here is the result:
df_out <- rel_fun(diamonds[c("x", "y", "z")], diamonds$price)
df_out2 <- cbind(diamonds, df_out)
head(df_out2)
#carat cut color clarity depth table price x y z x_relative y_relative z_relative
#1 0.23 Ideal E SI2 61.5 55 326 3.95 3.98 2.43 0.01211656 0.01220859 0.007453988
#2 0.21 Premium E SI1 59.8 61 326 3.89 3.84 2.31 0.01193252 0.01177914 0.007085890
#3 0.23 Good E VS1 56.9 65 327 4.05 4.07 2.31 0.01238532 0.01244648 0.007064220
#4 0.29 Premium I VS2 62.4 58 334 4.20 4.23 2.63 0.01257485 0.01266467 0.007874251
#5 0.31 Good J SI2 63.3 58 335 4.34 4.35 2.75 0.01295522 0.01298507 0.008208955
#6 0.24 Very Good J VVS2 62.8 57 336 3.94 3.96 2.48 0.01172619 0.01178571 0.007380952
It all works fine, I would say, but as I mentioned above, how can I keep all variables of the diamonds dataset while having cbind() in the function?
I tried the following, but I don't get the other variables of the diamonds dataset because I didn't pass them to the function. I only passed the ones I needed for the calculation, i.e. diamonds[c("x", "y", "z")]. Is there a way to add something to the function that allows me to keep the other variables of the original dataset?
rel_fun <- function(x, y){
  out <- x / y
  colnames(out) <- paste(colnames(x), "relative", sep = "_")
  out2 <- cbind(x, out)
  out2
}
df_out3 <- rel_fun(diamonds[c("x", "y", "z")], diamonds$price)
head(df_out3)
# x y z x_relative y_relative z_relative
#1 3.95 3.98 2.43 0.01211656 0.01220859 0.007453988
#2 3.89 3.84 2.31 0.01193252 0.01177914 0.007085890
#3 4.05 4.07 2.31 0.01238532 0.01244648 0.007064220
#4 4.20 4.23 2.63 0.01257485 0.01266467 0.007874251
#5 4.34 4.35 2.75 0.01295522 0.01298507 0.008208955
#6 3.94 3.96 2.48 0.01172619 0.01178571 0.007380952
The pipe operator %>% implicitly passes your data frame diamonds as the first argument to mutate_at(). To mimic its behavior, you need to do the same with your function. Because you will be passing the entire data frame to the function, you can also just pass the column names as x:
rel_fun <- function(.data, x, y){
  out <- .data[x] / y
  colnames(out) <- paste(x, "relative", sep = "_")
  out2 <- cbind(.data, out)
  out2
}
rel_fun(diamonds, c("x", "y", "z"), diamonds$price) # Works as desired
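As a possible refinement, and only a sketch: since the whole data frame is already passed in, the denominator could also be given by column name instead of as a separate vector. The names rel_fun2, cols, by and suffix below are hypothetical, and the small toy data frame is made up so that the example runs without loading diamonds.

```r
# Hypothetical variant of rel_fun(): the denominator is a column name
rel_fun2 <- function(.data, cols, by, suffix = "relative") {
  # Divide each selected column by the denominator column, row by row
  out <- .data[cols] / .data[[by]]
  # Name the new columns <original>_<suffix>, e.g. x_relative
  names(out) <- paste(cols, suffix, sep = "_")
  # Keep every original column and append the new ones
  cbind(.data, out)
}

# Made-up stand-in for diamonds
toy <- data.frame(price = c(10, 20), x = c(1, 4), y = c(2, 8))
res <- rel_fun2(toy, c("x", "y"), "price")
names(res)
```

Because the data frame is the first argument, a call such as diamonds %>% rel_fun2(c("x", "y", "z"), "price") should also work in a pipeline, assuming the function is defined as above.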