I have a very basic problem and can't find a solution, so sorry in advance for the beginner question.
I have a data frame with several ID columns and 30 numerical columns. I want to multiply all values in those 30 columns by the same factor and keep the rest of the data frame unchanged. I figured that dplyr and transmute_all or transmute_at are my friends, but I can't find a way to express the function Column1:Column30 * factor. All the examples I found use simple functions like mean, and that doesn't help me with the expression.
I would use mutate_at. For example:
library(dplyr)
mtcars %>%
  # funs() is deprecated as of dplyr 0.8.0; a formula lambda does the same job
  mutate_at(vars(mpg:qsec),
            .funs = ~ . * 3)
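In dplyr 1.0.0 and later, mutate_at is superseded by across(). A minimal sketch of the same operation, assuming a recent dplyr is installed (the name scaled is only for illustration):

```r
library(dplyr)

# Multiply the columns mpg through qsec by 3; all other columns are kept as-is
scaled <- mtcars %>%
  mutate(across(mpg:qsec, ~ .x * 3))

head(scaled)
```

Because this uses mutate rather than transmute, the columns outside the selection stay in the result unchanged.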
I'll give a solution with data.table; the dplyr version should be close to identical.
library(data.table)
# convert to data.table format to use data.table syntax
setDT(my_df)
# .SD refers to all the columns mentioned in the .SDcols argument
# (all columns by default when this argument is not specified)
# - instead of using backticks around *, you could use quotes: "*"
my_df[ , lapply(.SD, `*`, factor), .SDcols = Column1:Column30]
On some made-up data:
set.seed(0123498)
# create fake data
DT = setDT(replicate(8, rnorm(5), simplify = FALSE))
DT
#            V1          V2         V3          V4         V5         V6        V7         V8
# 1: -0.2685077 -1.06491111  0.7307661  0.09880937  0.2791274 -0.5589676 1.5320685  0.4730013
# 2:  1.0783236 -0.17810929 -0.2578453  0.95940860  1.0990367 -0.6983235 0.9530062 -1.3800769
# 3:  1.1730611 -0.48828441 -1.6314077 -0.76117268 -0.5753245 -0.7370099 0.3982160 -0.8088035
# 4:  0.2060451 -0.07105785 -1.1878591 -0.83464592  2.1872117 -0.4390479 0.1428239  1.2634280
# 5:  1.6142695  0.46381602  0.5315299  2.34790945 -1.2977851  1.0428450 1.9292390  0.5337248
scalar = 3
DT[ , lapply(.SD, "*", scalar), .SDcols = V4:V6]
#            V4         V5        V6
# 1:  0.2964281  0.8373822 -1.676903
# 2:  2.8782258  3.2971101 -2.094970
# 3: -2.2835180 -1.7259734 -2.211030
# 4: -2.5039378  6.5616352 -1.317144
# 5:  7.0437283 -3.8933554  3.128535
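Note that the call above returns only the multiplied columns. To keep the other columns in place, as the question asks, one option is to assign by reference with :=. A minimal sketch on made-up data; the names cols and before are only for illustration:

```r
library(data.table)

set.seed(1)
DT <- setDT(replicate(8, rnorm(5), simplify = FALSE))
before <- copy(DT)  # keep an untouched copy for comparison

scalar <- 3
cols <- paste0("V", 4:6)  # columns to multiply (illustrative choice)

# (cols) on the left-hand side overwrites those columns in place;
# every other column of DT is left as it was
DT[, (cols) := lapply(.SD, `*`, scalar), .SDcols = cols]
```

After this, DT still has all eight columns, with V4 to V6 scaled.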
Maybe this will help you, using just base R:
> set.seed(100)
> df = data.frame(id=rep(1:5), val1=rnorm(5), val2=rnorm(5), val3=rnorm(5))
> df
  id        val1       val2        val3
1  1 -0.50219235  0.3186301  0.08988614
2  2  0.13153117 -0.5817907  0.09627446
3  3 -0.07891709  0.7145327 -0.20163395
4  4  0.88678481 -0.8252594  0.73984050
5  5  0.11697127 -0.3598621  0.12337950
# Multiply by 2 all columns except id column
> df[, !colnames(df) %in% c("id")] <- df[, !colnames(df) %in% c("id")] * 2
> df
  id       val1       val2       val3
1  1 -1.0043847  0.6372602  0.1797723
2  2  0.2630623 -1.1635814  0.1925489
3  3 -0.1578342  1.4290654 -0.4032679
4  4  1.7735696 -1.6505189  1.4796810
5  5  0.2339425 -0.7197243  0.2467590
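With several ID columns, as in the question, the same base R idea can be written by naming the columns to skip once. A minimal sketch with made-up data; id_cols and num_cols are hypothetical names:

```r
df <- data.frame(
  id    = 1:3,
  group = c("a", "b", "c"),
  val1  = c(1, 2, 3),
  val2  = c(10, 20, 30)
)

id_cols  <- c("id", "group")            # columns to leave untouched
num_cols <- setdiff(names(df), id_cols)  # everything else gets multiplied

df[num_cols] <- df[num_cols] * 2
```

Computing num_cols once avoids repeating the selection expression on both sides of the assignment.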
If it's all numeric columns you want to multiply (or if you can easily write a test), I'd use lapply with an is.numeric test.
Calling the data frame dd (and using iris to demonstrate):
dd = iris
dd[] = lapply(dd, FUN = function(x) if (is.numeric(x)) return(x * 2) else return(x))
This is equivalent to a simple for loop, which also works just fine.
for (i in seq_along(dd)) {
  if (is.numeric(dd[[i]])) dd[[i]] = dd[[i]] * 2
}
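One caveat with the is.numeric test: ID columns stored as numbers would be multiplied as well. A minimal sketch of one way to guard against that, using made-up data and a hypothetical id_cols vector naming the columns to leave alone:

```r
dd <- data.frame(
  id    = 1:3,
  label = c("a", "b", "c"),
  x     = c(0.5, 1, 1.5)
)

id_cols <- "id"  # numeric, but should not be scaled

# TRUE for columns that are numeric and not listed as IDs
to_scale <- vapply(dd, is.numeric, logical(1)) & !(names(dd) %in% id_cols)

dd[to_scale] <- lapply(dd[to_scale], function(col) col * 2)
```

Here only x is scaled; id and label are left as they were.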
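You could just use apply:
my_df <- data_frame(...)  # some data
# apply() coerces its input to a matrix, so every column should be numeric,
# and the result is a matrix rather than a data frame
my_scaled_df <- apply(my_df, 2, transformation_logic)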
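Since apply works on a matrix, it is safest to pass it only the numeric columns and convert the result back. A minimal sketch with made-up data; the column names and the factor of 10 are only for illustration:

```r
my_df <- data.frame(
  id = c("a", "b"),
  x  = c(1, 2),
  y  = c(3, 4)
)

num_cols <- c("x", "y")

# apply() over columns (MARGIN = 2) returns a numeric matrix here
scaled <- apply(my_df[num_cols], 2, function(col) col * 10)

# Convert back to a data frame before writing the columns into my_df
my_df[num_cols] <- as.data.frame(scaled)
```

Passing the whole data frame, including the character id column, would turn every value into a string before the function is applied.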
For this you can try:
y <- xx[-(1:2)] * 100
Here the first two columns of xx are non-numeric, so xx[-(1:2)] excludes them from the calculation.