What is the correct syntax to standardize only some of the variables in the dataset (R)?

At first I tried:

Bank_sc <- preProcess(x = Bank,
                      method = c("center", "scale"), 
                      select=c(Age, Experience, Income, Family, CCAvg, Education, Mortgage))

I omitted one variable here, but it was standardized nonetheless. I cannot find any articles on the proper syntax for this, so please help.

You can use the dplyr package:

data <- data.frame(x= sample(1:100, 30), y = sample(1:100, 30), z= sample(1:100, 30))

head(data)
   x  y  z
1 26 60 16
2 38 52 51
3 12 25 13
4 32 78 54
5  6 71 59
6 10 83  3

library(dplyr)

data <- data %>% mutate_at(vars(x, y), scale)

head(data)
           x          y  z
1 -0.6630489  0.1550407 16
2 -0.2522096 -0.1088584 51
3 -1.1423613 -0.9995179 13
4 -0.4576293  0.7488137 54
5 -1.3477809  0.5179020 59
6 -1.2108345  0.9137507  3
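
On dplyr 1.0.0 or later, where mutate_at() is superseded, the same step can probably be written with across(). This is only a minimal sketch: the set.seed(1) call and the as.numeric() wrapper are additions for illustration, not part of the answer above. The wrapper is there because scale() returns a one-column matrix rather than a plain vector.

```r
library(dplyr)

set.seed(1)  # assumed seed, only to make the sketch reproducible
data <- data.frame(x = sample(1:100, 30), y = sample(1:100, 30), z = sample(1:100, 30))

# Standardize x and y only; as.numeric() drops the matrix structure
# that scale() would otherwise leave on each column. z is not selected.
scaled <- data %>% mutate(across(c(x, y), ~ as.numeric(scale(.x))))

head(scaled)
```

After this, x and y should each have mean 0 and standard deviation 1, while z keeps its original values.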

Just scale the subset. No package needed. Example:

to_scale <- c('X2', 'X3')
dat[to_scale] <- scale(dat[to_scale])
dat
#         X1         X2         X3
# 1 19.14806  0.7135746  0.8318253
# 2 19.37075 -1.8396140 -0.9339183
# 3 12.86140  0.3759499 -0.3961600
# 4 18.30448  0.5798603  0.8457129
# 5 16.41746 -0.4692167  0.9450476
# 6 15.19096  0.6394459 -1.2925074

Data:

set.seed(42)
dat <- data.frame(matrix(runif(30, 10, 20), 6, 3))
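
Applied to the column names from the question, the same idea might look like the sketch below. The real Bank data is not shown, so the data frame here is simulated, and `Other` is a hypothetical name standing in for the variable that should stay unscaled.

```r
# Simulated stand-in for the question's Bank data (the real data is not shown);
# `Other` is a hypothetical name for the column that should stay unscaled.
set.seed(42)
cols <- c("Age", "Experience", "Income", "Family", "CCAvg", "Education", "Mortgage")
Bank <- as.data.frame(matrix(runif(10 * length(cols), 1, 100), nrow = 10,
                             dimnames = list(NULL, cols)))
Bank$Other <- runif(10, 1, 100)

Bank_sc <- Bank
Bank_sc[cols] <- scale(Bank[cols])    # center and scale only the listed columns

round(colMeans(Bank_sc[cols]), 10)    # means of the scaled columns, rounded
sapply(Bank_sc[cols], sd)             # standard deviations of the scaled columns
identical(Bank_sc$Other, Bank$Other)  # TRUE: the omitted column is untouched
```

Writing the result into a copy (`Bank_sc`) rather than overwriting `Bank` keeps the original values available for comparison.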
