I have downloaded the Bike Sharing Dataset from the UCI Machine Learning Repository and am trying to implement a multivariate linear regression in R. Here is the format of the data:
> head(data1)
  season mnth hr holiday weekday workingday weathersit temp  atemp  hum windspeed cnt
1      1    1  0       0       6          0          1 0.24 0.2879 0.81    0.0000  16
2      1    1  1       0       6          0          1 0.22 0.2727 0.80    0.0000  40
3      1    1  2       0       6          0          1 0.22 0.2727 0.80    0.0000  32
4      1    1  3       0       6          0          1 0.24 0.2879 0.75    0.0000  13
5      1    1  4       0       6          0          1 0.24 0.2879 0.75    0.0000   1
6      1    1  5       0       6          0          2 0.24 0.2576 0.75    0.0896   1
I am trying to normalize specific columns (that have not already been normalized) with the following function:
normalize <- function(x) {
  return ((x - min(x)) / (max(x) - min(x)))
}
The problem is that when I run:
dfNorm <- as.data.frame(lapply(data1["season", "mnth", "hr", "weekday", "weathersit"], normalize))
I get the following error:
Error in `[.data.frame`(data1, "season", "month", "hour", "weekday", "weathersit") : unused arguments ("weekday", "weathersit")
Why am I getting this error and how can I fix it?
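As for the cause: `[.data.frame` accepts at most a row index, a column index, and `drop`, so separate strings inside the brackets get matched to those three slots and anything beyond them is reported as unused. Wrapping the names in `c()` passes them as a single column selector instead. A minimal sketch on a made-up three-row data frame (the `toy` values and the names `err`, `cols`, and `toyNorm` are illustrative assumptions, not taken from the dataset):

```r
normalize <- function(x) {
  (x - min(x)) / (max(x) - min(x))
}

# Hypothetical stand-in for data1
toy <- data.frame(season = c(1, 2, 4), hr = c(0, 12, 23), cnt = c(16, 40, 32))

# Four separate strings: the first three fill i, j and drop,
# the fourth has nowhere to go, so R raises "unused argument"
err <- tryCatch(
  toy["season", "hr", "cnt", "season"],
  error = function(e) conditionMessage(e)
)
print(err)

# One character vector selects the columns by name
cols <- c("season", "hr")
toyNorm <- as.data.frame(lapply(toy[cols], normalize))
print(toyNorm)
```

The same change applied to the original call would be `data1[c("season", "mnth", "hr", "weekday", "weathersit")]`.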
To modify in place, I'd use dplyr::mutate. Something like this should work:
library(dplyr)
# funs() is deprecated since dplyr 0.8.0; across() (dplyr >= 1.0.0) replaces mutate_at()
dfNorm <- data1 %>%
  mutate(across(c(season, mnth, hr, weekday, weathersit), normalize))
Simply assign the lapply result to new columns:
data1[c("season_norm", "mnth_norm", "hr_norm", "weekday_norm", "weathersit_norm")] <-
  lapply(data1[c("season", "mnth", "hr", "weekday", "weathersit")], normalize)
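For a quick check of the new-column approach, here is a small sketch on made-up data. The `toy` data frame, the `flat` column, and building the names with `paste0` are assumptions for illustration only. It also shows one caveat of the min-max formula: a column whose values are all equal gives 0/0, which R evaluates to `NaN`.

```r
normalize <- function(x) {
  (x - min(x)) / (max(x) - min(x))
}

# Hypothetical stand-in for the real data
toy <- data.frame(
  season     = c(1, 2, 4),
  weathersit = c(1, 1, 2),
  flat       = c(3, 3, 3)   # constant column, used only for the caveat below
)

cols <- c("season", "weathersit")

# Build the "_norm" names from the source names and assign the list of results
toy[paste0(cols, "_norm")] <- lapply(toy[cols], normalize)
print(toy)

# A constant column has max(x) == min(x), so every entry becomes 0/0
flat_norm <- normalize(toy$flat)
print(flat_norm)
```

Whether a constant column should map to 0, stay unchanged, or be dropped depends on the model, so the sketch leaves that choice open.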