I have a data frame similar to the following, containing both NA and NaN values.
myinput <- data.frame(
  "Date" = c("20010331", "20010331", "20010331", "20010630", "20010630"),
  "A" = c(3, NA, 5, NaN, 2),
  "B" = c(4, NA, 7, NaN, 8),
  "C" = c(6, NA, 5, NaN, 7),
  "D" = c(1, NA, 3, NaN, 8)
)
I would like to replace the NA and NaN values with the column mean, and do this for every column. The mean should also be computed separately for each date. For example, the NA in column A should become the average of all column A values with date 20010331, and the NaN in column A should become the average of all column A values with date 20010630.
Is there any way to do this? Any help is very much appreciated. Thank you.
With dplyr:
library(dplyr)

myinput %>%
  group_by(Date) %>%
  # is.na() is TRUE for NaN as well as NA, so one test covers both cases
  mutate_at(vars(-group_cols()), ~ifelse(is.na(.), mean(., na.rm = TRUE), .))
# A tibble: 5 x 5
# Groups:   Date [2]
  Date         A     B     C     D
  <fct>    <dbl> <dbl> <dbl> <dbl>
1 20010331     3   4     6       1
2 20010331     4   5.5   5.5     2
3 20010331     5   7     5       3
4 20010630     2   8     7       8
5 20010630     2   8     7       8
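In dplyr 1.0.0 and later, mutate_at() is superseded by across(), so the same per-date imputation can also be sketched as below. This is only a sketch under assumptions, not the code from the answer above: it assumes dplyr >= 1.0.0 is installed, and the name `result` is introduced here purely for illustration.

```r
library(dplyr)

# Same example data as in the question
myinput <- data.frame(
  Date = c("20010331", "20010331", "20010331", "20010630", "20010630"),
  A = c(3, NA, 5, NaN, 2),
  B = c(4, NA, 7, NaN, 8),
  C = c(6, NA, 5, NaN, 7),
  D = c(1, NA, 3, NaN, 8)
)

# `result` is a hypothetical name used only in this sketch
result <- myinput %>%
  group_by(Date) %>%
  # Inside a grouped mutate(), across(everything()) skips the grouping column.
  # is.na() is TRUE for both NA and NaN, so a single test is enough.
  mutate(across(everything(), ~ ifelse(is.na(.x), mean(.x, na.rm = TRUE), .x))) %>%
  ungroup()

print(result)
```

Note that if every value of a column is missing within one date, mean(..., na.rm = TRUE) itself returns NaN, so those cells would stay NaN.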