Replace NA and NaN with column mean across multiple columns

Question

I have a data frame similar to the following, containing both NA and NaN values.

myinput <- data.frame("Date" = c("20010331", "20010331", "20010331", "20010630", "20010630"), "A" = c(3, NA, 5, NaN, 2), "B" = c(4, NA, 7, NaN, 8), "C" = c(6, NA, 5, NaN, 7), "D" = c(1, NA, 3, NaN, 8))

I would like to replace the NA and NaN values with the column mean, and apply this to all columns. I would also like to compute the mean separately for each date. For example, the NA in column A would be replaced by the average of all column A values with date 20010331, and the NaN in column A by the average of all column A values with date 20010630.

Is there any way to do this? Any help is very much appreciated. Thank you.

Answer 1

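With dplyr:

library(dplyr)

# Within each Date group, replace NA/NaN with that group's column mean.
# is.na() is also TRUE for NaN, so a separate is.nan() check is not needed.
myinput %>%
  group_by(Date) %>%
  mutate_at(vars(-group_cols()), ~ifelse(is.na(.), mean(., na.rm = TRUE), .))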
# A tibble: 5 x 5
# Groups:   Date [2]
  Date         A     B     C     D
  <fct>    <dbl> <dbl> <dbl> <dbl>
1 20010331     3   4     6       1
2 20010331     4   5.5   5.5     2
3 20010331     5   7     5       3
4 20010630     2   8     7       8
5 20010630     2   8     7       8
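
In dplyr 1.0.0 and later, mutate_at() is superseded by across(). The following is a minimal sketch of the same idea in that style, assuming a recent dplyr is installed; the name result is only illustrative and not part of the original answer. In a grouped mutate(), across(everything()) skips the grouping column, so Date is left alone.

```r
library(dplyr)

myinput <- data.frame(
  Date = c("20010331", "20010331", "20010331", "20010630", "20010630"),
  A = c(3, NA, 5, NaN, 2),
  B = c(4, NA, 7, NaN, 8),
  C = c(6, NA, 5, NaN, 7),
  D = c(1, NA, 3, NaN, 8)
)

# For each Date group, fill missing values (NA or NaN) in every
# non-grouping column with the mean of that group's observed values.
result <- myinput %>%
  group_by(Date) %>%
  mutate(across(everything(),
                ~ ifelse(is.na(.x), mean(.x, na.rm = TRUE), .x))) %>%
  ungroup()

print(result)
```

Note that if every value of a column is missing within one date, mean(..., na.rm = TRUE) returns NaN, so those cells stay missing.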

Answer 2

Since you can get the same result with data.table, you can see how to do it here.
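
The link from this answer is not preserved, so the exact code it pointed to is unknown. As a rough sketch of one way to do this with data.table (the names dt and value_cols are assumptions for illustration), the per-date replacement could look like this:

```r
library(data.table)

myinput <- data.frame(
  Date = c("20010331", "20010331", "20010331", "20010630", "20010630"),
  A = c(3, NA, 5, NaN, 2),
  B = c(4, NA, 7, NaN, 8),
  C = c(6, NA, 5, NaN, 7),
  D = c(1, NA, 3, NaN, 8)
)

dt <- as.data.table(myinput)

# Every column except the grouping column gets the replacement.
value_cols <- setdiff(names(dt), "Date")

# Update by reference: within each Date group, overwrite NA/NaN entries
# with the mean of that group's non-missing values. Row order is kept.
dt[, (value_cols) := lapply(.SD, function(x) {
  replace(x, is.na(x), mean(x, na.rm = TRUE))
}), by = Date, .SDcols = value_cols]

print(dt)
```

Because := modifies dt in place, the original myinput data frame is untouched; as.data.table() makes a copy first.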
