How to fill in missing values of a data.frame in R?
I have multiple columns that have missing values. I want to fill in the missing data for each column using the mean of the same day across all years. For example, DF is my fake data, in which two columns (A and X) contain missing values:
library(lubridate)
library(tidyverse)
library(naniar)

set.seed(123)
# Three years of daily data; every 2 in A and every 5 in X is turned into NA
DF <- data.frame(Date = seq(as.Date("1985-01-01"), to = as.Date("1987-12-31"), by = "day"),
                 A = sample(1:10, 1095, replace = TRUE),
                 X = sample(5:15, 1095, replace = TRUE)) %>%
  replace_with_na(replace = list(A = 2, X = 5))
To fill in column A, I use the following code:
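Fill_DF_A <- DF %>%
  mutate(Year = year(Date), Month = month(Date), Day = day(Date)) %>%
  # Group by Month and Day so each group holds the same calendar day across all years
  group_by(Month, Day) %>%
  mutate(A = ifelse(is.na(A), mean(A, na.rm = TRUE), A)) %>%
  ungroup()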
I have many columns in my data.frame. How can I generalize this so that the missing values are filled in for all columns?
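We can use na.aggregate from zoo. By default it replaces each NA with the mean of the non-missing values, and inside a grouped mutate it does so within each group: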
library(dplyr)
library(lubridate)
library(zoo)

DF %>%
  mutate(Year = year(Date), Month = month(Date), Day = day(Date)) %>%
  # Same calendar day across all years
  group_by(Month, Day) %>%
  mutate(across(A:X, na.aggregate)) %>%
  ungroup()
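A range such as A:X only works when the columns to fill sit next to each other. As a rough sketch of one way to cover many columns without naming them, the example below selects every numeric column with where(is.numeric). The data frame `wide` and the helper `fill_by_day` are made up for illustration and are not part of the question; the sketch also assumes that every numeric column should be filled.

```r
library(dplyr)

# Hypothetical data: one calendar day (March 1) observed in three years
wide <- data.frame(
  Date = as.Date(c("1985-03-01", "1986-03-01", "1987-03-01")),
  A = c(1, NA, 3),
  B = c(NA, 4, 6),
  C = c(10, 20, NA)
)

# Hypothetical helper: fill NAs in all numeric columns with the mean
# of the same month/day across years
fill_by_day <- function(df) {
  df %>%
    # Grouping keys are created inside group_by(); across() skips grouping columns
    group_by(Month = format(Date, "%m"), Day = format(Date, "%d")) %>%
    mutate(across(where(is.numeric),
                  ~ ifelse(is.na(.x), mean(.x, na.rm = TRUE), .x))) %>%
    ungroup() %>%
    select(-Month, -Day)
}

result <- fill_by_day(wide)
```

Because the grouping keys are built inside group_by(), they never appear among the columns that across() modifies, so no helper column gets averaged by accident.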
Or, if we prefer to use a conditional statement:
DF %>%
  mutate(Year = year(Date), Month = month(Date), Day = day(Date)) %>%
  # Same calendar day across all years
  group_by(Month, Day) %>%
  mutate(across(A:X, ~ case_when(is.na(.) ~ mean(., na.rm = TRUE),
                                 TRUE ~ as.numeric(.)))) %>%
  ungroup()
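One case worth checking with any group-wise mean fill: if a given day is missing in every year, there is nothing to average. The sketch below uses a tiny made-up data frame (`toy` and its values are hypothetical, not from the question), grouped by calendar month and day, to show what the fill returns in that situation.

```r
library(dplyr)

# Hypothetical data: Jan 1 and Jan 2 in three years; A is missing on every Jan 2
toy <- data.frame(
  Date = as.Date(c("1985-01-01", "1986-01-01", "1987-01-01",
                   "1985-01-02", "1986-01-02", "1987-01-02")),
  A = c(4, NA, 8, NA, NA, NA),
  X = c(10, 12, NA, 7, NA, 9)
)

filled <- toy %>%
  mutate(Month = as.integer(format(Date, "%m")),
         Day   = as.integer(format(Date, "%d"))) %>%
  group_by(Month, Day) %>%
  mutate(across(c(A, X), ~ ifelse(is.na(.x), mean(.x, na.rm = TRUE), .x))) %>%
  ungroup()

# A on Jan 1 is filled with the mean of 4 and 8;
# A on Jan 2 has no observed values, so the group mean is NaN
filled$A
```

Where such all-missing groups can occur, a second pass (for example with a monthly mean) may be needed; which fallback is appropriate depends on the data.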