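How to fill NAs with the column mean by group in R?
I have a dataset that contains several NAs. For each column, I want to compute the mean within each group (Category) and use it to fill the NAs in that group. My dataset looks like this: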
PID Category column1 column2 column3
123 1 54 2.4 NA
324 1 52 NA 21.1
356 1 NA 3.6 25.6
378 2 56 3.2 NA
395 2 NA 3.5 29.9
362 2 45 NA 24.3
789 3 65 12.6 23.8
759 3 66 NA 26.8
762 3 NA NA 27.2
741 3 69 8.5 23.3
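For anyone who wants to reproduce the problem, the table above can be read into a data frame as in the sketch below. The name `df` is only an assumption for illustration (the answers use both `df` and `df1`).

```r
# Rebuild the example data from the question; NA marks the missing values.
df <- read.table(header = TRUE, text = "
PID Category column1 column2 column3
123 1 54 2.4 NA
324 1 52 NA 21.1
356 1 NA 3.6 25.6
378 2 56 3.2 NA
395 2 NA 3.5 29.9
362 2 45 NA 24.3
789 3 65 12.6 23.8
759 3 66 NA 26.8
762 3 NA NA 27.2
741 3 69 8.5 23.3
")

# Number of missing values per column
colSums(is.na(df))
```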
The desired output is:
PID Category column1 column2 column3
123 1 54 2.4 23.3
324 1 52 3.0 21.1
356 1 53 3.6 25.6
378 2 56 3.2 27.1
395 2 50.5 3.5 29.9
362 2 45 3.35 24.3
789 3 65 12.6 23.8
759 3 66 10.5 26.8
762 3 66.6 10.5 27.2
741 3 69 8.5 23.3
Thanks.
You can use:
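library(dplyr)

# Within each Category, replace NAs in every "column*" variable
# with the mean of that group's non-missing values.
df %>%
  group_by(Category) %>%
  mutate(across(starts_with('column'),
                ~ replace(., is.na(.), mean(., na.rm = TRUE)))) %>%
  ungroup()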
# PID Category column1 column2 column3
# <int> <int> <dbl> <dbl> <dbl>
# 1 123 1 54 2.4 23.4
# 2 324 1 52 3 21.1
# 3 356 1 53 3.6 25.6
# 4 378 2 56 3.2 27.1
# 5 395 2 50.5 3.5 29.9
# 6 362 2 45 3.35 24.3
# 7 789 3 65 12.6 23.8
# 8 759 3 66 10.6 26.8
# 9 762 3 66.7 10.6 27.2
#10 741 3 69 8.5 23.3
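The same per-group replacement can also be written without any packages. The following is a minimal sketch using base R's `ave()`, not part of the original answer; the helper `fill_mean` and the result name `filled` are hypothetical names chosen for illustration, and the data frame is rebuilt here so the snippet runs on its own.

```r
# Example data from the question, with numeric columns.
df <- data.frame(
  PID      = c(123, 324, 356, 378, 395, 362, 789, 759, 762, 741),
  Category = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 3),
  column1  = c(54, 52, NA, 56, NA, 45, 65, 66, NA, 69),
  column2  = c(2.4, NA, 3.6, 3.2, 3.5, NA, 12.6, NA, NA, 8.5),
  column3  = c(NA, 21.1, 25.6, NA, 29.9, 24.3, 23.8, 26.8, 27.2, 23.3)
)

# Hypothetical helper: replace NAs in a vector with the mean of its
# non-missing values.
fill_mean <- function(v) replace(v, is.na(v), mean(v, na.rm = TRUE))

# Apply the helper within each Category, one column at a time.
cols <- grep("^column", names(df), value = TRUE)
filled <- df
filled[cols] <- lapply(df[cols], function(x) ave(x, df$Category, FUN = fill_mean))
filled
```

`ave()` applies the function to each group and returns the results in the original row order, so no re-sorting or joining is needed afterwards.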
We can use na.aggregate from zoo, which by default replaces each NA with the mean of the corresponding column:
library(dplyr)
library(zoo)

df1 %>%
  group_by(Category) %>%
  mutate(across(starts_with('column'), na.aggregate)) %>%
  ungroup()
Or use group_modify with na.aggregate, as @G. Grothendieck suggested in the comments:
df1 %>%
  group_by(Category) %>%
  group_modify(na.aggregate) %>%
  ungroup()
Or use data.table:
library(data.table)
nm1 <- grep("^column\\d+$", names(df1), value = TRUE)
setDT(df1)[, (nm1) := na.aggregate(.SD), by = Category, .SDcols = nm1]
Or with base R's split/unsplit (na.aggregate still comes from zoo):
unsplit(lapply(split(df1, df1$Category), na.aggregate), df1$Category)
Another data.table option:
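library(data.table)

# Fill the NAs per Category, then bind PID back on.
# The grouped result is ordered by group, so this relies on the rows
# already being sorted by Category, as they are in the example data.
cbind(
  setDT(df)[, "PID"],
  df[,
    lapply(
      .SD,
      function(x) replace(x, is.na(x), mean(x, na.rm = TRUE))
    ),
    Category,
    .SDcols = patterns("^column")
  ]
)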
which gives
PID Category column1 column2 column3
1: 123 1 54.00000 2.40 23.35
2: 324 1 52.00000 3.00 21.10
3: 356 1 53.00000 3.60 25.60
4: 378 2 56.00000 3.20 27.10
5: 395 2 50.50000 3.50 29.90
6: 362 2 45.00000 3.35 24.30
7: 789 3 65.00000 12.60 23.80
8: 759 3 66.00000 10.55 26.80
9: 762 3 66.66667 10.55 27.20
10: 741 3 69.00000 8.50 23.30
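One caveat with mean-based filling: if a group has no observed values in a column, `mean(x, na.rm = TRUE)` returns `NaN`, so the gap is not really filled. The sketch below shows this on a small made-up data frame (`toy`, `grp` and `x` are hypothetical names, not from the question) and falls back to the overall column mean, which is only one possible choice.

```r
# Toy data: group "b" has no observed values at all.
toy <- data.frame(
  grp = c("a", "a", "b", "b"),
  x   = c(1, NA, NA, NA)
)

# Group-wise fill: group "a" gets its mean, group "b" gets NaN.
group_filled <- ave(
  toy$x, toy$grp,
  FUN = function(v) replace(v, is.na(v), mean(v, na.rm = TRUE))
)

# Assumed fallback: where the group mean is undefined, use the overall mean.
overall <- mean(toy$x, na.rm = TRUE)
final <- ifelse(is.na(group_filled), overall, group_filled)
final
```

Whether a global fallback is appropriate depends on the data; leaving such groups as NA may be the safer default.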