Sum/return NA when all values are NA
I'm trying to run a function on columns that contain NA observations. When all observations in a column are NA, I would like it to return NA; when only some of the rows are NA, it should just apply na.rm=T. I've seen a few posts showing how to do this ( link_1 , link_2 , link_3 ), but none of them seem to work for my function and I'm not sure where I'm going wrong.
# data frame
species_1<- c(NA, 10, 40)
species_2<- c(NA, NA, 30)
species_3<- c(NA, NA, NA)
group<- c(1, 1, 1)
df<- data.frame(species_1, species_2, species_3, group)
# function argument
y_true_test<- c(30, 20, 20)
# function
estimate = function(df, y_true, na.rm=T) {
  if (all(is.na(df))) df[NA_integer_] else
    sqrt(colSums((t(t(df) - y_true_test))^2, na.rm=T) / 3) / y_true_test * 100
}
# run
final<- df %>%
  group_by(group) %>%
  group_modify( ~ as.data.frame.list(estimate(., y_true_test))) #species 3 returns '0' when it should be NA
Any help would be greatly appreciated.
The function was checking for NA across the whole data set, when the check should be done for each column separately. Here is an option with across:
library(dplyr)
names(y_true_test) <- grep("species", names(df), value = TRUE)
df %>%
  group_by(group) %>%
  summarise(across(everything(), ~ if (all(is.na(.x))) NA_real_ else
    sqrt(sum((.x - y_true_test)^2, na.rm = TRUE) / n()) /
      (y_true_test[cur_column()]) * 100), .groups = 'drop')
-output
# A tibble: 1 × 4
  group species_1 species_2 species_3
  <dbl>     <dbl>     <dbl>     <dbl>
1     1      43.0      28.9        NA
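To see why the original check never fires: all(is.na(df)) is TRUE only when every cell of the data frame is NA, whereas a per-column test flags species_3 on its own. A minimal base R sketch using the species columns from the question (the names sp, whole_check and col_check are only for illustration):

```r
# Species columns from the question; `sp` is a name used only in this sketch
sp <- data.frame(
  species_1 = c(NA, 10, 40),
  species_2 = c(NA, NA, 30),
  species_3 = c(NA, NA, NA)
)

# Whole-data-frame check: a single logical, FALSE because some cells are not NA
whole_check <- all(is.na(sp))

# Per-column check: one logical per column, TRUE only where every value is NA
col_check <- colSums(is.na(sp)) == nrow(sp)

print(whole_check)
print(col_check)
```

Only species_3 is flagged by the per-column check, which is the column that should come back as NA.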
If we want to modify the OP's function:
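estimate <- function(df, y_true, narm = TRUE) {
  # TRUE for columns in which every value is NA
  i1 <- colSums(is.na(df)) == nrow(df)
  # use the y_true argument rather than the global y_true_test
  out <- sqrt(colSums((t(t(df) - y_true))^2,
                      na.rm = narm) / 3) / y_true * 100
  # overwrite the all-NA columns, which would otherwise come back as 0
  out[i1] <- NA
  out
}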
-testing
> df %>%
+   group_by(group) %>%
+   group_modify( ~ as.data.frame.list(estimate(.,
+     y_true_test)))
# A tibble: 1 × 4
# Groups:   group [1]
  group species_1 species_2 species_3
  <dbl>     <dbl>     <dbl>     <dbl>
1     1      43.0      28.9        NA
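For the plain "sum, but NA when everything is NA" case in the title, the same idea can be packaged as a small helper. This is a sketch rather than part of the answer above; sum_or_na and sp are hypothetical names introduced only for this example:

```r
# Hypothetical helper: NA if every value is NA, otherwise the sum ignoring NAs
sum_or_na <- function(x) {
  if (all(is.na(x))) NA_real_ else sum(x, na.rm = TRUE)
}

# Species columns from the question; `sp` is a name used only in this sketch
sp <- data.frame(
  species_1 = c(NA, 10, 40),
  species_2 = c(NA, NA, 30),
  species_3 = c(NA, NA, NA)
)

# Apply the helper to each column, returning one number per column
res <- vapply(sp, sum_or_na, numeric(1))
print(res)
```

Without the all-NA guard, sum(x, na.rm = TRUE) returns 0 for a column with no non-missing values, which is the same behaviour that produced the unexpected 0 for species_3 in the question.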