Sum / return NA when all values are NA

Question

I'm trying to run a function on columns that contain NA observations. When all observations in a column are NA, I would like it to return NA, but when only some of the rows are NA, it should just apply na.rm=T. I've seen a few posts showing how to do this (link_1, link_2, link_3), but none of them seem to work for my function and I'm not sure where I'm going wrong.

# data frame
species_1<- c(NA, 10, 40)
species_2<- c(NA, NA, 30)
species_3<- c(NA, NA, NA)
group<- c(1, 1, 1)

df<- data.frame(species_1, species_2, species_3, group)

# function argument
y_true_test<- c(30, 20, 20) 

# function
estimate = function(df, y_true, na.rm=T) {
  
  if (all(is.na(df))) df[NA_integer_] else
  
  sqrt(colSums((t(t(df) - y_true_test))^2, na.rm=T) / 3) / y_true_test * 100
  
}

# run
library(dplyr)
final<- df %>%
  group_by(group) %>%
  group_modify( ~ as.data.frame.list(estimate(., y_true_test))) #species 3 returns '0' when it should be NA

Any help would be greatly appreciated.

Answer 1

The function was checking for NA across the whole dataset; instead, the check should be done for each column. Here is an option with across:

library(dplyr)
# name the targets by column so each species picks up its own y_true value
names(y_true_test) <- grep("species", names(df), value = TRUE)
df %>%
   group_by(group) %>% 
   summarise(across(everything(), ~ if(all(is.na(.x))) NA_real_ else
     sqrt(sum((.x - y_true_test[cur_column()])^2, na.rm = TRUE)/n())/
                (y_true_test[cur_column()]) * 100), .groups = 'drop')

-output

# A tibble: 1 × 4
  group species_1 species_2 species_3
  <dbl>     <dbl>     <dbl>     <dbl>
1     1      43.0      28.9        NA
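
To see why the original check never fired, it may help to look at it in isolation. The following is a minimal base R sketch; `toy` is a hypothetical stand-in for the species columns from the question, not part of the original code.

```r
# Hypothetical stand-in for the species columns in the question
toy <- data.frame(
  species_1 = c(NA, 10, 40),
  species_2 = c(NA, NA, 30),
  species_3 = c(NA, NA, NA)
)

# Whole-data-frame check: FALSE, because some cells are not NA,
# so the `if` branch in the original function is never taken
whole_check <- all(is.na(toy))
whole_check

# With na.rm = TRUE an all-NA column sums to 0, which is where the 0 comes from
col_totals <- colSums(toy, na.rm = TRUE)
col_totals

# Per-column check: TRUE only for the column in which every value is NA
all_na <- colSums(is.na(toy)) == nrow(toy)
all_na
```

Only `species_3` is flagged by the per-column check, which is the column that should end up as NA.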

If we want to modify the OP's function:

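estimate <- function(df, y_true, narm=TRUE) {
  
  # flag the columns in which every value is NA
  i1 <- colSums(is.na(df)) == nrow(df)
  
  # use the y_true argument rather than the global y_true_test
   out <- sqrt(colSums((t(t(df) - y_true))^2,
        na.rm= narm) / 3) / y_true * 100
  # colSums() with na.rm = TRUE returns 0 for all-NA columns, so reset them to NA
   out[i1] <- NA
   out
  
}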
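
The double transpose in the function is what lines each target value up with its own column: subtracting a vector from t(df) recycles it down the species dimension. As a rough illustration (the matrix `m` and vector `y` below are made-up values, not from the question), base R's sweep() expresses the same column-wise subtraction more explicitly:

```r
# Made-up example values: two columns, one target per column
m <- cbind(species_1 = c(NA, 10, 40), species_2 = c(NA, NA, 30))
y <- c(30, 20)

# Idiom used in the function: transpose, subtract (recycled per column of m), transpose back
via_transpose <- t(t(m) - y)

# Same operation with sweep(): subtract y[j] from column j (MARGIN = 2)
via_sweep <- sweep(m, 2, y, "-")

# TRUE if both approaches agree, including the NA positions
same_result <- isTRUE(all.equal(via_transpose, via_sweep))
same_result
```

Either form works; sweep() just makes the margin being matched explicit.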

-testing

> df %>%
+   group_by(group) %>%
+   group_modify( ~ as.data.frame.list(estimate(., 
          y_true_test))) 
# A tibble: 1 × 4
# Groups:   group [1]
  group species_1 species_2 species_3
  <dbl>     <dbl>     <dbl>     <dbl>
1     1      43.0      28.9        NA
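
For the plain-sum case in the title, the same per-column rule can be wrapped in a small helper. This is only a sketch: `sum_or_na` and `toy` are hypothetical names introduced here for illustration, and the helper is not part of dplyr or base R.

```r
# Hypothetical helper: NA if every value is missing, otherwise the sum without NAs
sum_or_na <- function(x) {
  if (all(is.na(x))) NA_real_ else sum(x, na.rm = TRUE)
}

# Hypothetical stand-in for the species columns in the question
toy <- data.frame(
  species_1 = c(NA, 10, 40),
  species_2 = c(NA, NA, 30),
  species_3 = c(NA, NA, NA)
)

# Apply the helper to each column; vapply() guarantees one numeric value per column
res <- vapply(toy, sum_or_na, numeric(1))
res
# species_1 species_2 species_3
#        50        30        NA
```

The same helper could presumably be passed to across() inside summarise() in place of the inline if/else.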

Answer 1: score 2, accepted, 2022-04-25 20:30:52
