Sum/return NA when all values are NA

I'm trying to run a function on columns that contain NA observations. When all observations in a column are NA I would like it to return NA, but when only some of the rows are NA, it should just apply na.rm=T. I've seen a few posts showing how to do this ( link_1 , link_2 , link_3 ), but none of them seem to work for my function and I'm not sure where I'm going wrong.

# data frame
species_1<- c(NA, 10, 40)
species_2<- c(NA, NA, 30)
species_3<- c(NA, NA, NA)
group<- c(1, 1, 1)

df<- data.frame(species_1, species_2, species_3, group)

# function argument
y_true_test<- c(30, 20, 20) 

# function
estimate = function(df, y_true, na.rm=T) {
  
  if (all(is.na(df))) df[NA_integer_] else
  
  sqrt(colSums((t(t(df) - y_true_test))^2, na.rm=T) / 3) / y_true_test * 100
  
}

# run (the pipe and group_modify come from dplyr)
library(dplyr)
final<- df %>%
  group_by(group) %>%
  group_modify( ~ as.data.frame.list(estimate(., y_true_test))) #species 3 returns '0' when it should be NA

Any help would be greatly appreciated.

The function was checking for NA across the whole data frame at once; the check should instead be done column by column. Here is an option with across

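library(dplyr)
# name the targets so each species column can look up its own value
names(y_true_test) <- grep("species", names(df), value = TRUE)
df %>%
   group_by(group) %>% 
   # check for all-NA within each column, not across the whole data frame
   summarise(across(everything(), ~ if(all(is.na(.x))) NA_real_ else
     sqrt(sum((.x - y_true_test[cur_column()])^2, na.rm = TRUE)/n())/
                (y_true_test[cur_column()]) * 100), .groups = 'drop')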

-output

# A tibble: 1 × 4
  group species_1 species_2 species_3
  <dbl>     <dbl>     <dbl>     <dbl>
1     1      43.0      28.9        NA
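
The diagnosis above (the check ran on the whole data frame rather than on each column) can be reproduced in base R. This is only a small sketch on a copy of the question's species columns; the names whole_check, per_column_check and all_na_sum are made up for illustration.

```r
# the three species columns from the question
df <- data.frame(
  species_1 = c(NA, 10, 40),
  species_2 = c(NA, NA, 30),
  species_3 = c(NA, NA, NA)
)

# one logical for the whole data frame: FALSE, because some cells are not NA,
# so the original if() branch is never taken
whole_check <- all(is.na(df))

# one logical per column: only species_3 is entirely NA
per_column_check <- vapply(df, function(col) all(is.na(col)), logical(1))

# with na.rm = TRUE an all-NA column sums to 0, which is where the unwanted 0 comes from
all_na_sum <- sum(df$species_3, na.rm = TRUE)
```

Printing per_column_check shows which columns need to be set back to NA after a na.rm = TRUE computation.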

If we want to modify the OP's function

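estimate <- function(df, y_true, narm=TRUE) {
  
  # flag the columns in which every value is NA
  i1 <- colSums(is.na(df)) == nrow(df)
  
  # use the y_true argument rather than the global y_true_test
   out <- sqrt(colSums((t(t(df) - y_true))^2,
        na.rm= narm) / nrow(df)) / y_true * 100
   # na.rm turns all-NA columns into 0, so set them back to NA
   out[i1] <- NA
   out
  
}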

-testing

> df %>%
+   group_by(group) %>%
+   group_modify( ~ as.data.frame.list(estimate(., y_true_test)))
# A tibble: 1 × 4
# Groups:   group [1]
  group species_1 species_2 species_3
  <dbl>     <dbl>     <dbl>     <dbl>
1     1      43.0      28.9        NA
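
For the plain "sum, but NA when everything is NA" case in the title, the same per-column idea can be wrapped in a small helper. This is a minimal base R sketch; sum_or_na is a hypothetical name, not something from the question or from any package.

```r
# hypothetical helper: NA if every value is NA, otherwise the sum ignoring NAs
sum_or_na <- function(x) {
  if (all(is.na(x))) NA_real_ else sum(x, na.rm = TRUE)
}

# the three species columns from the question
df <- data.frame(
  species_1 = c(NA, 10, 40),
  species_2 = c(NA, NA, 30),
  species_3 = c(NA, NA, NA)
)

# apply the helper to each column separately
column_sums <- vapply(df, sum_or_na, numeric(1))
```

Here column_sums holds 50 for species_1, 30 for species_2 and NA for species_3. Note that all(is.na(x)) is also TRUE for a zero-length vector, so this helper returns NA for empty input as well.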
