summarise_all, counting n() fails

I have the following data frame:

df2 <- 
  structure(list(A = c(4, 5, 3, 3, 4, 4, 4, 5, 5, 4), 
             B = c(4, 5, 4, 4, 4, 4, 3, 5, 5, 4),
             C = c(4, 5, 3, 4, 2, 4, 2, 5, 5, 4),
             D = c(4, 5, 0, 0, 1, 4, 0, 0, 0, 0), 
             E = c(4, 5, 4, 4, 4, 4, 2, 5, 5, 5), 
             F = c(5, 5, 4, 4, 4, 4, 2, 5, 4), 
             G = c(5, 5, 4, 4, 2, 4, 2, 5, 5, 5), 
             H = c(5, 5, 4, 4, 3, 4, 3, 5, 5, 4), 
             K = c(5, 5, 4, 4, 3, 4, 2, 5, 5, 5), 
             L = c(5, 5, 4, 4, 3, 4, 2, 5, 5, 5)), 
        .Names = c("A", "B", "C", "D", "E", "F", "G", "H", "K", "L"), 
        row.names = c(NA, -10L), 
        class = c("tbl_df", "tbl", "data.frame"))


but somehow the "NA" is not considered when I do:

library(dplyr)
library(tidyr)

df2 %>%
  gather(Type) %>%
  group_by(Type) %>%
  summarise_all(funs(mean(., na.rm = TRUE),
                     sd(., na.rm = TRUE),
                     n(),
                     n1 = sum(!is.na(.)),
                     n2 = sum(is.na(.))))

The result does not take the NAs into account (the output was posted as a screenshot).

None of n(), sum(!is.na(.)) or sum(is.na(.)) gives the correct result (I know the last two are each other's opposite, it's just to be sure).

@ANG

Thanks, that does the trick, and it also shows where I took a wrong turn. To make things easier I developed the query on a small subset, the one I posted in the question. That subset has no "natural" NAs; I just took a value out and did not replace it with NA as ANG suggested.
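For anyone landing here later, this is a minimal sketch of what that looks like, not ANG's exact code. It assumes only two of the columns, and the position of the NA in F is a guess, since the question does not say which value was taken out. The names df_na, res, n_present and n_missing are made up for the illustration, and pivot_longer()/summarise() are swapped in for gather()/summarise_all(funs(...)).

```r
library(dplyr)
library(tidyr)

# Two columns from the question. The value taken out of F is written as an
# explicit NA (position assumed), so both columns have 10 entries again.
df_na <- tibble(
  A = c(4, 5, 3, 3, 4, 4, 4, 5, 5, 4),
  F = c(5, 5, 4, 4, 4, 4, 2, 5, NA, 4)
)

res <- df_na %>%
  pivot_longer(everything(), names_to = "Type", values_to = "value") %>%
  group_by(Type) %>%
  summarise(
    mean      = mean(value, na.rm = TRUE),  # mean over non-missing values
    sd        = sd(value, na.rm = TRUE),    # sd over non-missing values
    n         = n(),                        # all rows in the group, NA included
    n_present = sum(!is.na(value)),         # rows with a value
    n_missing = sum(is.na(value))           # rows that are NA
  )

print(res)
```

With the NA in place, F is reported with n = 10, n_present = 9 and n_missing = 1, while A has n = 10, n_present = 10 and n_missing = 0.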

After running the query on the complete data I get what I needed!

Thanks for pointing that out!
