![](/img/trans.png)
[英]Count non-NA values by each column in dplyr using summarize_if
[英]Count non-`NA` of several columns by group using summarize and across from dplyr
我想使用summarize
和across
通過我的分組變量dplyr
計算非NA
值的數量。 例如,使用這些數據:
library(tidyverse)
d <- tibble(ID = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
Col1 = c(5, 8, 2, NA, 2, 2, NA, NA, 1),
Col2 = c(NA, 2, 1, NA, NA, NA, 1, NA, NA),
Col3 = c(1, 5, 2, 4, 1, NA, NA, NA, NA))
# A tibble: 9 x 4
ID Col1 Col2 Col3
<dbl> <dbl> <dbl> <dbl>
1 1 5 NA 1
2 1 8 2 5
3 1 2 1 2
4 2 NA NA 4
5 2 2 NA 1
6 2 2 NA NA
7 3 NA 1 NA
8 3 NA NA NA
9 3 1 NA NA
解決方案類似於:
d %>%
group_by(ID) %>%
summarize(across(matches("^Col[1-3]$"),
#function to count non-NA per column per ID
))
結果如下:
# A tibble: 3 x 4
ID Col1 Col2 Col3
<dbl> <dbl> <dbl> <dbl>
1 1 3 2 3
2 2 2 0 2
3 3 1 1 0
我希望這是您正在尋找的:
library(dplyr)
d %>%
group_by(ID) %>%
summarise(across(Col1:Col3, ~ sum(!is.na(.x)), .names = "non-{.col}"))
# A tibble: 3 x 4
ID `non-Col1` `non-Col2` `non-Col3`
<dbl> <int> <int> <int>
1 1 3 2 3
2 2 2 0 2
3 3 1 1 0
或者,如果您想通過共享字符串 select 列,您可以使用以下命令:
d %>%
group_by(ID) %>%
summarise(across(contains("Col"), ~ sum(!is.na(.x)), .names = "non-{.col}"))
聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.