Applying the same summary operation to multiple columns with tidyverse

Question

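I am trying to create a summary table that gives me the proportion of positive answers to 17 questions, ordered by year. I just can't figure out how to easily apply the summary operation to multiple columns without hard-coding it.

Unfortunately, I can't use the summarise_at or summarise_all functions because I am working with a dataframe. I was thinking of writing a function that loops over the columns and binds the summary columns together, but the summary column name is a bit odd and can't be of type character. Do you have any suggestions?

Here is what I have so far: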

s2 <- db %>%
  group_by(Year)%>%
  summarize(Q1=round(sum(Q1d, na.rm=TRUE)*100/length(which(!is.na(Q1d))),1),
            Q2=round(sum(Q2d, na.rm=TRUE)*100/length(which(!is.na(Q2d))),1),
            Q3=round(sum(Q3d, na.rm=TRUE)*100/length(which(!is.na(Q3d))),1),
            Q4=round(sum(Q4d, na.rm=TRUE)*100/length(which(!is.na(Q4d))),1),
            Q5=round(sum(Q5d, na.rm=TRUE)*100/length(which(!is.na(Q5d))),1),
            Q6=round(sum(Q6d, na.rm=TRUE)*100/length(which(!is.na(Q6d))),1),
            Q7=round(sum(Q7d, na.rm=TRUE)*100/length(which(!is.na(Q7d))),1),
            Q8=round(sum(Q8d, na.rm=TRUE)*100/length(which(!is.na(Q8d))),1),
            Q9=round(sum(Q9d, na.rm=TRUE)*100/length(which(!is.na(Q9d))),1),
            Q10=round(sum(Q10d, na.rm=TRUE)*100/length(which(!is.na(Q10d))),1),
            Q11=round(sum(Q11d, na.rm=TRUE)*100/length(which(!is.na(Q11d))),1),
            Q12=round(sum(Q12d, na.rm=TRUE)*100/length(which(!is.na(Q12d))),1),
            Q13=round(sum(Q13d, na.rm=TRUE)*100/length(which(!is.na(Q13d))),1),
            Q14=round(sum(Q14d, na.rm=TRUE)*100/length(which(!is.na(Q14d))),1),
            Q15=round(sum(Q15d, na.rm=TRUE)*100/length(which(!is.na(Q15d))),1),
            Q16=round(sum(Q16d, na.rm=TRUE)*100/length(which(!is.na(Q16d))),1),
            Q17=round(sum(Q17d, na.rm=TRUE)*100/length(which(!is.na(Q17d))),1),
            )

Note: Q1d, Q2d, ... are the column names.

Answer 1

We can use across from dplyr:

library(dplyr)
library(stringr)
db %>%
    group_by(Year) %>%
    summarise(across(matches('^Q\\d+d$'), ~ 
              sum(., na.rm = TRUE) * 100 /sum(!is.na(.))), 
         .groups = 'drop') %>%
    rename_with(~ str_remove(., 'd$'), -Year)
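
The question also rounds each percentage to one decimal place, which the pipeline above leaves out. Here is a minimal sketch of one way to add that, assuming the Q columns hold 0/1 indicators (1 = positive answer). The data frame `db_demo` and the helper `pct_positive` are hypothetical names made up for illustration. For 0/1 data, `mean(x, na.rm = TRUE)` gives the same result as dividing the sum by the count of non-missing values.

```r
library(dplyr)

# Hypothetical 0/1 answers (1 = positive); not the asker's real data.
db_demo <- data.frame(
  Year = c(2010, 2010, 2010, 2015, 2015, 2015),
  Q1d  = c(1, 0, NA, 1, 1, 0),
  Q2d  = c(NA, 1, 1, 0, NA, 0)
)

# Hypothetical helper: percentage of positive answers among the
# non-missing responses, rounded to one decimal place.
pct_positive <- function(x) round(mean(x, na.rm = TRUE) * 100, 1)

s2_demo <- db_demo %>%
  group_by(Year) %>%
  summarise(across(matches("^Q\\d+d$"), pct_positive), .groups = "drop") %>%
  rename_with(~ sub("d$", "", .x), -Year)  # Q1d -> Q1, Q2d -> Q2

s2_demo
```

Passing a named function to across keeps the formula in one place, so changing the rounding or the NA handling only needs one edit.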

Or use collapse:

library(collapse)
f1 <- function(x) sum(x, na.rm = TRUE) * 100/sum(!is.na(x))
collap(db, ~ Year, FUN = f1)
#   Year      Q1d Q2d
#1 2010 250.0000 350
#2 2015 293.3333 320

Data

db <- data.frame(Year = c(2010, 2010, 2015, 2015, 2015, 2015),
   Q1d = c(2.5, NA, 3, 3.5, NA, 2.3), Q2d = c(NA, 3.5, NA, 2, 4.6, 3))
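
As a quick check, the dplyr pipeline from the answer can be run on this sample data; it should give the same numbers as the collapse output shown earlier. This sketch repeats the data so it runs on its own and skips the renaming step so the column names line up with that output. The sample values are not 0/1 indicators, so the "percentages" exceed 100 here.

```r
library(dplyr)

# Sample data copied from the answer.
db <- data.frame(Year = c(2010, 2010, 2015, 2015, 2015, 2015),
   Q1d = c(2.5, NA, 3, 3.5, NA, 2.3), Q2d = c(NA, 3.5, NA, 2, 4.6, 3))

res <- db %>%
  group_by(Year) %>%
  summarise(across(matches("^Q\\d+d$"),
                   ~ sum(.x, na.rm = TRUE) * 100 / sum(!is.na(.x))),
            .groups = "drop")

as.data.frame(res)
#   Year      Q1d Q2d
# 1 2010 250.0000 350
# 2 2015 293.3333 320
```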
