Writing a function to perform conditional summaries in R using a named list

Question

I am trying to write a function that takes a tibble and a list of filter specifications, and performs a conditional summary based on those specifications.

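library(dplyr)  # provides tibble(), %>% and summarise()
library(purrr)  # provides imap_lgl(), used in the second attempt below

# Sample DF with a column to summarize and 2 ID columns.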
df <- tibble(
    to_summarize = c(1, 2, 8, 9),
    ID1 = c('A', 'A', 'C', 'A'),
    ID2 = c('X', 'Y', 'Z', 'X')
)

We can summarize conditionally using both IDs (returns 10) or using only one ID (returns 12).

df %>%
    summarize(
        total1 = sum(to_summarize[ID1 == 'A' & ID2 == 'X']),
        total2 = sum(to_summarize[ID1 == 'A'])
    )

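I would like to allow the same flexibility inside a function. The user should be able to supply a list of filters, or an empty list (in which case the summary function is applied to the whole column, with no filtering).

I imagine the simplest approach is a named list, where each name is a column to filter on and each value is the value to keep in that column.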

filters <- list(
    ID1 = 'A',
    ID2 = 'X'
)

# Here is my attempt at a function to implement this:
summarise_and_filter <- function(df, filters) {
    df %>%
        summarise(
            total = sum(to_summarize[names(filters) == unname(unlist(filters))]))
}

# It does not work, it just returns zero
df %>%
    summarise_and_filter(
        filters = filters
    )

# I imagine the function might need to call map in some way, or perhaps imap?
map_summarise_and_filter <- function(df, filters) {
    df %>%
        summarise(
            total = sum(
                to_summarize[
                    imap_lgl(
                        filters, 
                        ~.y == .x
                    )]
            )
        )
}

# But this also returns zero
df %>%
    map_summarise_and_filter(
        filters = filters
    )

Answer 1

Two operations are performed here, and one of them can be computed dynamically.

library(dplyr)
df %>%
    mutate(total2 = sum(to_summarize[ID1 == filters[['ID1']]])) %>% 
    filter(across(starts_with("ID"), ~ . == 
                filters[[cur_column()]])) %>%
    summarise(total1 = sum(to_summarize),total2 = first(total2))

Output:

# A tibble: 1 x 2
  total1 total2
   <dbl>  <dbl>
1     10     12
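
As a side note, newer dplyr releases provide if_all() as the dedicated helper for combining per-column conditions inside filter(). The sketch below is a variation on the pipeline above, not the answer's original code. It assumes a dplyr version that has if_all() and selects the columns from names(filters) rather than by the "ID" prefix; the name matched is introduced only for illustration.

```r
library(dplyr)

df <- tibble(
    to_summarize = c(1, 2, 8, 9),
    ID1 = c('A', 'A', 'C', 'A'),
    ID2 = c('X', 'Y', 'Z', 'X')
)
filters <- list(ID1 = 'A', ID2 = 'X')

# Keep rows where every column named in `filters` equals its filter value,
# then sum what is left.
matched <- df %>%
    filter(if_all(all_of(names(filters)), ~ .x == filters[[cur_column()]])) %>%
    summarise(total1 = sum(to_summarize))

matched
```

Selecting by names(filters) means that only the columns the user actually listed take part in the comparison.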

If we want to do this without filter(), reduce the across() output to a single logical vector and use it to subset to_summarize:

library(purrr)
df %>% 
  summarise(total1 = sum(to_summarize[across(starts_with('ID'), 
   ~ . == filters[[cur_column()]]) %>% 
            reduce(`&`)]), 
     total2 = sum(to_summarize[ID1 == filters[['ID1']]]))

Output:

# A tibble: 1 x 2
  total1 total2
   <dbl>  <dbl>
1     10     12
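
The snippets above can be wrapped into the function the question asks for. What follows is only a minimal sketch under a few assumptions: the function name summarise_with_filters is hypothetical, the column to sum is hard-coded as to_summarize (as in the question), every name in filters is a column of df, and the data frame is ungrouped. Each list entry is turned into one logical vector, and the vectors are combined with `&`, starting from an all-TRUE vector so that an empty list means no filtering.

```r
library(dplyr)
library(purrr)

df <- tibble(
    to_summarize = c(1, 2, 8, 9),
    ID1 = c('A', 'A', 'C', 'A'),
    ID2 = c('X', 'Y', 'Z', 'X')
)

# Hypothetical helper: sum `to_summarize` over the rows that match
# every entry of the named list `filters`.
summarise_with_filters <- function(df, filters = list()) {
    # One logical vector per filter: .y is the column name, .x the allowed value(s).
    conditions <- imap(filters, ~ df[[.y]] %in% .x)
    # Combine with `&`; .init keeps every row when `filters` is empty.
    rows <- reduce(conditions, `&`, .init = rep(TRUE, nrow(df)))
    tibble(total = sum(df$to_summarize[rows]))
}

both_ids <- summarise_with_filters(df, list(ID1 = 'A', ID2 = 'X'))  # total = 10
one_id   <- summarise_with_filters(df, list(ID1 = 'A'))             # total = 12
no_filter <- summarise_with_filters(df, list())                     # total = 20
```

Using %in% rather than == lets a filter entry hold several allowed values, and it treats NA in an ID column as a non-match instead of propagating NA into the row selection.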
