Using list names inside the purrr::map_dfr function

Question

I am trying to do something relatively simple but am struggling with it. Say I have two data frames, df1 and df2:

df1:

id  expenditure
1    10
2    20
1    30
2    50

df2:

id  expenditure
1    30
2    50
1    60
2    10

I also added them to a named list:

table_list = list()
table_list[['a']] = df1
table_list[['b']] = df2

Now I want to run a summary operation on each one through a function and then bind the resulting rows:

get_summary = function(table){
   final_table = table %>% group_by(id) %>% summarise(total_expenditure= sum(expenditure))

}

And then apply it through map_dfr:

summary = table_list %>% map_dfr(get_summary, .id = 'origin_table')

This produces almost what I'm looking for:

 origin_table   id   total_expenditure
      a          1       40
      a          2       70
      b          1       90
      b          2       60

But what if I want to do something specific depending on which element of the list is being passed, something like this:

get_summary = function(table, name){
   dummy_list = c(TRUE, FALSE)
   names(dummy_list) = c('a', 'b')

   final_table = table %>% group_by(id) %>% summarise(total_expenditure= sum(expenditure))

   is_true = dummy_list[[name]] # Want to use the original name to call another list

   if(is_true) final_table = final_table %>% mutate(total_expenditure = total_expenditure + 1) 

   return(final_table)

}

This would produce something like this:

 origin_table   id   total_expenditure
      a          1       41
      a          2       71
      b          1       90
      b          2       60

So is there any way to use the list names inside my function, or any way to identify which element of the list I'm working with? Maybe map_dfr is too restrictive and I have to use something else?

Edit: changed the example so it is more grounded in reality.

Answer 1

Instead of map, use imap, which passes the name of each list element to the function as the second argument (.y):

library(purrr)
library(dplyr)
get_summary = function(dat, name){
   dat %>%
       group_by(id) %>%
       summarise(total_expenditure = sum(expenditure, na.rm = TRUE),
                 .groups = "drop") %>%
       # add 1 only for the list element named 'a'
       mutate(total_expenditure = if(name == 'a')
                 total_expenditure + 1 else total_expenditure)
}

Testing:

> table_list %>% 
    imap_dfr(~ get_summary(.x, name = .y), .id = 'origin_table')
# A tibble: 4 × 3
  origin_table    id total_expenditure
  <chr>        <int>             <dbl>
1 a                1                41
2 a                2                71
3 b                1                90
4 b                2                60

Data:

table_list <- list(a = structure(list(id = c(1L, 2L, 1L, 2L), 
expenditure = c(10L, 
20L, 30L, 50L)), class = "data.frame", row.names = c(NA, -4L)), 
    b = structure(list(id = c(1L, 2L, 1L, 2L), expenditure = c(30L, 
    50L, 60L, 10L)), class = "data.frame", row.names = c(NA, 
    -4L)))
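
The imap approach also works with the lookup vector from the question instead of a hard-coded name == 'a' test. The following is a minimal sketch under that assumption; add_one is a hypothetical name standing in for dummy_list, and the data frames are rebuilt from the values in the question.

```r
library(purrr)
library(dplyr)

# Same values as df1 and df2 in the question
table_list <- list(
  a = data.frame(id = c(1L, 2L, 1L, 2L), expenditure = c(10, 20, 30, 50)),
  b = data.frame(id = c(1L, 2L, 1L, 2L), expenditure = c(30, 50, 60, 10))
)

# Hypothetical lookup (plays the role of dummy_list): which tables get +1
add_one <- c(a = TRUE, b = FALSE)

get_summary <- function(dat, name) {
  out <- dat %>%
    group_by(id) %>%
    summarise(total_expenditure = sum(expenditure), .groups = "drop")
  # `name` is the list element's name, supplied by imap_dfr
  if (add_one[[name]]) {
    out <- out %>% mutate(total_expenditure = total_expenditure + 1)
  }
  out
}

# imap_dfr calls get_summary(element, name) for each element and
# row-binds the results; .id stores the names in `origin_table`
result <- imap_dfr(table_list, get_summary, .id = "origin_table")
print(result)
```

Because imap_dfr passes the element and its name as the first two arguments, the function can be supplied directly without a formula wrapper. In purrr 1.0.0 and later the _dfr variants are marked as superseded; imap() followed by list_rbind(names_to = "origin_table") should give the same result there.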

Answer 2

I managed to do it by adding origin_table as a pre-existing column on the data frames:

df1 = df1 %>% mutate(origin_table = 'a')
df2 = df2 %>% mutate(origin_table = 'b')

Then I can extract the origin inside the function as follows:

get_summary = function(table){
   dummy_list = c(TRUE, FALSE)
   names(dummy_list) = c('a', 'b')

   origin = table %>% distinct(origin_table) %>% pull

   final_table = table %>% group_by(id) %>% summarise(total_expenditure= sum(expenditure))

   is_true = dummy_list[[origin]] # Use the origin name to look up the flag in another list

   if(is_true) final_table = final_table %>% mutate(total_expenditure = total_expenditure + 1) 

   return(final_table)

}
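
The answer above does not show the final call, so here is a minimal end-to-end sketch of the same idea. It assumes the list is rebuilt after the column is added, and it groups by origin_table as well as id so that the column survives summarise (grouping by id alone would drop it). add_one is a hypothetical name standing in for dummy_list.

```r
library(purrr)
library(dplyr)

df1 <- data.frame(id = c(1L, 2L, 1L, 2L), expenditure = c(10, 20, 30, 50))
df2 <- data.frame(id = c(1L, 2L, 1L, 2L), expenditure = c(30, 50, 60, 10))

# Tag each data frame with its origin before putting it in the list
table_list <- list(
  a = df1 %>% mutate(origin_table = "a"),
  b = df2 %>% mutate(origin_table = "b")
)

# Hypothetical lookup (plays the role of dummy_list)
add_one <- c(a = TRUE, b = FALSE)

get_summary <- function(table) {
  # Recover the origin from the column; it must be unique per table
  origin <- table %>% distinct(origin_table) %>% pull()
  stopifnot(length(origin) == 1)

  out <- table %>%
    group_by(origin_table, id) %>%
    summarise(total_expenditure = sum(expenditure), .groups = "drop")

  if (add_one[[origin]]) {
    out <- out %>% mutate(total_expenditure = total_expenditure + 1)
  }
  out
}

# No .id needed: origin_table is already a column in every result
result <- map_dfr(table_list, get_summary)
print(result)
```

This keeps map_dfr but duplicates the list name inside every row of the data, so the name and the column have to be kept in sync by hand.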
