Creating a function in R to apply to multiple datasets

Question

I have this code, recommended by a Stack Overflow user, that works very well. I have several datasets that I wish to apply it to. Do I have to run the code on each dataset separately, or is there something else I can do (such as storing it in some sort of function)?

I have the datasets df1, df2, df3 and df4. I do not wish to rbind these datasets.

dput output for each dataset:

structure(list(Date = structure(1:6, .Label = c("1/2/2020 5:00:00 PM", 
"1/2/2020 5:30:01 PM", "1/2/2020 6:00:00 PM", "1/5/2020 7:00:01 AM", 
"1/6/2020 8:00:00 AM", "1/6/2020 9:00:00 AM"), class = "factor"), 
Duration = c(20L, 30L, 10L, 5L, 2L, 8L)), class = "data.frame", row.names = c(NA, 
-6L))

Code:

df %>%
group_by(Date = as.Date(dmy_hms(Date))) %>% 
summarise(Total_Duration = sum(Duration), Count = n())

This is what I have been doing for each dataset (and so on):

df1 %>%
group_by(Date = as.Date(dmy_hms(Date))) %>% 
summarise(Total_Duration = sum(Duration), Count = n())


df2 %>%
group_by(Date = as.Date(dmy_hms(Date))) %>% 
summarise(Total_Duration = sum(Duration), Count = n())


df3 %>%
group_by(Date = as.Date(dmy_hms(Date))) %>% 
summarise(Total_Duration = sum(Duration), Count = n())

Is there a way to:

 Store_code<-
 df %>%
 group_by(Date = as.Date(dmy_hms(Date))) %>% 
 summarise(Total_Duration = sum(Duration), Count = n())

and then easily apply this code to each dataset?

df1(Store_code)
df2(Store_code)

Any suggestions are appreciated.

Answer 1

We can use mget to return all the objects in a list, then use map to loop over the list and apply the function:

library(dplyr)
library(lubridate)
library(purrr)
f1 <- function(dat) {
  dat %>%
    group_by(Date = as.Date(dmy_hms(Date))) %>%
    summarise(Total_Duration = sum(Duration), Count = n())
}

lst1 <- map(mget(ls(pattern = "^df\\d+$")), f1)
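If the data frames do not follow a naming pattern that ls() can match, one alternative is to build the list by hand. The following is only a minimal sketch: the two small data frames are hypothetical stand-ins shaped like the dput output in the question, and dfs and res are illustrative names that do not appear in the original.

```r
library(dplyr)
library(lubridate)
library(purrr)

# Hypothetical sample data in the same shape as the question's dput output
df1 <- data.frame(
  Date = c("1/2/2020 5:00:00 PM", "1/2/2020 5:30:01 PM", "1/5/2020 7:00:01 AM"),
  Duration = c(20L, 30L, 5L)
)
df2 <- data.frame(
  Date = c("1/6/2020 8:00:00 AM", "1/6/2020 9:00:00 AM"),
  Duration = c(2L, 8L)
)

# Same summary function as in the answer
f1 <- function(dat) {
  dat %>%
    group_by(Date = as.Date(dmy_hms(Date))) %>%
    summarise(Total_Duration = sum(Duration), Count = n())
}

# Build the list explicitly instead of with mget(); the names carry over
dfs <- list(df1 = df1, df2 = df2)
res <- map(dfs, f1)

names(res)  # "df1" "df2"
res$df1     # summary for df1 only
```

Because the result is a named list, each summary stays separate, so the input datasets are never combined with rbind.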

The function f1 assumes the column names are the same in all the datasets, i.e. 'Date' and 'Duration'. If they differ, the column names can be passed as additional arguments to the function:

f2 <- function(dat, datecol, durationcol) {
  dat %>%
    group_by(Date = as.Date(dmy_hms({{datecol}}))) %>%
    summarise(Total_Duration = sum({{durationcol}}), Count = n())
}

and apply the function as

f2(df1, Date, Duration)

Or in the loop:

lst1 <- map(mget(ls(pattern = "^df\\d+$")), f2, 
         datecol = Date, durationcol = Duration)
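To illustrate the case f2 is meant for, here is a small sketch with a dataset whose columns are named differently. The data frame other and its columns When and Minutes are hypothetical, invented for this example; it also assumes a dplyr version that supports the {{ }} operator.

```r
library(dplyr)
library(lubridate)

# Same column-parameterised function as in the answer
f2 <- function(dat, datecol, durationcol) {
  dat %>%
    group_by(Date = as.Date(dmy_hms({{ datecol }}))) %>%
    summarise(Total_Duration = sum({{ durationcol }}), Count = n())
}

# Hypothetical dataset whose columns are not called Date and Duration
other <- data.frame(
  When = c("1/2/2020 5:00:00 PM", "1/2/2020 6:00:00 PM", "1/6/2020 8:00:00 AM"),
  Minutes = c(20L, 10L, 2L)
)

# Column names are passed unquoted; {{ }} forwards them into the dplyr verbs
out <- f2(other, When, Minutes)
out
```

The output columns are still called Date, Total_Duration and Count regardless of the input column names, which keeps the results from differently named datasets comparable.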

Solution 1
1 accepted 2020-03-30 22:51:51