Creating a dynamic number of columns from a data frame based on character vectors

Question

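I am trying to create one or more summary columns, where each summary column is the sum of a given set of columns in the data frame.

For example: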

library(dplyr)

set.seed(3550)
# Creates data frame
month <- seq.Date(from = as.Date("2012-09-01"), by = "month", length.out = 50)
a <- rpois(50, 5000)
b <- rpois(50, 3000)
c <- rpois(50, 500)
d <- rpois(50, 1000)

df <- data.frame(month, a, b, c, d)
# Creates list of vectors
mylist <- list(this = "this", that = "that", other = "other")
mylist$this <- c("a")
mylist$that <- c("a", "b")
mylist$other <- c("a", "c", "d")

I can get the desired result with the following code:

my_df <- df %>%
  group_by(month) %>%
  summarize(this = sum(!!!rlang::syms(mylist$this), na.rm = TRUE),
            that = sum(!!!rlang::syms(mylist$that), na.rm = TRUE),
            other = sum(!!!rlang::syms(mylist$other), na.rm = TRUE))

The output is:

# A tibble: 50 x 4
        month  this  that other
       <date> <int> <int> <int>
 1 2012-09-01  4958  7858  6480
 2 2012-10-01  4969  7915  6497
 3 2012-11-01  5012  7978  6483
 4 2012-12-01  4982  7881  6460
 5 2013-01-01  4838  7880  6346
 6 2013-02-01  5090  8089  6589
 7 2013-03-01  5013  8044  6582
 8 2013-04-01  4947  7942  6388
 9 2013-05-01  5065  8124  6506
10 2013-06-01  5020  8086  6521
# ... with 40 more rows

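I am having trouble working out how to create the summary columns dynamically, so that their number follows the length of the list. I thought a loop inside the summarize call might work, but it did not.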

combine_iterations <- function(x, iter_list){
  a <- rlang::syms(names(iter_list))
  b <- x %>%
    group_by(month) %>%
    summarize(for (i in 1:length(a)){
      a[[i]] = sum(!!!rlang::syms(iter_list[i]), na.rm = TRUE)
    })
}

Output:

Error in lapply(.x, .f, ...) : object 'i' not found
Called from: lapply(.x, .f, ...)

Answer 1

You are overcomplicating this a bit. If you want a custom summary, you can use group_by %>% do and avoid the rlang quoting/unquoting issue altogether:

combine_iterations <- function(x, iter_list){
    x %>%
      group_by(month) %>%
      do(
          as.data.frame(lapply(iter_list, function(cols) sum(.[cols])))
      )
}

combine_iterations(df, mylist)
# A tibble: 50 x 4
# Groups:   month [50]
#        month  this  that other
#       <date> <int> <int> <int>
# 1 2012-09-01  5144  8186  6683
# 2 2012-10-01  5134  8090  6640
# 3 2012-11-01  4949  7917  6453
# 4 2012-12-01  5040  8203  6539
# 5 2013-01-01  4971  7938  6474
# 6 2013-02-01  5050  7924  6541
# 7 2013-03-01  5018  8022  6579
# 8 2013-04-01  4945  7987  6476
# 9 2013-05-01  5134  8114  6590
#10 2013-06-01  4984  8011  6476
# ... with 40 more rows

identical(
    df %>% 
        group_by(month) %>% 
        summarise(this = sum(a), that = sum(a, b), other = sum(a, c, d)),

    ungroup(combine_iterations(df, mylist))
)
# [1] TRUE
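
To see what happens inside do() for a single group, here is a minimal sketch on a toy data frame. The names g, cols_list, and res are made up for illustration and are not part of the original answer. Indexing a data frame with a character vector keeps only those columns, and sum() on the resulting data frame adds up every value in it, so lapply returns one total per list element, named after that element.

```r
# Toy stand-in for the data of one group (hypothetical values)
g <- data.frame(a = 1:2, b = 3:4, c = 5:6)

# Each element names the columns that feed one summary column
cols_list <- list(this = "a", that = c("a", "b"))

# One sum per list element; the list names become the column names
res <- as.data.frame(lapply(cols_list, function(cols) sum(g[cols])))
res
#   this that
# 1    3   10
```

Because the result is a one-row data frame, do() can stack one such row per group.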

Or another option, using purrr::map_df inside do to create the data frame:

combine_iterations <- function(x, iter_list){
    x %>%
      group_by(month) %>%
      do({
          g = .
          purrr::map_df(iter_list, ~ sum(g[.x]))
      })
}
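
If you would rather stay with summarize (do() is marked as superseded in dplyr 1.0.0 and later), one possible alternative, not part of the original answer, is to build one quoted sum(...) call per list element and splice the whole named list into summarize with !!!. The sketch below assumes dplyr, purrr, and rlang are installed; df_small and cols_list are made-up illustration data, not the data from the question.

```r
library(dplyr)
library(purrr)
library(rlang)

combine_iterations <- function(x, iter_list) {
  # Build one unevaluated call such as sum(a, b, na.rm = TRUE) per element.
  # map() keeps the list names, which become the output column names.
  sum_exprs <- map(iter_list, ~ expr(sum(!!!syms(.x), na.rm = TRUE)))

  x %>%
    group_by(month) %>%
    summarize(!!!sum_exprs)
}

# Hypothetical small data set with two rows per month
df_small <- data.frame(
  month = as.Date(c("2012-09-01", "2012-09-01", "2012-10-01", "2012-10-01")),
  a = c(1L, 2L, 3L, 4L),
  b = c(10L, 20L, 30L, 40L),
  c = c(100L, 200L, 300L, 400L),
  d = c(1000L, 2000L, 3000L, 4000L)
)
cols_list <- list(this = "a", that = c("a", "b"), other = c("a", "c", "d"))

out <- combine_iterations(df_small, cols_list)
out
```

The expressions are created before summarize runs, so no loop variable has to exist at the moment summarize captures its arguments, which appears to be what tripped up the for loop in the question.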

Solution 1
2 Accepted 2017-09-15 19:48:39