簡體   English   中英

按月 DPLYR 的成員累積計數

[英]Cumulative Count of Members by Month DPLYR

我有一個按注冊月份列出的成員列表,我想做的是創建一個按月列出成員總數的數據框。

原始數據

month.list <- structure(c(18444, 18687, 18475, 18506, 18536, 18567, 18597,18718, 18659, 18628, 18779, 18748), class = "Date") 
total.membership.working <- structure(list(`Mem Account` = c(26137295, 26139796, 26400007,26400455, 26402031, 26402078, 26402239, 1092287142, 1092295228,1092473120), Month = structure(c(18444, 18687, 18444, 18444,18475, 18475, 18444, 18779, 18779, 18779), class = "Date")), row.names = c(NA,-10L), groups = structure(list(`Mem Account` = c(26137295, 26139796,26400007, 26400455, 26402031, 26402078, 26402239, 1092287142,1092295228, 1092473120), .rows = structure(list(1L, 2L, 3L, 4L,5L, 6L, 7L, 8L, 9L, 10L), ptype = integer(0), class = c("vctrs_list_of","vctrs_vctr", "list"))), row.names = c(NA, -10L), class = c("tbl_df","tbl", "data.frame"), .drop = TRUE), class = c("grouped_df","tbl_df", "tbl", "data.frame")) 

我已經寫了一個 for 循環來完成這個,但我希望找到一種沒有循環的整潔方法。

For循環

total.membership <- data.frame()
for(i in 1:length(month.list)) {
  
  foo <- total.membership.working %>%
    ungroup() %>%
    filter(Month <= month.list[i]) %>%
    summarise(Month = max(Month),
      total_membership = n_distinct(`Mem Account`))
  total.membership <- total.membership %>%
    bind_rows(foo)
}

期望輸出

total.membership <- structure(list(Month = structure(c(18444, 18687, 18475, 18506,18536, 18567, 18597, 18628, 18779, 18748), class = "Date"), total_membership = c(45886L,58128L, 47878L, 49214L, 51119L, 53390L, 55200L, 56299L, 60503L,59583L)), row.names = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 10L, 11L,12L), class = "data.frame") 

> total.membership
        Month total_membership
1  2020-07-01            45886
2  2021-03-01            58128
3  2020-08-01            47878
4  2020-09-01            49214
5  2020-10-01            51119
6  2020-11-01            53390
7  2020-12-01            55200
8  2021-04-01            58902
9  2021-02-01            57238
10 2021-01-01            56299
11 2021-06-01            60503
12 2021-05-01            59583

試試這個代碼來計算每個月的累積唯一賬戶。

library(dplyr)

total.membership.working %>%
  ungroup %>%
  arrange(Month) %>%
  mutate(cum_n = cumsum(!duplicated(`Mem Account`))) %>%
  group_by(Month) %>% 
  summarise(cum_unique_entries = max(cum_n))

#  Month      cum_unique_entries
#  <date>                  <int>
#1 2020-07-01                  4
#2 2020-08-01                  6
#3 2021-03-01                  7
#4 2021-06-01                 10

暫無
暫無

聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.

 
粵ICP備18138465號  © 2020-2024 STACKOOM.COM