Sum by groups in two columns in R

I have the following data frame:

DAY           BRAND     SOLD
2018/04/10     KIA       10
2018/04/15     KIA        5
2018/05/01     KIA        7
2018/05/06     KIA        3
2018/04/04     BMW        2
2018/05/25     BMW        8
2018/06/19     BMW        5
2018/06/14     BMW        1

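I want to sum the units sold by month and repeat that total on every row whose date falls in that month. The one condition is that sales of different brands must not be added together within the same month. The expected result looks like this: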

DAY           BRAND     SOLD   TOTAL
2018/04/10     KIA       10      15
2018/04/15     KIA        5      15
2018/05/01     KIA        7      10
2018/05/06     KIA        3      10
2018/04/04     BMW        2       2
2018/05/25     BMW        8       8
2018/06/19     BMW        5       6
2018/06/14     BMW        1       6

How can I do this?

We can use ave after extracting the month from the "DAY" column, passing that month together with "BRAND" as the grouping variables

df1$TOTAL <- with(df1, ave(SOLD, BRAND, 
        format(as.Date(DAY, "%Y/%m/%d"), "%m"), FUN = sum))
df1$TOTAL
#[1] 15 15 10 10  2  8  6  6
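
One caveat with grouping on "%m" alone: rows from the same calendar month in different years would fall into the same group. The sample data only covers 2018, so this does not matter here, but a minimal sketch of a variant that groups on year and month together could look like the following. The name year_month is only illustrative and is not part of the original answer.

```r
# Sample data from the question, rebuilt so the snippet runs on its own
df1 <- data.frame(
  DAY   = c("2018/04/10", "2018/04/15", "2018/05/01", "2018/05/06",
            "2018/04/04", "2018/05/25", "2018/06/19", "2018/06/14"),
  BRAND = c("KIA", "KIA", "KIA", "KIA", "BMW", "BMW", "BMW", "BMW"),
  SOLD  = c(10L, 5L, 7L, 3L, 2L, 8L, 5L, 1L),
  stringsAsFactors = FALSE
)

# Build a "YYYY-MM" key so that e.g. April 2018 and April 2019 stay separate
# (year_month is a hypothetical helper variable)
year_month <- format(as.Date(df1$DAY, format = "%Y/%m/%d"), "%Y-%m")

# ave() returns a vector as long as SOLD, holding each row's group sum
df1$TOTAL <- ave(df1$SOLD, df1$BRAND, year_month, FUN = sum)
df1$TOTAL
#[1] 15 15 10 10  2  8  6  6
```

For the single-year sample data this gives the same totals as the "%m" version above.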

Or with dplyr/lubridate

library(dplyr)
library(lubridate)
df1 %>%
   group_by(BRAND, MONTH = month(ymd(DAY))) %>%
   mutate(TOTAL = sum(SOLD))
# A tibble: 8 x 5
# Groups:   BRAND, MONTH [5]
#  DAY        BRAND  SOLD MONTH TOTAL
#  <chr>      <chr> <int> <dbl> <int>
#1 2018/04/10 KIA      10     4    15
#2 2018/04/15 KIA       5     4    15
#3 2018/05/01 KIA       7     5    10
#4 2018/05/06 KIA       3     5    10
#5 2018/04/04 BMW       2     4     2
#6 2018/05/25 BMW       8     5     8
#7 2018/06/19 BMW       5     6     6
#8 2018/06/14 BMW       1     6     6

If needed, drop the "MONTH" column afterwards by calling ungroup and then select(-MONTH)

data
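
As a rough sketch of that last step, the full pipeline with the grouping removed and the helper column dropped might look like this. It assumes dplyr and lubridate are installed; the name result is only illustrative.

```r
library(dplyr)
library(lubridate)

# Sample data from the question, rebuilt so the snippet runs on its own
df1 <- data.frame(
  DAY   = c("2018/04/10", "2018/04/15", "2018/05/01", "2018/05/06",
            "2018/04/04", "2018/05/25", "2018/06/19", "2018/06/14"),
  BRAND = c("KIA", "KIA", "KIA", "KIA", "BMW", "BMW", "BMW", "BMW"),
  SOLD  = c(10L, 5L, 7L, 3L, 2L, 8L, 5L, 1L),
  stringsAsFactors = FALSE
)

result <- df1 %>%
  group_by(BRAND, MONTH = month(ymd(DAY))) %>%  # group by brand and month
  mutate(TOTAL = sum(SOLD)) %>%                 # repeat the group sum on each row
  ungroup() %>%                                 # drop grouping so MONTH can be removed
  select(-MONTH)                                # keep only DAY, BRAND, SOLD, TOTAL

result
```

The ungroup step comes first because select will not drop a column that is still being used as a grouping variable.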

df1 <- structure(list(DAY = c("2018/04/10", "2018/04/15", "2018/05/01", 
"2018/05/06", "2018/04/04", "2018/05/25", "2018/06/19", "2018/06/14"
), BRAND = c("KIA", "KIA", "KIA", "KIA", "BMW", "BMW", "BMW", 
"BMW"), SOLD = c(10L, 5L, 7L, 3L, 2L, 8L, 5L, 1L)), 
class = "data.frame", row.names = c(NA, 
-8L))
