
How to add additional columns using tidyr group_by function in R?

This question is a follow-up to the answer on my earlier post.

Data

df1 <- structure(list(Date = c("6/24/2020", "6/24/2020", "6/24/2020", 
"6/24/2020", "6/25/2020", "6/25/2020"), Market = c("A", "A", 
"A", "A", "A", "A"), Salesman = c("MF", "RP", "RP", "FR", "MF", 
"MF"), Product = c("Apple", "Apple", "Banana", "Orange", "Apple", 
"Banana"), Quantity = c(20L, 15L, 20L, 20L, 10L, 15L), Price = c(1L, 
1L, 2L, 3L, 1L, 1L), Cost = c(0.5, 0.5, 0.5, 0.5, 0.6, 0.6)), 
class = "data.frame", row.names = c("1", 
"2", "3", "4", "5", "6"))

Solution

library(dplyr) # 1.0.0
library(tidyr)
df1 %>%
    group_by(Date, Market) %>% 
    group_by(Revenue = c(Quantity %*% Price), 
             TotalCost = c(Quantity %*% Cost),
             Product, .add = TRUE) %>% 
    summarise(Sold = sum(Quantity)) %>% 
    pivot_wider(names_from = Product, values_from = Sold)
# A tibble: 2 x 7
# Groups:   Date, Market, Revenue, TotalCost [2]
#  Date      Market Revenue TotalCost Apple Banana Orange
#  <chr>     <chr>    <dbl>     <dbl> <int>  <int>  <int>
#1 6/24/2020 A          135      37.5    35     20     20
#2 6/25/2020 A           25      15      10     15     NA
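
The `c(Quantity %*% Price)` expression in the grouping step can look unusual, so here is a minimal base R sketch of what it appears to compute, using only the four 6/24/2020 rows from `df1`. The matrix product of two equal-length vectors is their inner product, which is the same number as `sum(Quantity * Price)`; `c()` then drops the 1 x 1 matrix down to a plain value. The variable names below are illustrative and not part of the original answer.

```r
# Quantity, Price and Cost for the four 6/24/2020 rows of df1
quantity <- c(20L, 15L, 20L, 20L)
price    <- c(1L, 1L, 2L, 3L)
cost     <- c(0.5, 0.5, 0.5, 0.5)

# %*% on two equal-length vectors returns a 1 x 1 matrix (the inner product)
revenue_matrix <- quantity %*% price
dim(revenue_matrix)

# c() strips the matrix dimensions, leaving a length-one vector
revenue    <- c(revenue_matrix)
total_cost <- c(quantity %*% cost)

# The same totals written as an explicit sum of element-wise products
revenue_alt    <- sum(quantity * price)
total_cost_alt <- sum(quantity * cost)
```

Because the expression is evaluated inside `group_by(Date, Market)`, it yields one value per Date/Market group, and that value is repeated on every row of the group; this is why Revenue and TotalCost can then serve as grouping columns without splitting the groups any further.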

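@akrun's solution works well. Now I would like to know how to add three more columns to the existing result, showing the quantity sold by each salesman, so that the final output looks like this: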

Date        Market  Revenue  Total Cost  Apples Sold  Bananas Sold  Oranges Sold  MF  RP  FR
6/24/2020   A       135      37.5        35           20            20            20  35  20
6/25/2020   A       25       15          10           15            NA            25  NA  NA

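One option is to do the grouping operations separately, since they are computed over different columns, and then join the results on the common columns, i.e. 'Date' and 'Market'.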

library(dplyr)
library(tidyr)

# Quantity sold per product, plus Revenue and TotalCost per Date/Market
out1 <- df1 %>%
    group_by(Date, Market) %>%
    group_by(Revenue = c(Quantity %*% Price),
             TotalCost = c(Quantity %*% Cost),
             Product, .add = TRUE) %>%
    summarise(Sold = sum(Quantity)) %>%
    pivot_wider(names_from = Product, values_from = Sold)

# Quantity sold per salesman
out2 <- df1 %>%
    group_by(Date, Market, Salesman) %>%
    summarise(SalesSold = sum(Quantity)) %>%
    pivot_wider(names_from = Salesman, values_from = SalesSold)

# Join on the shared key columns
left_join(out1, out2, by = c("Date", "Market"))
# A tibble: 2 x 10
# Groups:   Date, Market, Revenue, TotalCost [2]
#  Date      Market Revenue TotalCost Apple Banana Orange    FR    MF    RP
#  <chr>     <chr>    <dbl>     <dbl> <int>  <int>  <int> <int> <int> <int>
#1 6/24/2020 A          135      37.5    35     20     20    20    20    35
#2 6/25/2020 A           25      15      10     15     NA    NA    25    NA
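
The joined result lists the salesman columns as FR, MF, RP, while the question asks for MF, RP, FR. As a hedged alternative sketch (not the method from the answer above), the same table can be built from three plain summaries that are joined and then reordered with `select()`. It assumes dplyr >= 1.0.0 for the `.groups` argument; the names `keys`, `totals`, `by_product`, `by_salesman` and `result` are illustrative.

```r
library(dplyr)   # assumes dplyr >= 1.0.0 (for the .groups argument)
library(tidyr)   # assumes tidyr >= 1.0.0 (for pivot_wider)

# Same data as df1 in the question, rebuilt so the sketch is self-contained
df1 <- data.frame(
  Date     = c("6/24/2020", "6/24/2020", "6/24/2020",
               "6/24/2020", "6/25/2020", "6/25/2020"),
  Market   = rep("A", 6),
  Salesman = c("MF", "RP", "RP", "FR", "MF", "MF"),
  Product  = c("Apple", "Apple", "Banana", "Orange", "Apple", "Banana"),
  Quantity = c(20L, 15L, 20L, 20L, 10L, 15L),
  Price    = c(1L, 1L, 2L, 3L, 1L, 1L),
  Cost     = c(0.5, 0.5, 0.5, 0.5, 0.6, 0.6),
  stringsAsFactors = FALSE
)

keys <- c("Date", "Market")

# Revenue and total cost per Date/Market
totals <- df1 %>%
  group_by(Date, Market) %>%
  summarise(Revenue   = sum(Quantity * Price),
            TotalCost = sum(Quantity * Cost),
            .groups = "drop")

# Quantity sold per product, one column per product
by_product <- df1 %>%
  group_by(Date, Market, Product) %>%
  summarise(Sold = sum(Quantity), .groups = "drop") %>%
  pivot_wider(names_from = Product, values_from = Sold)

# Quantity sold per salesman, one column per salesman
by_salesman <- df1 %>%
  group_by(Date, Market, Salesman) %>%
  summarise(Sold = sum(Quantity), .groups = "drop") %>%
  pivot_wider(names_from = Salesman, values_from = Sold)

# Join the three pieces and put the columns in the requested order
result <- totals %>%
  left_join(by_product, by = keys) %>%
  left_join(by_salesman, by = keys) %>%
  select(Date, Market, Revenue, TotalCost,
         Apple, Banana, Orange, MF, RP, FR)
```

Using `.groups = "drop"` returns ungrouped tibbles, so `result` carries no leftover grouping; combinations with no sales, such as RP on 6/25/2020, come out as `NA`.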
