
dplyr grouping and using a conditional from multiple columns

I have a data frame like this:

  transactionId user_id total_in_pennies created_at               X  yearmonth
1        345068       8             9900 2018-09-13    New Customer 2018-09-01
2        346189       8             9900 2018-09-20 Repeat Customer 2018-09-01
3        363500       8             7700 2018-10-11 Repeat Customer 2018-10-01
4        376089       8             7700 2018-10-25 Repeat Customer 2018-10-01
5        198450      11                0 2018-01-18    New Customer 2018-01-01
6        203966      11                0 2018-01-25 Repeat Customer 2018-01-01

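It has many more rows, but this small snippet is enough to work with.

I am trying to group with dplyr so that I end up with a final data frame like this:

[image: desired output table]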

I am using this code:

df_RFM11 <- data2 %>% group_by(yearmonth) %>% 
  summarise(New_Customers=sum(X=="New Customer"), Repeat_Customers=sum(X=="Repeat Customer"), New_Customers_sales=sum(total_in_pennies & X=="New Customers"), Repeat_Customers_sales=sum(total_in_pennies & X=="Repeat Customers"))

and I get this result:

> head(df_RFM11)
# A tibble: 6 x 5
  yearmonth  New_Customers Repeat_Customers New_Customers_sales Repeat_Customers_sales
  <date>             <int>            <int>               <int>                  <int>
1 2018-01-01          4880             2428                   0                      0
2 2018-02-01          2027            12068                   0                      0
3 2018-03-01          1902            15296                   0                      0
4 2018-04-01          1921            13363                   0                      0
5 2018-05-01          2631            18336                   0                      0
6 2018-06-01          2339            14492                   0                      0

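I am able to get the first two columns I need, the counts of new and repeat customers, but when I try to get the sum of total_in_pennies for new and for repeat customers, I get 0.

Any help on what I am doing wrong?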

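You need to subset total_in_pennies with square brackets instead of combining it with &. In sum(total_in_pennies & X=="..."), the & turns the amounts into TRUE/FALSE, so at best you would count rows rather than add up pennies. On top of that, your sales columns compare against "New Customers" and "Repeat Customers", while the data says "New Customer" and "Repeat Customer", so nothing matches and you get 0. Like this: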

df_RFM11 <- data2 %>% 
  group_by(yearmonth) %>% 
  summarise(New_Customers=sum(X=="New Customer"),
            Repeat_Customers=sum(X=="Repeat Customer"),
            New_Customers_sales=sum(total_in_pennies[X=="New Customer"]),
            Repeat_Customers_sales=sum(total_in_pennies[X=="Repeat Customer"])
            )
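To see the difference on a small scale, here is a minimal sketch that rebuilds the six sample rows from the question and runs both forms side by side. The column name counted_not_summed is made up for illustration, and data2 here holds only those six rows, not the full data set.

```r
library(dplyr)

# The six sample rows from the question (only the columns used below).
data2 <- data.frame(
  user_id          = c(8, 8, 8, 8, 11, 11),
  total_in_pennies = c(9900, 9900, 7700, 7700, 0, 0),
  X                = c("New Customer", "Repeat Customer", "Repeat Customer",
                       "Repeat Customer", "New Customer", "Repeat Customer"),
  yearmonth        = as.Date(c("2018-09-01", "2018-09-01", "2018-10-01",
                               "2018-10-01", "2018-01-01", "2018-01-01"))
)

result <- data2 %>%
  group_by(yearmonth) %>%
  summarise(
    New_Customers          = sum(X == "New Customer"),
    Repeat_Customers       = sum(X == "Repeat Customer"),
    # `&` coerces the amounts to TRUE/FALSE, so this counts the new-customer
    # rows with a non-zero amount instead of adding up pennies.
    # (hypothetical column, shown only for comparison)
    counted_not_summed     = sum(total_in_pennies & X == "New Customer"),
    # Square brackets keep only the matching amounts, which are then summed.
    New_Customers_sales    = sum(total_in_pennies[X == "New Customer"]),
    Repeat_Customers_sales = sum(total_in_pennies[X == "Repeat Customer"])
  )

print(result)
```

For 2018-09-01 the & version gives 1 (one matching row), while the bracket version gives 9900. A month with no matching rows, such as new customers in 2018-10-01, sums an empty vector and gives 0.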
