R transforming data from columns to rows by variable
I am facing a problem with transforming my data frame. I would like to work out how often each client buys (one purchase per how many days). I thought it would be easiest to transform my transaction data, which is formatted as:
Transatcion_ID Client_ID Date
1 1 2017-01-01
2 1 2017-01-04
3 2 2017-02-21
4 1 2017-05-01
5 3 2017-02-04
6 3 2017-03-01
... ... ...
to:
Client_ID Date_1_purchase Date_2_purchase Date_3_purchase ...
1 2017-01-01 2017-01-04 2017-05-01 ...
2 2017-02-21 NA NA ...
3 2017-02-04 2017-03-01 NA ...
Or:
Client_ID Date_First_purchase Date_Last_purchase Numberof_orders
1 2017-01-01 2017-05-01 3
2 2017-02-21 2017-02-21 1
3 2017-02-04 2017-03-01 2
I have tried using dcast but I couldn't achieve what I wanted. I bet there is a way to do that, or even to calculate what I want without transforming the dataset, but I did not find it.
We can create a sequence id with rowid and then use dcast to reshape from 'long' to 'wide' format:
library(data.table)
dcast(setDT(df1), Client_ID ~ paste0("Date_", rowid(Client_ID),
"_purchase"), value.var = "Date")
# Client_ID Date_1_purchase Date_2_purchase Date_3_purchase
#1: 1 2017-01-01 2017-01-04 2017-05-01
#2: 2 2017-02-21 NA NA
#3: 3 2017-02-04 2017-03-01 NA
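The second layout in the question (first purchase, last purchase, number of orders) can be built without reshaping at all, by aggregating per client. The following is only a minimal sketch, assuming the Date column is of class Date (convert with as.Date() if it is stored as character). The data frame is rebuilt here from the sample rows in the question, and the Avg_days_between column is a hypothetical name introduced for illustration, not something from the original post.

```r
library(data.table)

# Sample data rebuilt from the question (column names spelled as given there)
df1 <- data.frame(
  Transatcion_ID = 1:6,
  Client_ID = c(1, 1, 2, 1, 3, 3),
  Date = as.Date(c("2017-01-01", "2017-01-04", "2017-02-21",
                   "2017-05-01", "2017-02-04", "2017-03-01"))
)

# One row per client: first purchase, last purchase, number of orders
summary_dt <- as.data.table(df1)[, .(
  Date_First_purchase = min(Date),
  Date_Last_purchase  = max(Date),
  Numberof_orders     = .N
), by = Client_ID]

# Average gap in days between consecutive purchases (hypothetical column);
# left as NA for clients with a single order, where no gap exists
summary_dt[, Avg_days_between := ifelse(
  Numberof_orders > 1,
  as.numeric(Date_Last_purchase - Date_First_purchase) / (Numberof_orders - 1),
  NA_real_
)]

print(summary_dt)
```

For the sample rows, client 1 has three orders spanning 120 days, so the average gap comes out at 60 days, while client 2 has a single order and gets NA. Dividing the total span by the number of gaps is one of several reasonable definitions of "how often"; a median of the individual gaps would be an alternative if purchases are very irregular.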