R: Creation of a new column in a dataset
I have a dataset of transactions with the variables listed below. You can download it here: https://yadi.sk/d/BIXivmVJ34Akbn
The file is slightly different, though: instead of id it has customer id.
id, mcc_code (code of transaction), tr_datetime, tr_type (type of transaction), amount, term_id (terminal id), gender.
I would like to create a new column, trans_count, which is the number of transactions per day per person (id). How can I do that?
Thanks a lot.
I separated the date and time here.
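library(readr)  # read_csv()
library(tidyr)  # separate()
trans_train <- read_csv("~/shared/minor3_2017/3-SecondYear-ML/hw_data/transactions_train.csv")
# Split tr_datetime ("day time") into two columns, then store the day as an integer
trans_train <- separate(trans_train, col = tr_datetime, into = c("day", "time"), sep = " ")
trans_train$day <- as.integer(trans_train$day)
dput(head(trans_train))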
OUTPUT
structure(list(day = c(0L, 0L, 0L, 0L, 0L, 0L), time = c("03:16:05",
"11:36:09", "11:37:11", "12:20:45", "12:36:57", "13:53:33"),
mcc_code = c(6011L, 5499L, 5411L, 5912L, 5499L, 4814L), tr_type = c(2010L,
1010L, 1010L, 1010L, 1010L, 1030L), amount = c(-950, -13.5,
-271.43, -134, -544, -100), term_id = c(NA_character_, NA_character_,
NA_character_, NA_character_, NA_character_, NA_character_
), id = c(1726L, 1726L, 1726L, 1726L, 1726L, 1726L)), .Names = c("day",
"time", "mcc_code", "tr_type", "amount", "term_id", "id"), row.names = c(NA,
-6L), class = c("tbl_df", "tbl", "data.frame"))
I don't know of a clean way to add the column in the way you describe. However, if you want to create a new summary table, you could use this:
library(dplyr)
# One row per (day, id) pair, with the number of transactions in that group
trans_train %>%
  group_by(day, id) %>%
  summarize(transactions_per_day_per_customer = n())
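If the count should sit next to every transaction as a trans_count column rather than in a separate summary table, one possible approach is to keep the same grouping but use mutate() instead of summarize(). The sketch below is only an illustration: the toy data frame and its values are made up to mirror the id and day columns from the question, and the names toy and toy_counted are hypothetical.

```r
library(dplyr)

# Toy data mirroring the question's columns (values invented for illustration)
toy <- data.frame(
  id     = c(1726L, 1726L, 1726L, 1726L, 2001L, 2001L),
  day    = c(0L, 0L, 0L, 1L, 0L, 0L),
  amount = c(-950, -13.5, -271.43, -134, -544, -100)
)

# mutate() keeps every row and repeats the group size on each row of the group
toy_counted <- toy %>%
  group_by(id, day) %>%
  mutate(trans_count = n()) %>%
  ungroup()

print(toy_counted)
```

On the real data the same pipeline would presumably start from trans_train instead of toy. Recent dplyr versions also offer add_count(id, day, name = "trans_count") as a shorthand for the group_by/mutate/ungroup steps.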