Count the Number of Changes in a Date Column based on another Attribute
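Hi everyone, I've run into a problem during data validation. For each unique value in the name column, I need to get the number of changes in the date column. For example: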
student.data <- data.frame(
  student_id = 1:7,
  student_name = c("Rick", "Rick", "Michelle", "Michelle", "Rick", "Michelle", "John"),
  mark = c(623.3, 515.2, 611.0, 729.0, 843.25, 459.4, 846.65),
  date_of_exam = as.Date(c("2014-01-01", "2013-09-23", "2014-11-15", "2014-05-11",
                           "2014-01-01", "2016-04-14", "2015-05-12"))
)
I know it's a bit complicated, but the result must be:
>table
>"Rick"
1
>"Michelle"
2
>"John"
0
Thanks in advance for your help.
You can group by student, count the number of distinct dates, and subtract one:
library(dplyr)
student.data %>%
group_by(student_name) %>%
summarise(cnt = n_distinct(date_of_exam) -1)
# A tibble: 3 x 2
student_name cnt
<fct> <dbl>
1 John 0
2 Michelle 2
3 Rick 1
The data.table way:
library(data.table)
setDT(student.data)
student.data[, .(change = uniqueN(date_of_exam) - 1), student_name]
# student_name change
#1: Rick 1
#2: Michelle 2
#3: John 0
Or in base R:
aggregate(date_of_exam~student_name,student.data, function(x) length(unique(x)) - 1)
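If a plain named vector is more convenient than a data frame, the same "distinct dates minus one" idea can be sketched with split() and vapply(). This is only an illustration: the helper name count_changes is made up here, and it assumes, as the expected output in the question suggests, that a "change" means an additional distinct date rather than a switch between consecutive rows.

```r
# Recreate the example data from the question.
student.data <- data.frame(
  student_id = 1:7,
  student_name = c("Rick", "Rick", "Michelle", "Michelle", "Rick", "Michelle", "John"),
  mark = c(623.3, 515.2, 611.0, 729.0, 843.25, 459.4, 846.65),
  date_of_exam = as.Date(c("2014-01-01", "2013-09-23", "2014-11-15", "2014-05-11",
                           "2014-01-01", "2016-04-14", "2015-05-12"))
)

# Hypothetical helper: number of distinct values in x, minus one.
count_changes <- function(x) length(unique(x)) - 1

# Split the dates by student, then apply the helper to each group.
# The result is a named numeric vector, with names sorted alphabetically.
changes <- vapply(
  split(student.data$date_of_exam, student.data$student_name),
  count_changes,
  numeric(1)
)

print(changes)
#     John Michelle     Rick
#        0        2        1
```

Rick has three rows but only two distinct dates, so he gets 1, which matches the result asked for in the question.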