Count number of unique observations in the last 6 months (for every observation)
How would you solve the following problem in R / tidyverse?
Sample data:
tibble(
  date = seq(as.Date(paste0("2010-01-", runif(1, 1, 25))), by = "month", length.out = 24),
  machine_ID = sample(letters[1:10], size = 24, replace = TRUE),
  machine_cat = rep(c(1, 2), 12)
)
Objective:
Add a column called last6m, which counts the number of unique machine_IDs observed in the last 6 months, within the associated machine_cat.
A tidyverse solution without explicit looping is preferred (purrr is OK).
I would appreciate it if anyone could take a quick look. Thanks in advance :-)
The following solution is based on the post suggested by MrFlick and r2evans (G. Grothendieck's answer):
library(tidyverse)
library(lubridate)
library(sqldf)

data <- tibble(
  date = seq(as.Date(paste0("2010-01-", runif(1, 1, 25))), by = "month", length.out = 24),
  machine_ID = sample(letters[1:10], size = 24, replace = TRUE),
  machine_cat = rep(c(1, 2), 12)
)

# Self-join within machine_cat. Dates reach SQLite as day counts,
# so "a.date - 180" approximates a six-month look-back window.
sqldf("
  SELECT a.*, COUNT(DISTINCT b.machine_ID) AS last6m
  FROM data a
  LEFT JOIN data b
    ON a.machine_cat = b.machine_cat
    AND (b.date BETWEEN a.date - 180 AND a.date)
  GROUP BY a.rowid
") %>% arrange(machine_cat, date)
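Since the question asked for a tidyverse approach, here is a rough sketch of how the same window count might look with dplyr and purrr instead of SQL. This is not part of the original answer. It assumes the same 180-day approximation of "6 months" used in the query above, and it swaps the random start day for a fixed date and seed (both my own choices) so the result is reproducible. The name result is only for illustration.

library(dplyr)
library(purrr)

set.seed(1)  # assumption: fixed seed, not in the original post

data <- tibble(
  date = seq(as.Date("2010-01-15"), by = "month", length.out = 24),  # assumed fixed start day
  machine_ID = sample(letters[1:10], size = 24, replace = TRUE),
  machine_cat = rep(c(1, 2), 12)
)

result <- data %>%
  group_by(machine_cat) %>%
  mutate(
    # For each row i of the group, count the distinct machine_IDs whose
    # date lies in [date[i] - 180 days, date[i]] within the same machine_cat.
    last6m = map_int(
      seq_along(date),
      function(i) n_distinct(machine_ID[date >= date[i] - 180 & date <= date[i]])
    )
  ) %>%
  ungroup() %>%
  arrange(machine_cat, date)

result

Each row is compared against every other row of its group, so the work grows quadratically with group size. That should be fine for a few thousand rows per machine_cat; for much larger data a non-equi join is probably the better route. If calendar months are wanted rather than 180 days, date[i] %m-% months(6) from lubridate could replace date[i] - 180.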