How to calculate a mean on a data.table in R based on several conditions
I have data like the following:
library(lubridate)
library(dplyr)
library(data.table)
# 90 days x 3 countries x 2 transport types = 540 rows
MWE <- data.table(
  Date = rep(seq(ymd("2020-1-1"), ymd("2020-3-30"), by = "days"), each = 6),
  Country = rep(c("France", "United States", "Germany"), times = 90 * 2),
  TransportType = rep(c("Train", "Cars"), each = 3, times = 90),
  Value = rnorm(90 * 6, 2, 3)
)
I want to create a new variable that is the mean of Value, based on several conditions.
The mean should be calculated on January and February only, but it should be present in the table for the whole period.
I have managed to do the first two (or I think so, I am still checking):
MWE_2 <- MWE %>%
.[,JourSem:=weekdays(Date)] %>%
.[,Moyenne:=mean(Value),by=.(Country,JourSem,TransportType)]
But I am unsure how to add another condition to that. I think I am close with this:
MWE_3 <- MWE %>%
.[,JourSem:=weekdays(Date)] %>%
.[Date <= "2020-02-29",Moyenne:=mean(Value),by=.(Country,JourSem,TransportType)]
But then I lack the value for the March dates. That is logical, since they are filtered out, but it is not what I want.
We can first calculate the mean over January and February for each weekday, and then join this result to the March data.
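library(data.table)
# Add the weekday name of each date (modifies MWE by reference)
MWE[, JourSem := weekdays(Date)]
# Mean of Value per weekday, computed on January and February only
d1 <- MWE[Date <= as.Date("2020-02-29"), .(Moyenne = mean(Value)), by = JourSem]
# Attach these means to the March rows by joining on the weekday
MWE[Date > as.Date("2020-02-29")][d1, on = "JourSem"]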
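The join above groups by weekday only and returns just the March rows. If the goal is the one described in the question (a mean per Country, weekday and TransportType, computed on January and February but filled in for every row), a minimal sketch could look like the following. It assumes a rebuilt 540-row version of MWE with one row per date, country and transport type; the names ref and Moyenne2 and the fixed seed are mine, for illustration only.

```r
library(data.table)

set.seed(1)  # assumption: fixed seed, only so the sketch is reproducible
dates <- seq(as.Date("2020-01-01"), as.Date("2020-03-30"), by = "day")

# Assumed layout: one row per date x country x transport type (540 rows)
MWE <- data.table(
  Date = rep(dates, each = 6),
  Country = rep(c("France", "United States", "Germany"), times = 90 * 2),
  TransportType = rep(rep(c("Train", "Cars"), each = 3), times = 90),
  Value = rnorm(90 * 6, 2, 3)
)
MWE[, JourSem := weekdays(Date)]

# Option 1: compute the January-February mean per group, then update-join
# it onto the full table so that the March rows receive it as well
ref <- MWE[Date <= as.Date("2020-02-29"),
           .(Moyenne = mean(Value)),
           by = .(Country, JourSem, TransportType)]
MWE[ref, Moyenne := i.Moyenne, on = .(Country, JourSem, TransportType)]

# Option 2: same result without a join, by subsetting Value inside each group
MWE[, Moyenne2 := mean(Value[Date <= as.Date("2020-02-29")]),
    by = .(Country, JourSem, TransportType)]
```

Both options modify MWE by reference and keep all of its rows. The update join leaves the January-February means in a separate table (ref), which can be handy for inspection, while the grouped assignment avoids the intermediate table.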