I have the following two data frames:
Date <- seq(as.Date("2013/1/1"), by = "day", length.out = 46)
x <-data.frame(Date)
x$discharge <- c("1000","1100","1200","1300","1400","1200","1300","1300","1200","1100","1200","1200","1100","1400","1200","1100","1400","1000","1100","1200","1300","1400","1200","1300","1300","1200","1100","1200","1200","1100","1400","1200","1100","1400","1000","1100","1200","1300","1400","1200","1300","1300","1200","1100","1200","1200")
x$discharge <- as.numeric(x$discharge)
And
Date_from <- c("2013-01-01","2013-01-15","2013-01-21","2013-02-10")
Date_to <- c("2013-01-07","2013-01-20","2013-01-25","2013-02-15")
y <- data.frame(Date_from,Date_to)
y$concentration <- c("1.5","2.5","1.5","3.5")
y$Date_from <- as.Date(y$Date_from)
y$Date_to <- as.Date(y$Date_to)
y$concentration <- as.numeric(y$concentration)
I am trying to calculate the average discharge from the daily discharges in data frame x for each row in data frame y, based on the date range Date_from to Date_to in data frame y. Note that there are gaps in the measurements in data frame y, from 2013-01-08 to 2013-01-14 and from 2013-01-26 to 2013-02-09, because no measurements were taken during these periods. These gaps are causing me headaches, as I was using the following code to calculate the average discharge for each date range in y:
rng <- cut(x$Date, breaks = c(y$Date_from, max(y$Date_to)),
           include.lowest = TRUE)
range <- cbind(x, rng)
discharge <- aggregate(cbind(mean = x$discharge) ~ rng, FUN = mean)
However, if you check data frame range, the range for 2013-01-01 to 2013-01-07 is extended up to 2013-01-14, but I only need it up to 2013-01-07, and then a break until the next range begins on 2013-01-15.
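This follows from how cut() bins values: the breaks are treated as the edges of contiguous intervals, so any date between two consecutive breaks lands in the same bin, whether or not a measurement was taken that day. A minimal sketch with a few illustrative dates (the names d, brk and bins are made up for this example):

```r
# Illustrative dates: two inside the first range, one in the gap,
# and one at the start of the second range
d <- as.Date(c("2013-01-01", "2013-01-07", "2013-01-10", "2013-01-15"))

# Breaks built the same way as above: the start dates plus the last end date
brk <- as.Date(c("2013-01-01", "2013-01-15", "2013-01-20"))

bins <- cut(d, breaks = brk, include.lowest = TRUE)

# 2013-01-10 falls in the measurement gap, yet it shares a bin with 2013-01-07
as.integer(bins)
```

Because the breaks carry only the start dates, cut() never sees the Date_to values of the earlier ranges.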
You can try a tidyverse approach.
library(tidyverse)
y %>%
  split(seq_len(nrow(.))) %>%
  map(~filter(x, between(Date, .$Date_from, .$Date_to)) %>%
        summarise(Mean = mean(discharge))) %>%
  bind_rows() %>%
  bind_cols(y, .)
   Date_from    Date_to concentration     Mean
1 2013-01-01 2013-01-07           1.5 1214.286
2 2013-01-15 2013-01-20           2.5 1166.667
3 2013-01-21 2013-01-25           1.5 1300.000
4 2013-02-10 2013-02-15           3.5 1216.667
Running only the first part of the pipeline shows the values that fall into each group:
y %>%
  split(seq_len(nrow(.))) %>%
  map(~filter(x, between(Date, .$Date_from, .$Date_to)))
Here's a base R answer:
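# x and y share no column names, so merge() returns every x/y row combination
helper <- merge(x, y)
# Keep only the days that fall inside each date range
helper <- helper[helper$Date >= helper$Date_from & helper$Date <= helper$Date_to, ]
aggregate(helper$discharge,
          list(Date_from = helper$Date_from,
               Date_to = helper$Date_to),
          FUN = mean)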
   Date_from    Date_to        x
1 2013-01-01 2013-01-07 1214.286
2 2013-01-15 2013-01-20 1166.667
3 2013-01-21 2013-01-25 1300.000
4 2013-02-10 2013-02-15 1216.667
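Note that merge() without common columns builds nrow(x) * nrow(y) rows before filtering, which may get large for long series. A possible alternative, sketched here with the data from the question rebuilt so the snippet runs on its own, loops over the rows of y instead (the column name mean_discharge is chosen for illustration):

```r
# Rebuild the question's data
x <- data.frame(Date = seq(as.Date("2013-01-01"), by = "day", length.out = 46))
x$discharge <- c(1000, 1100, 1200, 1300, 1400, 1200, 1300, 1300, 1200, 1100,
                 1200, 1200, 1100, 1400, 1200, 1100, 1400, 1000, 1100, 1200,
                 1300, 1400, 1200, 1300, 1300, 1200, 1100, 1200, 1200, 1100,
                 1400, 1200, 1100, 1400, 1000, 1100, 1200, 1300, 1400, 1200,
                 1300, 1300, 1200, 1100, 1200, 1200)
y <- data.frame(
  Date_from = as.Date(c("2013-01-01", "2013-01-15", "2013-01-21", "2013-02-10")),
  Date_to = as.Date(c("2013-01-07", "2013-01-20", "2013-01-25", "2013-02-15")),
  concentration = c(1.5, 2.5, 1.5, 3.5)
)

# For each row of y, average the discharges dated within [Date_from, Date_to]
y$mean_discharge <- sapply(seq_len(nrow(y)), function(i) {
  in_range <- x$Date >= y$Date_from[i] & x$Date <= y$Date_to[i]
  mean(x$discharge[in_range])
})

y
```

This keeps the result in y itself, next to concentration, and never materialises the full cross join.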