Subsetting R data frame automatically based on changing date range

I have an R script that I run monthly. I'd like to subset my data frame to show only data within a 6-month period, and each month I'd like that period to move forward by one month.

Original data frame from Sept.:

ID  Name  Date
1   John  1/1/2020
2   Adam  5/2/2020
3   Kate  9/30/2020
4   Jill  10/15/2020

After subsetting for only dates from May 1, 2020 - Sept. 30, 2020:

ID  Name  Date
2   Adam  5/2/2020
3   Kate  9/30/2020

The next month when I run my script, I'd like the dates it's subsetting to move forward by one month, so June 1, 2020 - Oct. 31, 2020:

ID  Name  Date
3   Kate  9/30/2020
4   Jill  10/15/2020

Right now, I'm changing this part of my script manually each month, i.e.:

df <- subset(df, Date >= as.Date('2020-05-01') & Date <= as.Date('2020-09-30'))

Is there a way to make this automatic, so that I don't have to manually move forward the date one month every time?

We can use between() from dplyr after converting the 'Date' column to Date class with mdy() from lubridate:

library(dplyr)
library(lubridate)
start <- as.Date("2020-05-01")
end <- as.Date("2020-09-30")

df1 %>%
    mutate(Date = mdy(Date)) %>%
    filter(between(Date, start, end))
#  ID Name       Date
#1  2 Adam 2020-05-02
#2  3 Kate 2020-09-30

In the next month, we can move 'start' and 'end' forward by one month. The end date is then snapped to the last day of its new month:

start <- start %m+% months(1)
end <-  ceiling_date(end %m+% months(1), 'months') - days(1)

start
#[1] "2020-06-01"
end
#[1] "2020-10-31"
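A short sketch of why the snippet above uses %m+% and ceiling_date() rather than plain addition. The variable names (jan31, shifted, month_end) are illustrative only and are not part of the original answer; the example assumes lubridate is installed.

```r
library(lubridate)

jan31 <- as.Date("2020-01-31")

# Plain period arithmetic lands on the nonexistent 2020-02-31 and returns NA
plain <- jan31 + months(1)
plain
# [1] NA

# %m+% rolls back to the last valid day of the target month instead
rolled <- jan31 %m+% months(1)
rolled
# [1] "2020-02-29"

# A month-end date does not necessarily stay a month end after the shift:
# 2020-09-30 becomes 2020-10-30, so it is snapped to the end of October
shifted <- as.Date("2020-09-30") %m+% months(1)
month_end <- ceiling_date(shifted, "month") - days(1)
month_end
# [1] "2020-10-31"
```

This is why 'end' needs the extra ceiling_date() step, while 'start', which always sits on the first of a month, does not.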

Using base R, with no package dependencies.

Data:

dt <- read.table(text = 'ID  Name  Date
1   John  1/1/2020
2   Adam  3/2/2021
3   Kate  12/30/2020
4   Jill  5/15/2021', header = TRUE, stringsAsFactors = FALSE)

Code:

date_format <- "%m/%d/%Y"
dt$Date <- as.Date(dt$Date, format = date_format)
today <- Sys.Date()

# First day of the current month
start <- as.Date(format(today, "%Y-%m-01"))

# Last day of the month six months ahead: step seven months forward from
# 'start' and subtract one day. This avoids hard-coding day "31", which
# gives NA for months with fewer than 31 days.
end <- seq(start, by = "month", length.out = 8)[8] - 1

dt[with(dt, Date >= start & Date <= end), ]
# Output when run in November 2020 (start = 2020-11-01, end = 2021-05-31):
#   ID Name       Date
# 2  2 Adam 2021-03-02
# 3  3 Kate 2020-12-30
# 4  4 Jill 2021-05-15
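The same idea can be pointed backward to reproduce the ranges in the question (May 1 to Sept. 30 when run in September, June 1 to Oct. 31 when run in October). The following is only a sketch in base R: month_window() and its months_back argument are hypothetical names, and the default of 4 is an assumption chosen to match the question's example ranges.

```r
# Hypothetical helper: the window ends on the last day of the run month and
# starts on the first day of the month `months_back` months earlier.
month_window <- function(run_date = Sys.Date(), months_back = 4) {
  first_of_month <- as.Date(format(run_date, "%Y-%m-01"))
  start <- seq(first_of_month, by = paste0("-", months_back, " months"),
               length.out = 2)[2]
  end <- seq(first_of_month, by = "month", length.out = 2)[2] - 1
  list(start = start, end = end)
}

# Sample data from the question
df <- read.table(text = "ID Name Date
1 John 1/1/2020
2 Adam 5/2/2020
3 Kate 9/30/2020
4 Jill 10/15/2020", header = TRUE, stringsAsFactors = FALSE)
df$Date <- as.Date(df$Date, format = "%m/%d/%Y")

# Fixed run dates are used here so the result is reproducible;
# in the monthly script, call month_window() with no arguments.
w_sept <- month_window(as.Date("2020-09-15"))
sept <- df[df$Date >= w_sept$start & df$Date <= w_sept$end, ]

w_oct <- month_window(as.Date("2020-10-20"))
oct <- df[df$Date >= w_oct$start & df$Date <= w_oct$end, ]
```

With these run dates, sept holds the rows for Adam and Kate and oct holds the rows for Kate and Jill, as in the question. Because the window is derived from the run date, nothing has to be edited between monthly runs.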

This is a very simple solution:

library(lubridate)

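t <- today()                   # automatic: use the current date
# t <- as.Date('2020-11-26')   # manual: uncomment to test with a fixed date

start <- floor_date(t %m-% months(6), unit = "months")       # first day of the month six months back
end   <- floor_date(t %m-% months(1), unit = "months") - 1   # last day of the month two months back
# e.g. for t = 2020-11-26: start is 2020-05-01 and end is 2020-09-30

df[df$Date >= start & df$Date <= end, ]   # assumes df$Date is of class Date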
