简体   繁体   中英

Date comparison with System Date in R

I want to compare the data from one column which is the end date(end_date) with the system date(todays_date). Both columns are in the char format.

Input:
$ name: chr  "Abby" "Abby" "Abby" "Abby" ...
$ std: int  2 3 4 5 6 7 8 9 10 11 ...
$ end_date: chr  "25-02-2016" "25-02-2016" "25-03-2018" "25-02-2019" ...
$ todays_date: chr  "07-03-2018" "07-03-2018" "07-03-2018" "07-03-2018" ...

Is there any way I can pass a sqldf statement where I can get all the values of the input csv where end_date < todays_date? Any way other than a sqldf statement where I can extract the values of the csv where end_date< todays_date will do.

I tried a few possible variations the below query but I can't seem to get the required output:

sel_a <- sqldf(paste("SELECT * FROM input_a WHERE end_date<", 
todays_date, "", sep = ""))
sel_a

PS: I have a huge amount of data and have reduced it to fit this question.

Any help would be appreciated.

To get a more specific answer, make a reproducible example

Convert the date column from character to date-time objects, eg, with

library(lubridate)
your_df$end_date <- mdy(your_df$end_date)

Then, you don't even need a column for todays date, just use it as a filter condition

library(dplyr)
filter(your_df, end_date < Sys.Date())
# will return a data frame with those rows that have a date before today.

Or if you prefer:

your_df[your_df$end_date < Sys.Date(),]
# produces the same rows

Using the raw input shown in the Note at the end first convert the dates to "Date" class and then use any of the alternatives shown. The first two use end_date in the input and the last two use Sys.Date() . We show both sqldf and base solutions.

library(sqldf)
fmt <- "%d-%m-%Y"
Input <- transform(Input_raw, end_date = as.Date(end_date, fmt),
                              todays_date = as.Date(todays_date, fmt))

# 1
sqldf("select * from Input where end_date <= todays_date")

# 2
subset(Input, end_date <= todays_date)

# 3
fn$sqldf("select * from Input where end_date <= `Sys.Date()`")

# 4
subset(Input, end_date <= Sys.Date())

Note

The Input in reproducible form:

Input_raw <- data.frame(name = "Abby", std = 2:5, 
  end_date = c("25-02-2016", "25-02-2016", "25-03-2018", "25-02-2019"),
  todays_date = "07-03-2018", stringsAsFactors = FALSE)

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM