How do I delete rows in a data frame based on the value (date) in one of the columns?

I have a data frame of daily data with 500,000+ rows and 18 columns. The 2nd column contains the date.

For example, it goes from 7/1/2017 to the current date, chronologically.

I pull the data every Monday and input it into R, but I only want data up until the most recent Friday.

I've set a variable equal to the most recent Friday's date (in the exact date format of the data):

library(lubridate)

LastFriday <- gsub("X", "", gsub("X0", "", format(
                                   Sys.Date() - wday(Sys.Date() + 1), "X%m/X%d/%Y")))

which returns 9/15/2017
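
For comparison, here is a minimal base-R sketch of the same calculation that keeps the result as a Date and only formats it at the end. The helper names last_friday and format_mdy are hypothetical, not part of lubridate or base R, and the sketch assumes the same behaviour as the snippet above: it returns the most recent Friday strictly before the given day.

```r
# Hypothetical helper: most recent Friday strictly before `today`.
last_friday <- function(today = Sys.Date()) {
  wd <- as.POSIXlt(today)$wday        # 0 = Sunday, ..., 5 = Friday, 6 = Saturday
  days_back <- ((wd - 6) %% 7) + 1    # 1..7 days back; a Friday maps to the previous Friday
  today - days_back
}

# Hypothetical helper: format a Date as month/day/year without leading zeros.
format_mdy <- function(d) {
  paste(as.integer(format(d, "%m")),
        as.integer(format(d, "%d")),
        format(d, "%Y"),
        sep = "/")
}

friday <- last_friday(as.Date("2017-09-18"))  # 2017-09-18 was a Monday
friday_text <- format_mdy(friday)
print(friday)
print(friday_text)
```

Keeping the Date object around is useful because it can be compared directly with other dates, while the text form is only needed if it has to match the data's display format.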

How do I delete all the rows in the data frame after the last row that contains last Friday's date?

The following should work, though I have not tested it:

# Parse the text dates (e.g. "9/15/2017") into Date objects, then compare.
keep_index <- as.Date(df[[2]], format = "%m/%d/%Y") <= as.Date(LastFriday, format = "%m/%d/%Y")
mydf <- df[keep_index, ]
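
To illustrate the idea on something runnable, here is a small sketch with a made-up five-row data frame. The column names and values are invented for the example, and LastFriday is hard-coded so the result is reproducible. It assumes the date column is stored as text in month/day/year form, as described in the question.

```r
# Toy data (invented for illustration); column 2 holds dates as "m/d/Y" text.
df <- data.frame(
  id    = 1:5,
  date  = c("9/13/2017", "9/14/2017", "9/15/2017", "9/16/2017", "9/18/2017"),
  value = c(10, 20, 30, 40, 50),
  stringsAsFactors = FALSE
)
LastFriday <- "9/15/2017"  # fixed here instead of being computed from Sys.Date()

# Convert both sides to Date so the comparison is chronological, not alphabetical.
row_dates <- as.Date(df[[2]], format = "%m/%d/%Y")
cutoff    <- as.Date(LastFriday, format = "%m/%d/%Y")

# Keep rows dated on or before the cutoff; rows whose date failed to parse are dropped.
mydf <- df[!is.na(row_dates) & row_dates <= cutoff, ]
print(mydf)
```

Comparing Date objects rather than the raw strings matters here: as text, "10/1/2017" sorts before "9/15/2017", which would give the wrong rows.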
