How do I delete rows in a data frame based on the value (date) in one of the columns?
I have a data frame that consists of daily data. It has 500,000+ rows and 18 columns. The 2nd column contains the date.
For example, it goes from 7/1/2017 to the current date, chronologically.
I pull the data every Monday and load it into R, but I only want data up to the most recent Friday.
I've set a variable equal to the most recent Friday's date (in the exact date format of the data):
library(lubridate)
LastFriday <- gsub("X", "", gsub("X0", "", format(
  Sys.Date() - wday(Sys.Date() + 1), "X%m/X%d/%Y")))
which returns 9/15/2017
How do I delete all the rows in the data frame after the last row that contains last Friday's date?
The following should work, though I have not tested it:
keep_index <- as.Date(df[, 2], format = "%m/%d/%Y") <= as.Date(LastFriday, format = "%m/%d/%Y")
mydf <- df[keep_index, ]
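To make the idea concrete, here is a minimal sketch of the same filter on a toy data frame. The column names, the sample values, and the hard-coded LastFriday are assumptions made up for illustration; only the layout (date as text in the 2nd column, month/day/year without leading zeros) comes from the question. Both sides are parsed with as.Date() because comparing the raw strings would sort alphabetically ("10/1/2017" would come before "9/15/2017").

```r
# Toy data frame standing in for the real one (hypothetical values);
# the date is in column 2, stored as text without leading zeros.
df <- data.frame(
  id    = 1:5,
  date  = c("9/13/2017", "9/14/2017", "9/15/2017", "9/16/2017", "9/18/2017"),
  value = c(10, 20, 30, 40, 50),
  stringsAsFactors = FALSE
)

# Hard-coded so the example is reproducible; in practice this would be
# the LastFriday string computed in the question.
LastFriday <- "9/15/2017"

# Parse both sides as Date so the comparison is chronological.
row_dates <- as.Date(df[, 2], format = "%m/%d/%Y")
cutoff    <- as.Date(LastFriday, format = "%m/%d/%Y")

# Keep rows dated on or before the cutoff; rows whose date failed to
# parse (NA) are dropped rather than turning into NA rows.
keep_index <- !is.na(row_dates) & row_dates <= cutoff
mydf <- df[keep_index, ]
```

Because the comparison is on the date value rather than on row position, this should also work if the rows are not perfectly sorted. If the date column is already of class Date, the as.Date() call on it can probably be skipped.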