How do I delete rows in a data frame based on the value (date) in one of the columns?

I have a data frame that consists of daily data. It has 500,000+ rows and 18 columns. The 2nd column contains the date.

For example, it runs chronologically from 7/1/2017 to the current date.

I pull the data every Monday and load it into R, but I only want data up to the most recent Friday.

I've set a variable equal to the most recent Friday's date (in the exact date format of the data):

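library(lubridate)

# wday() counts Sunday as 1, so subtracting wday(today + 1) steps back to the previous Friday.
# The "X" markers let gsub() strip leading zeros from the month and day.
LastFriday <- gsub("X", "", gsub("X0", "", format(
                                   Sys.Date() - wday(Sys.Date() + 1), "X%m/X%d/%Y")))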

which returns 9/15/2017
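As a rough sketch of the same calculation in base R only, the helpers below take the reference date as an argument so the result can be checked against a known Monday. The names `last_friday` and `format_mdy` are hypothetical and not part of the original post; the sketch assumes, like the original expression, that a Friday should map to the Friday one week earlier.

```r
# Hypothetical helper: most recent Friday strictly before `today`.
last_friday <- function(today = Sys.Date()) {
  # as.POSIXlt()$wday is 0 for Sunday through 6 for Saturday, so Friday is 5.
  wd <- as.POSIXlt(today)$wday
  # Days to step back: 1 on Saturday, 3 on Monday, 7 on Friday.
  days_back <- ((wd - 6) %% 7) + 1
  today - days_back
}

# Hypothetical helper: format a Date as m/d/Y without leading zeros.
format_mdy <- function(d) {
  paste(as.integer(format(d, "%m")),
        as.integer(format(d, "%d")),
        format(d, "%Y"),
        sep = "/")
}

# 2017-09-18 was a Monday, so this reproduces the value from the question.
LastFriday <- format_mdy(last_friday(as.Date("2017-09-18")))
print(LastFriday)
```

Passing the date in explicitly is only a convenience for testing; calling `last_friday()` with no argument uses `Sys.Date()` as in the original.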

How do I delete all the rows in the data frame after the last row that contains last Friday's date?

The following should work, though I have not tested it:

# Parse both sides as Date using the month/day/year layout of the data, then compare.
keep_index <- as.Date(df[, 2], format = "%m/%d/%Y") <= as.Date(LastFriday, format = "%m/%d/%Y")
mydf <- df[keep_index, ]
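To make the filtering step concrete, here is a minimal sketch on a toy data frame. The column names and values are made up for illustration, and `LastFriday` is hard-coded so the example is reproducible; it assumes the date column holds strings such as 9/15/2017, as described in the question.

```r
# Toy data standing in for the real 500,000-row data frame (hypothetical columns).
df <- data.frame(
  id    = 1:5,
  date  = c("9/14/2017", "9/15/2017", "9/15/2017", "9/16/2017", "9/18/2017"),
  value = c(10, 20, 30, 40, 50),
  stringsAsFactors = FALSE
)

# Hard-coded for the example; normally this comes from the question's code.
LastFriday <- "9/15/2017"

# Parse the strings once, then keep rows dated on or before the cutoff.
dates  <- as.Date(df[, 2], format = "%m/%d/%Y")
cutoff <- as.Date(LastFriday, format = "%m/%d/%Y")

# which() skips rows whose date failed to parse instead of returning NA rows.
mydf <- df[which(dates <= cutoff), ]
print(mydf)
```

Comparing parsed dates rather than row positions means the result does not depend on the data being sorted, which may or may not matter for the real data.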
