Filtering a data frame in R by a column-based calculation

Question

I have a data frame with multiple columns, two of which are dates. From one date column, I want to calculate all Sundays from (date - 14) until today's date. I then want to filter my data to the rows where the other date column equals one of these dates. Below is an example (the original data is much bigger than this):

ex_data <- data.frame(
  c("2018-05-27", "2018-06-24", "2018-07-01", "2018-07-08", "2018-06-25",
    "2018-07-05", "2018-07-10", "2018-05-30", "2018-06-20", "2018-07-04", 
    "2017-12-05"),   
  c("2018-05-13", "2018-02-04", "2018-06-17", "2018-06-10", "2018-04-04", 
    "2018-01-14", "2018-06-17", "2018-06-24", "2018-07-01", "2017-12-03",
    "2018-06-17"), 
  c(rep("1", 4), rep("2", 3), rep("3", 2), rep("1", 1),5),   
  c(rep("xxx", 4), rep("yyyy", 3), rep("zz", 2), rep("xxx", 1),"ttt"))


colnames(ex_data) <- c("Date1", "Date2", "Ex1", "Ex2")

I want to find the Sundays from two weeks before Date1 up to today (let's call them "previousSundays"). The result for each row is a list/vector of Sundays from the corresponding value of Date1 - 14 to today. For example, for the first row it would be:

"2018-05-13" "2018-05-20" "2018-05-27" "2018-06-03" "2018-06-10"
"2018-06-17" "2018-06-24" "2018-07-01" "2018-07-08" "2018-07-15" 
"2018-07-22" "2018-07-29"
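The list above can be reproduced for a single date with a small helper. This is only a sketch under stated assumptions: `sundays_between` is a hypothetical name, not something from the original post, and "today" is pinned to 2018-07-30 (the date the question was posted) so that the result is reproducible. For live use, `Sys.Date()` would take its place.

```r
# Hypothetical helper: all Sundays in the closed interval [start, end].
sundays_between <- function(start, end) {
  # An empty interval yields an empty Date vector instead of a seq() error
  if (start > end) return(as.Date(character(0)))
  days <- seq(start, end, by = "day")
  # POSIXlt weekday: 0 = Sunday; this does not depend on the session locale
  days[as.POSIXlt(days)$wday == 0L]
}

today <- as.Date("2018-07-30")  # assumption: fixed stand-in for Sys.Date()
first_row <- sundays_between(as.Date("2018-05-27") - 14, today)
print(first_row)
```

Unlike `seq(..., by = "week")`, this variant does not require the start date to be a Sunday, which matters for rows such as 2018-06-25 (a Monday).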

I then want to filter my data frame so that I keep only the rows where Date2 equals one of the "previousSundays".

The desired output looks like this (I did the calculation by hand only for the first three rows):

   Date1           Date2       Ex1  Ex2            
   2018-05-27   2018-05-13       1  xxx
   2018-07-01   2018-06-17       1  xxx

Any ideas on the best way to do this in R? I used the lapply and seq functions, but it did not work. Below is what I tried:

ex_data$prevdays <- lapply(ex_data$Date1 - 14, seq, var2 = Sys.Date(), by = "week")

(and some variants of the line above)
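Two things likely stop the attempted `lapply` call from running: Date1 is still stored as text, so `- 14` is not defined for it, and `seq` has no `var2` argument (the end point is named `to`). A minimal sketch of a working variant follows, using only the first three example rows. The names `ex_small`, `prevdays` and `keep` are illustrative, and `today` is pinned to 2018-07-30 as a stand-in for `Sys.Date()`.

```r
ex_small <- data.frame(
  Date1 = as.Date(c("2018-05-27", "2018-06-24", "2018-07-01")),
  Date2 = as.Date(c("2018-05-13", "2018-02-04", "2018-06-17"))
)
today <- as.Date("2018-07-30")  # assumption: fixed stand-in for Sys.Date()

# One vector of weekly dates per row, starting at Date1 - 14.
# These are Sundays only because every Date1 in ex_small is itself a Sunday.
prevdays <- lapply(seq_len(nrow(ex_small)), function(i) {
  seq(ex_small$Date1[i] - 14, to = today, by = "week")
})

# Keep the rows whose Date2 appears in that row's own vector of dates.
keep <- vapply(seq_len(nrow(ex_small)), function(i) {
  ex_small$Date2[i] %in% prevdays[[i]]
}, logical(1))

result <- ex_small[keep, ]
print(result)
```

Iterating over row indices rather than over the Date vector itself keeps the `Date` class intact inside the anonymous functions.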

I have already searched the website and the internet but could not find a solution that addresses my problem. Any suggestion is appreciated, as I cannot find an elegant way to solve this.

Answer 1

It appears that you can do this by setting up a series of conditionals.

# first recode the dates
ex_data[, 1:2] <- lapply(ex_data[, 1:2], as.Date)

# check if date is a sunday    
is.sunday <- format(ex_data$Date2, "%u") == 7

today <- Sys.Date()

# slightly more tricky. Aggregate Date1 over Ex2 and find the minima
mins <- aggregate(Date1 ~ Ex2, data=ex_data, min)

# use the result as a lookup-table
mins <- mins$Date1[match(ex_data$Ex2, mins$Ex2)]

# combine (create the intersect of) the conditions
matches <- ex_data$Date2 > mins - 14 & ex_data$Date2 < today & is.sunday

# and filter the data
ex_data[matches,]

#         Date1      Date2 Ex1  Ex2
# 3  2018-07-01 2018-06-17   1  xxx
# 4  2018-07-08 2018-06-10   1  xxx
# 7  2018-07-10 2018-06-17   2 yyyy
# 8  2018-05-30 2018-06-24   3   zz
# 9  2018-06-20 2018-07-01   3   zz
# 11 2017-12-05 2018-06-17   5  ttt
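Note that the filter above takes the minimum Date1 within each Ex2 group and uses strict inequalities, so row 1 (where Date2 is exactly Date1 - 14) is dropped. If the question is instead read strictly row by row with inclusive bounds, a vectorized sketch could look like the following. This is an alternative reading, not the answer's method; the column names are set directly in `data.frame()`, and `today` is pinned to 2018-07-30 as a stand-in for `Sys.Date()`.

```r
ex_data <- data.frame(
  Date1 = as.Date(c("2018-05-27", "2018-06-24", "2018-07-01", "2018-07-08",
                    "2018-06-25", "2018-07-05", "2018-07-10", "2018-05-30",
                    "2018-06-20", "2018-07-04", "2017-12-05")),
  Date2 = as.Date(c("2018-05-13", "2018-02-04", "2018-06-17", "2018-06-10",
                    "2018-04-04", "2018-01-14", "2018-06-17", "2018-06-24",
                    "2018-07-01", "2017-12-03", "2018-06-17")),
  Ex1 = c(rep("1", 4), rep("2", 3), rep("3", 2), "1", "5"),
  Ex2 = c(rep("xxx", 4), rep("yyyy", 3), rep("zz", 2), "xxx", "ttt")
)
today <- as.Date("2018-07-30")  # assumption: fixed stand-in for Sys.Date()

# Date2 must be a Sunday (POSIXlt weekday 0) ...
is_sunday <- as.POSIXlt(ex_data$Date2)$wday == 0L
# ... and lie in the row's own window [Date1 - 14, today], bounds included
in_window <- ex_data$Date2 >= ex_data$Date1 - 14 & ex_data$Date2 <= today

per_row <- ex_data[is_sunday & in_window, ]
print(per_row)
```

With the pinned date this keeps rows 1, 3, 8, 9 and 11; restricted to the first three rows, that is rows 1 and 3, which agrees with the desired output in the question. No per-row list of Sundays is needed, because "Date2 is one of the Sundays in the window" is equivalent to "Date2 is a Sunday and lies in the window".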
