Return Past Closest or Equivalent Dates in R

I have two data frames, mondays and tdates, as follows:

T Dates 
User.ID   tdate
1       11-02-2013
1       04-03-2013
1       16-04-2015
1       03-05-2015
1       05-05-2015
1       11-05-2015
1       29-09-2015
1       26-11-2013
1       28-11-2013
3       01-02-2016
4       22-11-2012
4       25-04-2013
4       29-05-2013



Mondays     
ID  Monday      Closest Date
1   05-09-2016  
1   20-04-2015  
1   27-07-2015  
1   08-06-2015  
1   13-10-2014  
3   16-09-2013  
3   16-02-2015  
3   29-08-2016  
3   26-05-2014  
3   29-02-2016  
3   18-07-2016  
3   22-02-2016  
4   16-11-2015  

Now I want to fill the third column of mondays with the closest past (or equal) date from tdates for the same User.ID. For example, the expected output is:

Mondays       
ID  Monday      Closest Date
1   05-09-2016  29-09-2015
1   20-04-2015  16-04-2015
1   27-07-2015  11-05-2015
1   08-06-2015  11-05-2015
1   13-10-2014  28-11-2013
3   16-09-2013  NA
3   16-02-2015  NA
3   29-08-2016  01-02-2016
3   26-05-2014  NA
3   29-02-2016  01-02-2016
3   18-07-2016  01-02-2016
3   22-02-2016  01-02-2016
4   16-11-2015  29-05-2013

For ID = 1 and Monday = 05-09-2016, the closest past tdate is 29-09-2015, so that date goes in the Closest Date column.

Note: If no transaction date on or before the Monday's date is found, fill with NA.

This has to be done for a very large data set. Any ideas how this can be done? I have tried this using a custom function as follows:

lasttxndate <- function(userid, mydate){
    return(max(subset(tdates$Date.Asked, tdates$User.ID == userid & tdates$Date.Asked <= as.Date(mydate))))
}

But this does not work when I use it with lapply or sapply.
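Some possible reasons the function misbehaves: lapply and sapply iterate over a single vector, while the function needs a user ID and a date for each row; max() on an empty selection warns and returns -Inf instead of NA; and a simplified result loses its Date class. A minimal sketch with mapply, assuming the date column is called tdate as in the sample data (the function above refers to Date.Asked, which is not shown) and using a small made-up subset of the rows:

```r
# Toy subset of the sample data (assumed, dates written in ISO format)
tdates <- data.frame(
  User.ID = c(1, 1, 3),
  tdate   = as.Date(c("2015-04-16", "2015-09-29", "2016-02-01"))
)
mondays <- data.frame(
  ID     = c(1, 1, 3),
  Monday = as.Date(c("2016-09-05", "2015-04-20", "2015-02-16"))
)

lasttxndate <- function(userid, mydate) {
  # transaction dates of this user on or before the given date
  candidates <- tdates$tdate[tdates$User.ID == userid & tdates$tdate <= mydate]
  # return NA explicitly when nothing qualifies, instead of max() of nothing
  if (length(candidates) == 0) return(as.Date(NA))
  max(candidates)
}

# mapply walks over IDs and Mondays in parallel; the simplified result is
# plain numbers, so the Date class is restored afterwards
closest <- mapply(lasttxndate, mondays$ID, mondays$Monday)
mondays$Closest.Date <- as.Date(closest, origin = "1970-01-01")
print(mondays)
```

This still scans tdates once per row, so it is likely to be slow on a very large data set.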

# date conversion
mondays$Monday <- as.Date(mondays$Monday, "%d-%m-%Y")
tdates$tdate <- as.Date(tdates$tdate, "%d-%m-%Y")

# convert to data.table
library(data.table) 
setDT(mondays) 
setDT(tdates)

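# the join needs identically named columns; tdate is kept so the matched date stays visible
tdates[, ID := User.ID]
tdates[, Monday := tdate]

# rolling join: for each row of mondays, take the last tdates row on or before that Monday
tdates[mondays, on = c("ID", "Monday"), roll = Inf]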

    User.ID      tdate ID     Monday
 1:       1 2015-09-29  1 2016-09-05
 2:       1 2015-04-16  1 2015-04-20
 3:       1 2015-05-11  1 2015-07-27
 4:       1 2015-05-11  1 2015-06-08
 5:       1 2013-11-28  1 2014-10-13
 6:      NA       <NA>  3 2013-09-16
 7:      NA       <NA>  3 2015-02-16
 8:       3 2016-02-01  3 2016-08-29
 9:      NA       <NA>  3 2014-05-26
10:       3 2016-02-01  3 2016-02-29
11:       3 2016-02-01  3 2016-07-18
12:       3 2016-02-01  3 2016-02-22
13:       4 2013-05-29  4 2015-11-16

The tdate column gives you the desired dates.
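If the goal is to store the result as a third column of mondays, as in the expected output, the same rolling join can return only the matched date. A minimal sketch, assuming a small made-up subset of the data; the column name Closest.Date is chosen here for illustration:

```r
library(data.table)

# Toy subset of the sample data (assumed, dates written in ISO format)
tdates <- data.table(
  User.ID = c(1, 1, 1, 3),
  tdate   = as.Date(c("2013-11-28", "2015-04-16", "2015-09-29", "2016-02-01"))
)
mondays <- data.table(
  ID     = c(1, 1, 3, 3),
  Monday = as.Date(c("2016-09-05", "2015-04-20", "2015-02-16", "2016-02-29"))
)

# on = maps tdates columns to mondays columns, so no helper columns are copied.
# roll = Inf rolls the last tdate on or before each Monday forward.
# x.tdate asks for the value stored in tdates, not the Monday it was matched to.
closest <- tdates[mondays, x.tdate,
                  on = .(User.ID = ID, tdate = Monday), roll = Inf]

# add the result by reference as a new column
mondays[, Closest.Date := closest]
print(mondays)
```

Mondays with no earlier transaction for that user come back as NA, which matches the note in the question.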

This code works well:

T.Dates <- data.frame( 
User.ID=c("1","1","1","1","1","1","1","1","1","3","4","4","4"),
tdate=as.Date(c("11-02-2013","04-03-2013","16-04-2015","03-05-2015","05-05-2015","11-05-2015","29-09-2015","26-11-2013","28-11-2013","01-02-2016","22-11-2012","25-04-2013","29-05-2013"),format="%d-%m-%Y"))


Mondays <- data.frame( 
  ID=c("1","1","1","1","1","3","3","3","3","3","3","3","4"),
  Monday=as.Date(c("05-09-2016","20-04-2015","27-07-2015","08-06-2015","13-10-2014","16-09-2013","16-02-2015","29-08-2016","26-05-2014","29-02-2016","18-07-2016","22-02-2016","16-11-2015"),format="%d-%m-%Y"))

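# start with an all-NA Date column
Mondays$Closest.Date <- as.Date(NA)

for(i in seq_len(nrow(Mondays))){
  # transaction dates of this user on or before this Monday
  candidates <- T.Dates$tdate[T.Dates$User.ID==Mondays$ID[i] & T.Dates$tdate <= Mondays$Monday[i]]
  # max() of an empty vector warns and returns -Inf, so keep NA when nothing matches
  if(length(candidates) > 0) Mondays$Closest.Date[i] <- max(candidates)
}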

The output:

> Mondays
   ID     Monday Closest.Date
1   1 2016-09-05   2015-09-29
2   1 2015-04-20   2015-04-16
3   1 2015-07-27   2015-05-11
4   1 2015-06-08   2015-05-11
5   1 2014-10-13   2013-11-28
6   3 2013-09-16         <NA>
7   3 2015-02-16         <NA>
8   3 2016-08-29   2016-02-01
9   3 2014-05-26         <NA>
10  3 2016-02-29   2016-02-01
11  3 2016-07-18   2016-02-01
12  3 2016-02-22   2016-02-01
13  4 2015-11-16   2013-05-29
