R date sequence - increase not by days but by row/observation

I'm trying to select a date range from a data frame (and later also per participant in that data frame). Usually this is relatively easy if you want to extend the date range by a number of days, for example.

My problem is that I don't want to increase by days but by rows, to see when 100 observations had been made. I guess the problem is that I do not have consecutive days in my data frame; otherwise I could just do min(as.Date(data$date)) + days(100)

I have also tried seq.Date(min(as.Date(data$date)), length.out = 100, by = 1), but that does not work either.

Here is some sample data:

dates <- data.frame(date = c("2015-01-08", "2015-01-05", "2015-01-05", 
"2014-12-22", "2014-11-08", "2014-11-01", "2014-10-24", "2014-10-24", 
"2014-10-18", "2014-09-26", "2014-09-21", "2014-09-19", "2014-08-14", 
"2014-08-08", "2014-08-08", "2014-07-10", "2014-07-10", "2014-06-23", 
"2014-06-20", "2014-06-13", "2014-06-11", "2014-06-07", "2014-06-03", 
"2014-06-02", "2014-05-23", "2014-05-16", "2014-05-02", "2014-04-25",
"2014-04-11", "2014-04-09", "2014-04-01", "2014-03-27", "2014-03-25",
"2014-03-20", "2014-03-14", "2014-03-06", "2014-03-01"))

Now, when I run seq.Date(min(as.Date(dates$date)), length.out = 20, by = 1), I do get twenty dates:

 [1] "2014-03-01" "2014-03-02" "2014-03-03" "2014-03-04" "2014-03-05" "2014-03-06" "2014-03-07"
 [8] "2014-03-08" "2014-03-09" "2014-03-10" "2014-03-11" "2014-03-12" "2014-03-13" "2014-03-14"
[15] "2014-03-15" "2014-03-16" "2014-03-17" "2014-03-18" "2014-03-19" "2014-03-20"

BUT: those are consecutive calendar dates that do not match the dates in the data frame, so I have no way of telling when 100 observations had been made, counting from the earliest date.

Any help would be greatly appreciated! I am sure I can't be the only one who has run into this issue, but I could not find anything about it here.

You can use the following:

N = 20 # find the time difference between the 1st and the Nth observation
diff(sort(as.Date(dates$date))[c(1,N)])
# Time difference of 114 days

Breaking this down: 1) sort(as.Date(dates$date)) converts the character vector to Date type and arranges the dates in ascending order. 2) [c(1,N)] subsets the sorted vector to the earliest (1st) date and the Nth one. 3) diff() calculates the difference between those two dates.
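
The same three steps can be written out one at a time, which also makes it easy to read off the date of the Nth observation itself. This is only a sketch using the sample data from the question; the intermediate variable names (sorted_dates, first_and_nth, days_elapsed, nth_date, too_far) are made up for illustration.

```r
# Sample data from the question
dates <- data.frame(date = c("2015-01-08", "2015-01-05", "2015-01-05",
"2014-12-22", "2014-11-08", "2014-11-01", "2014-10-24", "2014-10-24",
"2014-10-18", "2014-09-26", "2014-09-21", "2014-09-19", "2014-08-14",
"2014-08-08", "2014-08-08", "2014-07-10", "2014-07-10", "2014-06-23",
"2014-06-20", "2014-06-13", "2014-06-11", "2014-06-07", "2014-06-03",
"2014-06-02", "2014-05-23", "2014-05-16", "2014-05-02", "2014-04-25",
"2014-04-11", "2014-04-09", "2014-04-01", "2014-03-27", "2014-03-25",
"2014-03-20", "2014-03-14", "2014-03-06", "2014-03-01"))

N <- 20

# Step 1: convert to Date and sort ascending (oldest first)
sorted_dates <- sort(as.Date(dates$date))

# Step 2: keep the earliest date and the date of the Nth observation
first_and_nth <- sorted_dates[c(1, N)]

# Step 3: elapsed time between the two, as a plain number of days
days_elapsed <- as.numeric(diff(first_and_nth), units = "days")
days_elapsed   # 114

# The date on which the Nth observation was made
nth_date <- sorted_dates[N]
nth_date       # "2014-06-23"

# Indexing past the end gives NA, so asking for more observations
# than exist yields NA rather than an error
too_far <- sorted_dates[nrow(dates) + 1]
```

Note that with N = 100 this sample (37 rows) would return NA, because there is no 100th observation to index.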

Thanks to the help of @dww, I was able to construct the following function, which works beautifully (feel free to use):

    time_to_100 <- function(dataframe, N = 100){
    # N = number of observations you want to 'check'

    # one named output slot per individual part_id (part_id = factor)
    ids <- levels(dataframe$part_id)
    output <- setNames(vector("double", length(ids)), ids)

    # loop over the distinct participants, not over every row
    for(part in ids){
       # created = the date column
       d <- sort(as.Date(dataframe[dataframe$part_id == part,]$created))
       # days from the 1st to the Nth observation; NA if fewer than N exist
       output[[part]] <- as.numeric(diff(d[c(1,N)]), units = "days")
    }

    return(output)
    }
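
The per-participant calculation can also be written without an explicit loop. Below is a minimal sketch, assuming the same column names as the function above (part_id as a factor, created as the date column); the helper days_to_nth and the toy data frame obs are hypothetical and only serve as an illustration.

```r
# Hypothetical toy data: participant "a" has 3 observations, "b" has 2
obs <- data.frame(
  part_id = factor(c("a", "a", "a", "b", "b")),
  created = c("2014-03-10", "2014-03-01", "2014-03-04",
              "2014-05-02", "2014-05-01"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: days from the 1st to the Nth observation of one
# participant; returns NA when that participant has fewer than N rows
days_to_nth <- function(created, N) {
  d <- sort(as.Date(created))
  as.numeric(diff(d[c(1, N)]), units = "days")
}

# Apply the helper to each participant's dates; the result is named by part_id
result <- tapply(obs$created, obs$part_id, days_to_nth, N = 3)
result
```

With N = 3, participant "a" gets 9 (2014-03-01 to 2014-03-10) and participant "b" gets NA, since "b" has only two observations.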
