
Time interval between several dates (days and hours)

I know a lot of questions have been asked on the same subject, but I have not found an answer to this particular one, despite trying to adapt other code to my problem.

My data frame "v1" has more than 300 thousand rows, with the variable "Date" in the following format:

Date
2015-07-27 17:35:00
2015-07-27 17:40:00
2015-07-27 17:45:00

First, I want to know whether all consecutive "Date" values are exactly 5 minutes apart. If not, I would like to find where the intervals differ.

Second, I intend to create a new column showing each interval as a time stamp. For example, "time_int", which would show "00:05:00", "00:05:00"...

Any help will be appreciated. Thank you in advance.

You can use rollapplyr to find the time difference between two consecutive rows. Then you can use which to find the rows where the time difference is not 5 minutes.

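# `text` is defined in the Data section below
dt=read.table(text=text, header=TRUE)
library(lubridate)
library(dplyr)
library(zoo)
# Parse Date, then take the gap between each row and the previous one.
# units="mins" keeps every gap in minutes, even when a gap exceeds an hour.
dt=mutate(dt, Date=ymd_hms(Date)) %>%
  mutate(Dif=rollapplyr(Date, 2, function(x) {
  return(difftime(x[2], x[1], units="mins"))
}, fill=NA))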
dt
                 Date Dif
1 2015-07-27 17:35:00  NA
2 2015-07-27 17:40:00   5
3 2015-07-27 17:45:00   5
4 2015-07-27 17:49:00   4

dt[which(dt$Dif != as.difftime(5, units="mins")),]
                 Date Dif
4 2015-07-27 17:49:00   4
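
The first part of the question (are all intervals exactly 5 minutes?) can also be checked without extra packages. The following is a minimal sketch in base R, assuming the dates are already parsed as POSIXct; the names dates, gaps, all_five and bad_rows are illustrative and not part of the original answer.

```r
# Sample data mirroring the Date column above (illustrative values)
dates <- as.POSIXct(c("2015-07-27 17:35:00", "2015-07-27 17:40:00",
                      "2015-07-27 17:45:00", "2015-07-27 17:49:00"),
                    tz = "UTC")

# Gaps between consecutive rows, forced to minutes
gaps <- as.numeric(diff(dates), units = "mins")

# TRUE only if every gap is exactly 5 minutes
all_five <- all(gaps == 5)

# Row numbers whose gap to the previous row is not 5 minutes
# (+1 because diff() returns one value fewer than the input)
bad_rows <- which(gaps != 5) + 1
```

Here all_five tells you whether the whole series is regular, and bad_rows gives the positions to inspect when it is not.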

Lastly, to format the times in your desired format:

dt %>% mutate(DifString=format(.POSIXct(Dif*60, tz="GMT"), "%H:%M:%S"))
                 Date Dif DifString
1 2015-07-27 17:35:00  NA      <NA>
2 2015-07-27 17:40:00   5  00:05:00
3 2015-07-27 17:45:00   5  00:05:00
4 2015-07-27 17:49:00   4  00:04:00

Data

text="Date
'2015-07-27 17:35:00'
'2015-07-27 17:40:00'
'2015-07-27 17:45:00'
'2015-07-27 17:49:00'"
dt=read.table(text=text, header=TRUE)

Here is an option to calculate the difference using lag. If you'd like, you could create another column showing hours with units = "hours".

library(tidyverse)
library(lubridate)


df <- data.frame(date = ymd_hms(c("2015-07-27 17:35:00", 
"2015-07-27 17:40:00", "2015-07-27 17:49:00", "2015-07-27 19:49:00")))

df %>% 
  mutate(diff = date - lag(date),
         diff_minutes = as.numeric(diff, units = "mins"),
         time_int = format(.POSIXct(diff_minutes*60, "UTC"), "%H:%M:%S")) %>% 
  select(date, diff_minutes, time_int) %>% 
  # Filter the data for a range of minutes
  filter(diff_minutes >= 5 & diff_minutes < 10)

# OUTPUT:

#>                  date diff_minutes time_int
#> 1 2015-07-27 17:40:00            5 00:05:00
#> 2 2015-07-27 17:49:00            9 00:09:00

Created on 2021-03-09 by the reprex package (v0.3.0)

Original Data

date
<S3: POSIXct>
2015-07-27 17:35:00             
2015-07-27 17:40:00             
2015-07-27 17:49:00             
2015-07-27 19:49:00 
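
The hours column mentioned in this answer is not shown in its code. A minimal sketch of how it might look, assuming the same four sample dates; the names res and diff_hours are illustrative, and as.POSIXct is used here in place of ymd_hms only to keep the example to a single package.

```r
library(dplyr)

# Same sample dates as in the answer above
df <- data.frame(date = as.POSIXct(c("2015-07-27 17:35:00",
                                     "2015-07-27 17:40:00",
                                     "2015-07-27 17:49:00",
                                     "2015-07-27 19:49:00"),
                                   tz = "UTC"))

res <- df %>%
  mutate(diff = date - lag(date),                          # gap to the previous row
         diff_hours = as.numeric(diff, units = "hours"))   # same gap, in hours
```

The first row has no previous row, so its diff_hours is NA; the last row, two hours after the one before it, gets 2.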
