
Categorize time series based on time points in separate dataframe in R

I have a time sequence with 10-minute intervals that I want to categorize by tidal stage (low tide, high tide). Ideally I would end up with something like:

     date_time    tidal_stage
30/05/2016 10:50  low
30/05/2016 11:00  low
30/05/2016 11:10  mid
30/05/2016 11:20  mid
30/05/2016 11:30  mid
30/05/2016 11:40  mid
30/05/2016 11:50  high
30/05/2016 12:00  high

The time sequence was generated using:

start_time <- as.POSIXct("2016-05-30 10:50:00", tz="CET")
end_time <- as.POSIXct("2016-07-20 08:50:00", tz="CET")
time_seq <- seq(from=start_time, to=end_time, by="10 min")

I have a separate data frame "hw_lw" containing the times of low water and high water for each date in the time series:

     high_water           low_water       date
1 2016-05-30 07:39:00 2016-05-30 04:14:00 2016-05-30
2 2016-05-30 20:01:00 2016-05-30 16:35:00 2016-05-30
3 2016-05-31 08:49:00 2016-05-31 05:17:00 2016-05-31
4 2016-05-31 21:14:00 2016-05-31 17:48:00 2016-05-31
5 2016-06-01 10:04:00 2016-06-01 06:30:00 2016-06-01
6 2016-06-01 23:36:00 2016-06-01 19:09:00 2016-06-01

How can I add a "tidal_stage" column to the time sequence that categorizes each time as "low", "high" or "mid" tide, where "low tide" = within 1.5 hours before or after low water, "high tide" = within 1.5 hours before or after high water, and "mid tide" = all other points?

I have thought about using subset, but I have only found out how to do this between specific time intervals (e.g. between 1pm and 2pm), and not when adding or subtracting time from a specific timepoint (e.g. 1.5 hours after 2pm).

Any help much appreciated! Thank you.

First of all, you need to change the format of your hw_lw data frame, since you have two low waters and two high waters per day:

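# odd rows = first tide of each day, even rows = second tide (note the comma: select rows, not columns)
hw_lw2=data.frame(hw_lw[seq(1,nrow(hw_lw),by=2),],hw_lw[seq(2,nrow(hw_lw),by=2),1:2])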
names(hw_lw2)=c("high_water1","low_water1","date","high_water2","low_water2")
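
To see what the reshaping step produces, here is a minimal sketch on the first four rows of the sample data from the question. The time zone "CET" for the tide times is an assumption carried over from the time sequence, and the helper names odd and even are only for illustration. Note that this layout only works when every date has exactly two rows in hw_lw; a day with a single high or low water would break the odd/even pairing.

```r
# Sample rows copied from the question; tz = "CET" is an assumption.
hw_lw <- data.frame(
  high_water = as.POSIXct(c("2016-05-30 07:39:00", "2016-05-30 20:01:00",
                            "2016-05-31 08:49:00", "2016-05-31 21:14:00"), tz = "CET"),
  low_water  = as.POSIXct(c("2016-05-30 04:14:00", "2016-05-30 16:35:00",
                            "2016-05-31 05:17:00", "2016-05-31 17:48:00"), tz = "CET"),
  date       = as.Date(c("2016-05-30", "2016-05-30", "2016-05-31", "2016-05-31"))
)

odd  <- seq(1, nrow(hw_lw), by = 2)  # first tide of each day
even <- seq(2, nrow(hw_lw), by = 2)  # second tide of each day

# One row per date: both high waters and both low waters side by side
hw_lw2 <- data.frame(hw_lw[odd, ], hw_lw[even, 1:2])
names(hw_lw2) <- c("high_water1", "low_water1", "date", "high_water2", "low_water2")
print(hw_lw2)
```

The result has one row per date and five columns, which is the shape the join by date further down relies on.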

Add a tidal_stage column to your first data frame df (the time sequence, stored in a column named date_time), initialize it to "mid", and make sure each data frame has a date column.

df$tidal_stage=rep("mid",nrow(df))
df$date=as.Date(df$date_time,tz="CET")  # column is date_time; tz matches the time sequence
hw_lw2$date=as.Date(hw_lw2$date)
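
The question only shows time_seq, so df has to be built from it first. A minimal sketch, assuming the sequence is stored in a column named date_time (the end time is shortened here for illustration). It also shows why passing tz to as.Date() can matter: converting a POSIXct value with a UTC time zone can shift times shortly after local midnight to the previous date.

```r
start_time <- as.POSIXct("2016-05-30 10:50:00", tz = "CET")
end_time   <- as.POSIXct("2016-05-30 12:00:00", tz = "CET")  # shortened for illustration

# Wrap the time sequence in a data frame with a date_time column
df <- data.frame(date_time = seq(from = start_time, to = end_time, by = "10 min"))
df$tidal_stage <- "mid"                        # default stage, recycled to every row
df$date <- as.Date(df$date_time, tz = "CET")   # local calendar date

# Time zone pitfall: 00:30 local summer time is still the previous day in UTC
late <- as.POSIXct("2016-05-31 00:30:00", tz = "CET")
as.Date(late, tz = "UTC")  # 2016-05-30
as.Date(late, tz = "CET")  # 2016-05-31
```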

Then you can perform a left join on the two data frames using the date as the key, and find out the tidal stages:

df2=merge(df,hw_lw2,by="date",all.x=TRUE)  # all.x=TRUE keeps every row of df (left join)
dt=as.difftime(1.5,units="hours")
# within 1.5 hours of either low water of the day
df2$tidal_stage[(df2$date_time>(df2$low_water1-dt) & df2$date_time<(df2$low_water1+dt)) |
                (df2$date_time>(df2$low_water2-dt) & df2$date_time<(df2$low_water2+dt))]="low"
# within 1.5 hours of either high water of the day
df2$tidal_stage[(df2$date_time>(df2$high_water1-dt) & df2$date_time<(df2$high_water1+dt)) |
                (df2$date_time>(df2$high_water2-dt) & df2$date_time<(df2$high_water2+dt))]="high"
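
The core of the comparison is that a difftime can be added to or subtracted from a POSIXct value, which addresses the "1.5 hours after 2pm" part of the question. A small sketch on a single low water from the sample data (the four test times are made up for illustration):

```r
dt <- as.difftime(1.5, units = "hours")
low_water <- as.POSIXct("2016-05-30 16:35:00", tz = "CET")

# Hypothetical test times around the window edges (15:05 and 18:05)
times <- as.POSIXct(c("2016-05-30 15:00:00", "2016-05-30 15:10:00",
                      "2016-05-30 18:00:00", "2016-05-30 18:10:00"), tz = "CET")

# TRUE where the time lies strictly inside (low_water - 1.5 h, low_water + 1.5 h)
in_window <- times > (low_water - dt) & times < (low_water + dt)
in_window  # FALSE TRUE TRUE FALSE
```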

Finally, you can remove the unwanted columns:

df2=subset(df2,select=c("date_time","tidal_stage"))
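
One limitation of joining by date is that a tide window crossing midnight (for example the 23:36 high water on 2016-06-01) is only matched against times on the same calendar date. As an alternative sketch, not the approach described above, each time can instead be compared against all tide times directly, with no join. The helper near_any and the variable names are hypothetical, and tz = "CET" for the tide times is an assumption.

```r
# Shortened time sequence and tide times taken from the question's sample data
time_seq <- seq(as.POSIXct("2016-05-30 14:00:00", tz = "CET"),
                as.POSIXct("2016-05-30 22:00:00", tz = "CET"), by = "10 min")
low_water  <- as.POSIXct(c("2016-05-30 04:14:00", "2016-05-30 16:35:00"), tz = "CET")
high_water <- as.POSIXct(c("2016-05-30 07:39:00", "2016-05-30 20:01:00"), tz = "CET")

# TRUE where a time lies strictly within half_window seconds of any event
near_any <- function(times, events, half_window = 1.5 * 3600) {
  # matrix of absolute differences in seconds: one row per time, one column per event
  d <- abs(outer(as.numeric(times), as.numeric(events), "-"))
  apply(d, 1, min) < half_window
}

tidal_stage <- rep("mid", length(time_seq))
tidal_stage[near_any(time_seq, low_water)]  <- "low"
tidal_stage[near_any(time_seq, high_water)] <- "high"  # "high" wins if windows overlap

result <- data.frame(date_time = time_seq, tidal_stage = tidal_stage)
head(result)
```

The difference matrix has one cell per combination of time and tide event, so for very long sequences a rolling or interval-based join may be preferable.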
