How to combine two columns of time in R?
I have two text files:
1-
> head(val)
V1 V2 V3
1 2015/03/31 00:00 0.134
2 2015/03/31 01:00 0.130
3 2015/03/31 02:00 0.133
4 2015/03/31 03:00 0.132
2-
> head(tes)
A B date
1 0.04 0.02 2015-03-31 02:18:56
What I need is to combine V1 (date) and V2 (hour) in val, search val for the date and time closest to date in tes, then extract the corresponding V3 and put it in tes.
The desired output would be:
tes
A B date V3
1 0.04 0.02 2015-04-01 02:18:56 0.133
Updated answer based on OP's comments.
val$date <- with(val,as.POSIXct(paste(V1,V2), format="%Y/%m/%d %H:%M"))
val
# V1 V2 V3 date
# 1 2015/03/31 00:00 0.134 2015-03-31 00:00:00
# 2 2015/03/31 01:00 0.130 2015-03-31 01:00:00
# 3 2015/03/31 02:00 0.133 2015-03-31 02:00:00
# 4 2015/03/31 03:00 0.132 2015-03-31 03:00:00
# 5 2015/04/07 13:00 0.080 2015-04-07 13:00:00
# 6 2015/04/07 14:00 0.082 2015-04-07 14:00:00
tes$date <- as.POSIXct(tes$date)
tes
# A B date
# 1 0.04 0.02 2015-03-31 02:18:56
# 2 0.05 0.03 2015-03-31 03:30:56
# 3 0.06 0.04 2015-03-31 05:30:56
# 4 0.07 0.05 2015-04-07 13:42:56
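f <- function(d) { # for a given tes$date, return the row index of the closest val$date
  diff <- abs(difftime(val$date, d, units = "mins"))
  # if even the closest time is more than 45 min away, return Inf (which indexes to an NA row)
  if (min(diff) > 45) Inf else which.min(diff)
}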
tes <- cbind(tes,val[sapply(tes$date,f),c("date","V3")])
tes
# A B date date V3
# 1 0.04 0.02 2015-03-31 02:18:56 2015-03-31 02:00:00 0.133
# 2 0.05 0.03 2015-03-31 03:30:56 2015-03-31 03:00:00 0.132
# 3 0.06 0.04 2015-03-31 05:30:56 <NA> NA
# 4 0.07 0.05 2015-04-07 13:42:56 2015-04-07 14:00:00 0.082
The function f(...) calculates the index into val (the row number) for which val$date is closest in time to the given tes$date, unless that time is more than 45 min away, in which case Inf is returned. Using this function with sapply(...) as in:
sapply(tes$date, f)
returns a vector of row numbers in val matching your condition for each tes$date.
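For anyone who wants to try this without the original text files, here is a minimal self-contained sketch of the same approach. The data frames are retyped from the printed rows above. The tz = "UTC" argument, the helper name nearest_row, and its max_gap parameter are assumptions added for illustration and are not part of the original answer.

```r
# Sample data retyped from the printed rows above
val <- data.frame(
  V1 = c("2015/03/31", "2015/03/31", "2015/03/31", "2015/03/31",
         "2015/04/07", "2015/04/07"),
  V2 = c("00:00", "01:00", "02:00", "03:00", "13:00", "14:00"),
  V3 = c(0.134, 0.130, 0.133, 0.132, 0.080, 0.082),
  stringsAsFactors = FALSE
)
tes <- data.frame(
  A = c(0.04, 0.05, 0.06, 0.07),
  B = c(0.02, 0.03, 0.04, 0.05),
  date = c("2015-03-31 02:18:56", "2015-03-31 03:30:56",
           "2015-03-31 05:30:56", "2015-04-07 13:42:56"),
  stringsAsFactors = FALSE
)

# Assumption: a fixed time zone, so results do not depend on local DST rules
val$date <- as.POSIXct(paste(val$V1, val$V2),
                       format = "%Y/%m/%d %H:%M", tz = "UTC")
tes$date <- as.POSIXct(tes$date, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Hypothetical helper: row of val closest in time to d,
# or Inf when the closest row is more than max_gap minutes away
nearest_row <- function(d, max_gap = 45) {
  gap <- abs(as.numeric(difftime(val$date, d, units = "mins")))
  if (min(gap) > max_gap) Inf else which.min(gap)
}

idx <- sapply(tes$date, nearest_row)

# Indexing a vector with Inf yields NA, so unmatched rows get NA
tes$V3 <- val$V3[idx]
tes
```

Assigning only V3 this way, rather than using cbind(...), avoids ending up with two columns named date in tes.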
The reason we use Inf instead of NA for missing values is that indexing a data.frame using Inf always returns a single "row" containing NA, whereas indexing using a bare NA (which is logical and therefore recycled) returns nrow(...) rows all containing NA.
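The difference is easy to check on a small data frame. This is only an illustrative sketch; the data frame df is made up for the purpose.

```r
# Made-up data frame, for illustration only
df <- data.frame(x = 1:3, y = c("a", "b", "c"), stringsAsFactors = FALSE)

n_inf     <- nrow(df[Inf, ])          # out-of-range index: a single all-NA row
n_na      <- nrow(df[NA, ])           # logical NA is recycled: nrow(df) all-NA rows
n_na_int  <- nrow(df[NA_integer_, ])  # integer NA: a single all-NA row

c(n_inf, n_na, n_na_int)
```

Inside a numeric vector of indices, such as the one sapply(...) builds, the issue is less acute because NA is coerced to a numeric NA there, but returning Inf sidesteps the question entirely.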
I added the extra rows into val and tes per your comment.
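If val and tes are large, calling a function once per row can get slow. A possible alternative, not the method used in this answer, is to compute all pairwise gaps at once with outer(...). The function name nearest_within and its arguments are hypothetical, and the sketch assumes the full gap matrix fits in memory.

```r
# Hypothetical vectorised variant: for each target time, the index of the
# nearest candidate time, or NA if it is more than max_gap_min minutes away
nearest_within <- function(target, candidates, max_gap_min = 45) {
  # Matrix of absolute gaps in minutes: one row per target, one column per candidate
  gap <- abs(outer(as.numeric(target), as.numeric(candidates), "-")) / 60
  idx <- apply(gap, 1, which.min)
  too_far <- gap[cbind(seq_along(idx), idx)] > max_gap_min
  idx[too_far] <- NA_integer_
  idx
}

candidates <- as.POSIXct(c("2015-03-31 02:00:00", "2015-03-31 03:00:00"), tz = "UTC")
target     <- as.POSIXct(c("2015-03-31 02:18:56", "2015-03-31 05:30:56"), tz = "UTC")

res <- nearest_within(target, candidates)
res
```

Here the result is an integer vector, so NA is safe to use for "no match": each integer NA selects exactly one NA element or row.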