Setting new categorical variable values based on timestamp in R
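I have a data frame with two columns, Date_Time and ENTRY. Date_Time holds hourly readings covering two months, and I need to populate the ENTRY column in R using the following logic:
-> ENTRY should be set to "FULL" only from 10 AM to 12 AM and from 2 PM to 5 PM
-> ENTRY should be set to "EMPTY" at all other times, that is, outside 10 AM to 12 AM and 2 PM to 5 PM
Snapshot of the expected output: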
**Date_Time** **ENTRY**
6/6/17 6:00 AM EMPTY
6/6/17 7:00 AM EMPTY
6/6/17 8:00 AM EMPTY
6/7/17 9:00 AM EMPTY
6/8/17 10:00 AM FULL
6/9/17 11:00 AM FULL
6/9/17 12:00 AM FULL
6/9/2017 13:00 AM EMPTY
6/9/2017 14:00 AM FULL
6/9/2017 15:00 AM FULL
6/9/2017 16:00 AM FULL
6/9/2017 17:00 AM FULL
6/9/2017 18:00 AM EMPTY
A solution using data.table (assuming your table is named d):
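library(data.table)
setDT(d)  # convert d to a data.table by reference
# Split e.g. "6/6/17 10:00 AM" on spaces: piece 3 is the AM/PM marker
d[, AMPM := sapply(strsplit(Date_Time, " "), "[[", 3)]
# Piece 2 is the time; drop everything from the colon on to keep the hour
d[, TIME := as.numeric(gsub(":.*", "", sapply(strsplit(Date_Time, " "), "[[", 2)))]
d[, ENTRY := "EMPTY"]  # default value for every row
# Overwrite with "FULL" inside the two windows (12 is matched under AM, as labeled in the question)
d[(AMPM == "AM" & TIME >= 10 & TIME <= 12) |
  (AMPM == "PM" & TIME >= 2 & TIME <= 5),
  ENTRY := "FULL"][, .(Date_Time, ENTRY)]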
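As an alternative to splitting the string by hand, the timestamp can be parsed into a date-time and the 24-hour value compared directly. The following is only a sketch in base R under some assumptions that are not in the question: the sample rows are made up, the timestamps are in a clean 12-hour format with noon written as 12:00 PM, and the session locale recognizes "AM"/"PM" for `%p`. Rows that do not match the format (for example "13:00 AM") would come back as NA and would need cleaning first.

```r
# Hypothetical sample data in a consistent 12-hour format (not from the question)
d <- data.frame(
  Date_Time = c("6/6/17 6:00 AM", "6/8/17 10:00 AM", "6/9/17 12:00 PM",
                "6/9/17 1:00 PM", "6/9/17 2:00 PM", "6/9/17 5:00 PM",
                "6/9/17 6:00 PM"),
  stringsAsFactors = FALSE
)

# Parse month/day/two-digit-year, 12-hour clock, AM/PM marker
parsed <- as.POSIXct(d$Date_Time, format = "%m/%d/%y %I:%M %p", tz = "UTC")

# Extract the hour on a 24-hour clock (0 to 23)
hr <- as.integer(format(parsed, "%H"))

# FULL for hours 10 to 12 and 14 to 17 (inclusive), EMPTY otherwise
d$ENTRY <- ifelse((hr >= 10 & hr <= 12) | (hr >= 14 & hr <= 17), "FULL", "EMPTY")

print(d)
```

Working on the 24-hour value avoids handling AM and PM as separate cases, so the two windows become plain numeric ranges.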