Associate numbers to datetime/timestamp

I have a dataframe df with a certain number of columns. One of them, ts, holds timestamps:

1462147403122 1462147412990 1462147388224 1462147415651 1462147397069 1462147392497 ... 1463529545634 1463529558639 1463529556798 1463529558788 1463529564627 1463529557370

I also have the corresponding datetimes in the datetime column:

"2016-05-02 02:03:23 CEST" "2016-05-02 02:03:32 CEST" "2016-05-02 02:03:08 CEST" "2016-05-02 02:03:35 CEST" "2016-05-02 02:03:17 CEST" "2016-05-02 02:03:12 CEST" ... "2016-05-18 01:59:05 CEST" "2016-05-18 01:59:18 CEST" "2016-05-18 01:59:16 CEST" "2016-05-18 01:59:18 CEST" "2016-05-18 01:59:24 CEST" "2016-05-18 01:59:17 CEST"

As you can see, my dataframe contains data across several days. Let's say there are 3. I would like to add a column containing the number 1, 2 or 3: 1 if the row belongs to the first day, 2 for the second day, and so on.

Thank you very much in advance, Clement

One way to do this is to keep track of total days elapsed each time the date changes, as demonstrated below.

library(lubridate)  # provides tz() and its assignment form

# Fake data
dat = data.frame(datetime = c(seq(as.POSIXct("2016-05-02 01:03:11"), 
                                  as.POSIXct("2016-05-05 01:03:11"), length.out=6), 
                              seq(as.POSIXct("2016-05-09 01:09:11"), 
                                  as.POSIXct("2016-05-16 02:03:11"), length.out=4)))
# Set the time zone of the column to UTC
tz(dat$datetime) = "UTC"

Note: if your datetime column is not already in a datetime format, convert it to one using as.POSIXct.
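As a minimal sketch of that conversion for the question's data: the ts values look like milliseconds since the Unix epoch, so dividing by 1000 gives seconds that as.POSIXct accepts. The time zone "Europe/Paris" is an assumption here; the question only shows the abbreviation CEST, which several zones share.

```r
# A few ts values copied from the question (assumed to be milliseconds since 1970-01-01 UTC)
ts <- c(1462147403122, 1462147412990, 1463529545634)

# Convert milliseconds to seconds, then to POSIXct.
# "Europe/Paris" is an assumed zone that observes CEST in May.
datetime <- as.POSIXct(ts / 1000, origin = "1970-01-01", tz = "Europe/Paris")

format(datetime, "%Y-%m-%d %H:%M:%S %Z")
# [1] "2016-05-02 02:03:23 CEST" "2016-05-02 02:03:32 CEST" "2016-05-18 01:59:05 CEST"
```

These match the first, second and seventh strings shown in the question's datetime column, which supports the milliseconds interpretation.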

Now, create a new column with the day number, counting the first day in the sequence as day 1.

dat$day = c(1, cumsum(as.numeric(diff(as.Date(dat$datetime, tz="UTC")))) + 1)

dat
              datetime day
1  2016-05-02 01:03:11   1
2  2016-05-02 15:27:11   1
3  2016-05-03 05:51:11   2
4  2016-05-03 20:15:11   2
5  2016-05-04 10:39:11   3
6  2016-05-05 01:03:11   4
7  2016-05-09 01:09:11   8
8  2016-05-11 09:27:11  10
9  2016-05-13 17:45:11  12
10 2016-05-16 02:03:11  15
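Two things are worth noting about this result. The day number counts calendar days elapsed since the first row, so gaps in the data produce gaps in the numbering (4 jumps to 8), and it is relative to the first row, which matters if the rows are not sorted by time (the ts values in the question are not). The following is a hedged sketch of two alternatives, not part of the original answer; the variable names day_elapsed and day_rank are made up for illustration.

```r
# Unsorted fake data in UTC
datetime <- as.POSIXct(c("2016-05-03 20:15:11", "2016-05-02 01:03:11",
                         "2016-05-09 01:09:11", "2016-05-02 15:27:11"),
                       tz = "UTC")

# Calendar date of each row, evaluated in the same time zone
d <- as.Date(datetime, tz = "UTC")

# Days elapsed since the earliest date, regardless of row order
day_elapsed <- as.integer(d - min(d)) + 1L
day_elapsed
# [1] 2 1 8 1

# Consecutive numbering of the distinct dates present (1, 2, 3, ...)
day_rank <- match(d, sort(unique(d)))
day_rank
# [1] 2 1 3 1
```

Use day_rank if the goal is the literal 1, 2, 3 labelling from the question, and day_elapsed if missing days should still advance the counter.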

I specified the timezone in the code above to avoid getting tripped up by potential silent shifts between my local timezone and UTC. For example, note the silent shift from my default local time zone ("America/Los_Angeles") to UTC when converting a POSIXct datetime to a date:

# Fake data
datetime = seq(as.POSIXct("2016-05-02 01:03:11"), as.POSIXct("2016-05-05 01:03:11"), length.out=6)
tz(datetime)
[1] ""

date = as.Date(datetime)
tz(date)
[1] "UTC"

data.frame(datetime, date)
             datetime       date
1 2016-05-02 01:03:11 2016-05-02
2 2016-05-02 15:27:11 2016-05-02
3 2016-05-03 05:51:11 2016-05-03
4 2016-05-03 20:15:11 2016-05-04  # Note day is different due to timezone shift
5 2016-05-04 10:39:11 2016-05-04
6 2016-05-05 01:03:11 2016-05-05
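A small sketch of how to avoid the shift: pass the tz argument to as.Date explicitly so the date is taken in the zone you intend. The zone below mirrors the "America/Los_Angeles" example above; the variable names are illustrative only.

```r
# 20:15 in Los Angeles is already the next calendar day in UTC
x <- as.POSIXct("2016-05-03 20:15:11", tz = "America/Los_Angeles")

# Date as seen in UTC
date_utc <- as.Date(x, tz = "UTC")
format(date_utc)
# [1] "2016-05-04"

# Date as seen in the datetime's own zone
date_local <- as.Date(x, tz = "America/Los_Angeles")
format(date_local)
# [1] "2016-05-03"
```

Whichever zone is chosen, using the same tz for every conversion keeps the day numbering consistent.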
