Trying to aggregate (and average) time series data (collected every five minutes) into day and night periods for each 24 hr period in R

Thanks in advance!

I have time series data that was collected every five minutes; the head() of it looks like this:

PumaID CollarID  Type  GMT_Date    GMT_Time  LMT_Date    LMT_Time ActivityX
1    P01     2905 Argos  1/1/2000 12:15:00 AM  1/1/2000 12:00:00 AM         0
2    P01     2905 Argos  1/1/2000 12:25:00 AM  1/1/2000 12:00:00 AM         0
3    P01     2905 Argos  1/1/2000 12:00:00 AM  1/1/2000 12:00:00 AM         0
4    P01     2905 Argos 2/21/2011  9:15:00 PM 2/21/2011  2:15:00 PM         0
5    P01     2905 Argos 2/21/2011  9:20:00 PM 2/21/2011  2:20:00 PM        18
6    P01     2905 Argos 2/21/2011  9:25:00 PM 2/21/2011  2:25:00 PM        14
  ActivityY ActivitySum DayNight Temp
1         0           0    Night   22
2         0           0    Night   22
3         0           0    Night   21
4         0           0      Day   21
5        21          39      Day   20
6        15          29      Day   21

I need to aggregate the ActivitySum column into 12 hour intervals. Using the code below, I read in the table, converted the date column to the Date class, and aggregated the data by day.

P01 <- read.csv( "ActDtaP01_ALL_Temp.csv" )
date <- as.Date(P01$GMT_Date, "%m/%d/%Y")
new <- aggregate(P01, by = list(date), mean)

This results in the output below. My specific questions follow it.

     Group.1 PumaID CollarID Type GMT_Date GMT_Time LMT_Date LMT_Time ActivityX
1 2000-01-01     NA     2905   NA       NA       NA       NA       NA  0.000000
2 2011-02-21     NA     2905   NA       NA       NA       NA       NA  8.727273
3 2011-02-22     NA     2905   NA       NA       NA       NA       NA  0.000000
4 2011-02-23     NA     2905   NA       NA       NA       NA       NA  0.000000
5 2011-02-24     NA     2905   NA       NA       NA       NA       NA  0.000000
6 2011-02-25     NA     2905   NA       NA       NA       NA       NA  0.000000
  ActivityY ActivitySum DayNight       Temp
1  0.000000     0.00000       NA 21.6666667
2  9.060606    17.78788       NA 12.6969697
3  0.000000     0.00000       NA -2.8521127
4  0.000000     0.00000       NA -1.4471831
5  0.000000     0.00000       NA  0.3485915
6  0.000000     0.00000       NA  1.3368421

1) How can I further subset this into 12 hr intervals within each day (24 hr period), resulting in something like the following?

Group.1    Group.2  PumaID  CollarID    etc…
2/21/2011   Day      P01       …    
2/21/2011   Night    P01       …    
2/22/2011   Day      P01       …    
2/22/2011   Night    P01      

2) How do I keep all the column values in the data table, rather than getting an NA wherever the FUN argument (mean in this case) could not be computed?

Thanks again!

You don't have a particularly good dataset for testing code, since you have only AM cases for the first date and PM cases for the second, but this will do a two-way classification by date and by the AM/PM indicator in the time column. I also removed all the non-numeric columns from consideration, since it makes no sense to ask for the mean of a factor.

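# dat is assumed to be the data frame from the question (P01 there);
# date is the Date vector built from GMT_Date
new2 <- aggregate(dat[unlist(lapply(dat, is.numeric))],
                  by = list(date, gsub("^.+ ", "", dat$GMT_Time)), mean)
new2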
     Group.1 Group.2 CollarID ActivityX ActivityY ActivitySum     Temp
1 2000-01-01      AM     2905   0.00000         0     0.00000 21.66667
2 2011-02-21      PM     2905  10.66667        12    22.66667 20.66667
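
The second grouping key in the call above is built with gsub. As a minimal sketch of what that call returns, here it is applied to a few made-up time strings (the `times` vector is hypothetical, laid out like the GMT_Time column):

```r
# Made-up time strings in the same "h:mm:ss AM/PM" layout as GMT_Time
times <- c("12:15:00 AM", "9:20:00 PM", "11:55:00 AM")

# "^.+ " is greedy: it matches from the start through the last space,
# so replacing the match with "" leaves only the trailing AM/PM marker
ampm <- gsub("^.+ ", "", times)
print(ampm)
```

Because the match runs up to the last space, a leading space in front of a single-digit hour would not change the result.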

The gsub call removes everything from the beginning of the string through the last space, leaving only the AM/PM marker. Your second request might be best accomplished by adding ID variables to the by list.

> new <- aggregate(dat[unlist(lapply(dat, is.numeric))],
                   by = list(Date=date, 
                             AMPM= gsub("^.+ ", "", dat$GMT_Time),
                             Type=dat$Type ), mean)
> new
        Date AMPM  Type CollarID ActivityX ActivityY ActivitySum     Temp
1 2000-01-01   AM Argos     2905   0.00000         0     0.00000 21.66667
2 2011-02-21   PM Argos     2905  10.66667        12    22.66667 20.66667
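
Since the data already carry a DayNight column, the Day/Night layout asked for in the first question could probably be obtained by using that column as a grouping key in place of the AM/PM marker. The following is only a sketch under some assumptions: the toy data frame and its values are made up, the DayNight labels are taken to be reliable, and the local date (LMT_Date) is assumed to be the one that should define each 24 hr period.

```r
# Toy stand-in for the real data (values are made up for illustration)
dat <- data.frame(
  PumaID      = "P01",
  LMT_Date    = c("2/21/2011", "2/21/2011", "2/21/2011", "2/22/2011"),
  DayNight    = c("Day", "Day", "Night", "Night"),
  ActivitySum = c(39, 29, 10, 0),
  Temp        = c(20, 21, 15, 14)
)

# Convert the local date strings to Date so groups sort chronologically
dat$Date <- as.Date(dat$LMT_Date, "%m/%d/%Y")

# Formula interface: average the numeric columns on the left within each
# combination of the grouping variables on the right
by_period <- aggregate(cbind(ActivitySum, Temp) ~ Date + DayNight + PumaID,
                       data = dat, FUN = mean)

# Order rows by date, then Day before Night
by_period <- by_period[order(by_period$Date, by_period$DayNight), ]
print(by_period)
```

Two caveats with this approach: a night that spans midnight is split across two calendar dates, and the formula method of aggregate drops rows with an NA in any of the listed variables by default (na.action = na.omit).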
