Convert sub-hourly data to hourly and round up time in R

I have a very large data frame in R containing weather data in the following format.

                 valid temp
    1 17/08/2014 00:20   14
    2 17/08/2014 00:50   14
    3 17/08/2014 01:20   13.5
    4 17/08/2014 01:50   13
    5 17/08/2014 02:20   12
    6 17/08/2014 02:50   10

I would like to convert these sub-hourly data to hourly data, like the following.

                    valid tmpc
    1 2014-08-17 00:00:00   14
    2 2014-08-17 01:00:00   13.75
    3 2014-08-17 02:00:00   12.5

The class of df$valid is 'factor'. I first tried converting the values to dates through POSIXct, but that gives only NA values. I also tried changing the system locale, and I still get NAs.

Option 1: The lubridate solution, using ceiling_date or round_date. From your data frame and expected results it is not clear whether you want to round or to take the ceiling. For instance, in the first row you are rounding, and in the third you are using the ceiling. Anyway, here is an example:

library(lubridate)
df <- data.frame(i = 1, valid = "17/08/2014 01:28", temp = 14)
# Parse the day/month/year hour:minute string into a date-time
df$valid <- dmy_hm(df$valid)
# Round up to the next full hour
df$valid_round <- ceiling_date(df$valid, unit = "hours")

Results:

  i               valid temp         valid_round
1 1 2014-08-17 01:28:00   14 2014-08-17 02:00:00
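
To see how the choice between the two functions plays out on the question's own rows, here is a minimal sketch that applies both and averages temp per hour. The data frame name `wx` and the columns `hour_round` and `hour_ceiling` are made up for illustration; the values are copied from the question.

```r
library(lubridate)

# Sample rows copied from the question; `wx` is a name invented for this sketch
wx <- data.frame(
  valid = c("17/08/2014 00:20", "17/08/2014 00:50", "17/08/2014 01:20",
            "17/08/2014 01:50", "17/08/2014 02:20", "17/08/2014 02:50"),
  temp = c(14, 14, 13.5, 13, 12, 10),
  stringsAsFactors = TRUE  # mimic the factor column described in the question
)

# Convert the factor to character, then parse day/month/year hour:minute
wx$valid <- dmy_hm(as.character(wx$valid))

# Nearest full hour versus next full hour
wx$hour_round   <- round_date(wx$valid, unit = "hour")
wx$hour_ceiling <- ceiling_date(wx$valid, unit = "hour")

# Mean temperature per hourly bin under each convention
by_round   <- aggregate(temp ~ hour_round, wx, FUN = mean)
by_ceiling <- aggregate(temp ~ hour_ceiling, wx, FUN = mean)
```

On these six rows, round_date gives 14, 13.75 and 12.5 for 00:00, 01:00 and 02:00 (plus a fourth bin of 10 at 03:00), which matches the values shown in the question, while ceiling_date gives 14, 13.25 and 11 for 01:00, 02:00 and 03:00.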

Option 2: using base functions. Use df$valid <- as.POSIXct(strptime(df$valid, "%d/%m/%Y %H:%M", tz = "UTC")) and then round it.
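
A rough sketch of what "parse, then round" could look like in base R, assuming the column starts out as a factor as described in the question. The two-row data frame and the `valid_hour` column are hypothetical, and round() with units = "hours" is one possible way to do the rounding step, not necessarily the one the answer had in mind.

```r
# Hypothetical two-row example; `valid` starts out as a factor
df <- data.frame(valid = factor(c("17/08/2014 00:20", "17/08/2014 00:50")),
                 temp = c(14, 14))

# Parse with an explicit format matching day/month/year hour:minute
df$valid <- as.POSIXct(strptime(as.character(df$valid),
                                "%d/%m/%Y %H:%M", tz = "UTC"))

# round() on a date-time returns POSIXlt, so convert back to POSIXct
df$valid_hour <- as.POSIXct(round(df$valid, units = "hours"))
```

The explicit format string matters here: the timestamps are written day first, so the parser has to be told the field order rather than left to guess it.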

We can do this in base R by converting to POSIXlt, setting the minute to 0, converting back to POSIXct, and using aggregate to get the mean of 'temp'.

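# Parse the day/month/year hour:minute strings into POSIXlt
df1$valid <- strptime(df1$valid, "%d/%m/%Y %H:%M")
# Zero the minutes so every timestamp falls back to the start of its hour
df1$valid$min <- 0
df1$valid <- as.POSIXct(df1$valid)
# Average temp within each hour
aggregate(temp ~ valid, df1, FUN = mean)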
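
Setting the minutes to 0 is the same as truncating each timestamp to its hour. As a hedged alternative sketch, the same flooring can be written with trunc(); the sample rows are copied from the question, the `hourly` name is illustrative, and tz = "UTC" is an assumption added here to keep the result independent of the local time zone.

```r
# Sample rows copied from the question
df1 <- data.frame(
  valid = c("17/08/2014 00:20", "17/08/2014 00:50", "17/08/2014 01:20",
            "17/08/2014 01:50", "17/08/2014 02:20", "17/08/2014 02:50"),
  temp = c(14, 14, 13.5, 13, 12, 10)
)

# Parse, assuming UTC (an assumption of this sketch, not of the answer)
df1$valid <- as.POSIXct(df1$valid, format = "%d/%m/%Y %H:%M", tz = "UTC")

# trunc() drops minutes and seconds; it returns POSIXlt, so convert back
df1$valid <- as.POSIXct(trunc(df1$valid, units = "hours"))

# Mean temperature per hour
hourly <- aggregate(temp ~ valid, df1, FUN = mean)
```

With flooring, these rows average to 14, 13.25 and 11 for 00:00, 01:00 and 02:00, which is not the same as the 13.75 and 12.5 in the question's expected output, so it is worth checking which convention is actually wanted.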
