Divide time-series data into weekday and weekend datasets using R

I have a dataset consisting of two columns (timestamp and power):

 str(df2)
'data.frame':   720 obs. of  2 variables:
 $ timestamp: POSIXct, format: "2015-08-01 00:00:00" "2015-08-01 01:00:00" " ...
 $ power    : num  124 149 118 167 130 ..

This dataset covers one entire month. I want to create two subsets of it: one containing the weekend data and the other containing the weekday (Monday to Friday) data. In other words, one dataset should contain the rows for Saturday and Sunday, and the other should contain the rows for all other days. Both subsets should retain both columns. How can I do this in R?

I tried using aggregate and split, but I am not clear on how to specify the division of the dataset through the function parameter (FUN) of aggregate.

You can do this with base R functions: first use strptime to extract the date from the first column, then apply the weekdays function to the result. Example:

df1<-data.frame(timestamp=c("2015-08-01 00:00:00","2015-10-13 00:00:00"),power=1:2)
df1$day<-strptime(df1[,1], "%Y-%m-%d")
df1$weekday<-weekdays(df1$day)
df1
 timestamp              power   day      weekday
 2015-08-01 00:00:00     1   2015-08-01  Saturday
 2015-10-13 00:00:00     2   2015-10-13  Tuesday
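
The example above labels each row with its day name but stops short of producing the two datasets. One way to finish the job, tying back to the split function mentioned in the question, might look like the sketch below. The helper name split_by_day_type and the generated df2 are hypothetical stand-ins, and the English day names assume an English locale.

```r
# Hypothetical helper: split a data frame into weekday and weekend parts.
# Assumes a POSIXct column named `timestamp` and an English locale,
# because weekdays() returns day names in the current locale.
split_by_day_type <- function(df) {
  is_weekend <- weekdays(df$timestamp) %in% c("Saturday", "Sunday")
  # split() returns a named list with one data frame per group label
  split(df, ifelse(is_weekend, "weekend", "weekday"))
}

# Made-up stand-in for the asker's df2: 720 hourly readings from 2015-08-01
df2 <- data.frame(
  timestamp = seq(as.POSIXct("2015-08-01 00:00:00", tz = "UTC"),
                  by = "hour", length.out = 720),
  power = seq_len(720)
)

parts <- split_by_day_type(df2)
weekend_data <- parts$weekend  # Saturday and Sunday rows
week_data    <- parts$weekday  # Monday to Friday rows
```

Both elements of the list keep the timestamp and power columns, which is what the question asks for.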

Initially I tried more complex approaches using extra libraries, but in the end I came up with a basic approach in base R.

    # Add a day-name column to the existing data (names depend on the locale)
    df2$day <- weekdays(df2$timestamp)
    # Create two empty subsets with the same columns and types as df2
    week_data <- df2[0, ]
    weekend_data <- df2[0, ]
    # Day names that count as the weekend
    weekend <- c("Saturday", "Sunday")
    # Append each row to the matching subset
    for (i in seq_len(nrow(df2))) {
      if (df2$day[i] %in% weekend) {
        weekend_data <- rbind(weekend_data, df2[i, ])
      } else {
        week_data <- rbind(week_data, df2[i, ])
      }
    }

The resulting datasets, i.e., weekend_data and week_data, are the sub-datasets I needed.
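
Growing a data frame with rbind inside a loop copies the accumulated rows on every iteration, which gets slow for larger inputs. A vectorized sketch of the same idea, using logical indexing instead of a loop, could look like this. The generated df2 is a made-up stand-in for the real data, and the English day names again assume an English locale.

```r
# Made-up stand-in for df2 (720 hourly readings starting 2015-08-01)
df2 <- data.frame(
  timestamp = seq(as.POSIXct("2015-08-01 00:00:00", tz = "UTC"),
                  by = "hour", length.out = 720),
  power = seq_len(720)
)

df2$day <- weekdays(df2$timestamp)    # names depend on the current locale
weekend <- c("Saturday", "Sunday")    # assumes English day names

is_weekend   <- df2$day %in% weekend  # one TRUE/FALSE per row
weekend_data <- df2[is_weekend, ]     # rows whose day is in `weekend`
week_data    <- df2[!is_weekend, ]    # all remaining rows
```

The logical vector is computed once for the whole column, so every row lands in exactly one of the two subsets.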

Building on @ShruS's example:

df<-data.frame(timestamp=c("2015-08-01 00:00:00","2015-10-13 00:00:00", "2015-10-11 00:00:00", "2015-10-14 00:00:00"))
df$day<-strptime(df[,1], "%Y-%m-%d")
df$weekday<-weekdays(df$day)
df1 = subset(df,df$weekday == "Saturday" | df$weekday == "Sunday")
df2 = subset(df,df$weekday != "Saturday" & df$weekday != "Sunday")

> df
            timestamp        day   weekday
1 2015-08-01 00:00:00 2015-08-01  Saturday
2 2015-10-13 00:00:00 2015-10-13   Tuesday
3 2015-10-11 00:00:00 2015-10-11    Sunday
4 2015-10-14 00:00:00 2015-10-14 Wednesday

> df1
            timestamp        day  weekday
1 2015-08-01 00:00:00 2015-08-01 Saturday
3 2015-10-11 00:00:00 2015-10-11   Sunday

> df2
            timestamp        day   weekday
2 2015-10-13 00:00:00 2015-10-13   Tuesday
4 2015-10-14 00:00:00 2015-10-14 Wednesday
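
Every snippet in this thread compares against English day names, which weekdays only returns in an English locale. A locale-independent sketch can use the numeric weekday from as.POSIXlt instead, where wday runs from 0 (Sunday) to 6 (Saturday). The variable names weekend_rows and weekday_rows are illustrative, and the UTC time zone is an assumption.

```r
# Same four timestamps as the example above
df <- data.frame(timestamp = c("2015-08-01 00:00:00", "2015-10-13 00:00:00",
                               "2015-10-11 00:00:00", "2015-10-14 00:00:00"))

# wday: 0 = Sunday, 1 = Monday, ..., 6 = Saturday (independent of locale)
wday <- as.POSIXlt(df$timestamp, tz = "UTC")$wday
is_weekend <- wday %in% c(0, 6)

# drop = FALSE keeps the result a data frame even with a single column
weekend_rows <- df[is_weekend, , drop = FALSE]   # rows 1 and 3
weekday_rows <- df[!is_weekend, , drop = FALSE]  # rows 2 and 4
```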
