
R time series from day to month

I'm new to R and I need help with a time series. I have daily data that I want to sum up by month. How can I aggregate the days into months? In the end I want to use the result as a ts object with monthly frequency.

library(data.table)

#read csv "ProductList-B1-134564"
coffeedata = fread("https://raw.githubusercontent.com/Skruff80/FFHS/master/KantineWII/ProductList-B1-134564_w3_original.csv")

#Formatting date
coffeedata$Date = as.Date(coffeedata$Date, "%d.%m.%Y")

#dailycounter: number of rows (products sold) on each calendar day
countcoffee = function(timeStamps) {
  Dates = as.Date(timeStamps)
  allDates = seq(from = min(Dates), to = max(Dates), by = "day")
  coffee.count = sapply(allDates, FUN = function(X) sum(Dates == X))
  data.frame(day = allDates, coffee.count = coffee.count)
}
dailycounter = countcoffee(coffeedata$Date)

#time series
demand <- ts(dailycounter$coffee.count, start = c(2018, 1), frequency = 365)
plot(demand, type = "l", xlab = "Year", col = "orange", lwd = 1, ylab = "Products a day")

I appreciate your help.

You could group the daily counts by month, sum them within each month, and then build the time series from the monthly totals, like this:

library(data.table)
library(tidyverse)
library(lubridate) # floor_date(); attached by tidyverse only from version 2.0.0 on

#read csv "ProductList-B1-134564"
coffeedata = fread("https://raw.githubusercontent.com/Skruff80/FFHS/master/KantineWII/ProductList-B1-134564_w3_original.csv")

#Formatting date
coffeedata$Date = as.Date(coffeedata$Date, "%d.%m.%Y")

#dailycounter: number of rows (products sold) on each calendar day
countcoffee = function(timeStamps) {
  Dates = as.Date(timeStamps)
  allDates = seq(from = min(Dates), to = max(Dates), by = "day")
  coffee.count = sapply(allDates, FUN = function(X) sum(Dates == X))
  data.frame(day = allDates, coffee.count = coffee.count)
}
dailycounter = countcoffee(coffeedata$Date)

# group by the first day of each month, so that the same calendar month
# in different years is not merged into a single group
monthlycounter <- dailycounter %>% 
  group_by(month = floor_date(day, "month")) %>% 
  summarise(coffee.count_month = sum(coffee.count))

demand <- ts(monthlycounter$coffee.count_month, start = c(2018, 1), frequency = 12)
plot(demand, type = "l", xlab = "Year", col = "orange", lwd = 1, ylab = "Products per month")
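When the data cover more than one year, the grouping key has to contain the year as well as the month. Otherwise January 2018 and January 2019 end up in the same group. A minimal base R sketch of the same idea, using made-up daily counts (the `daily` data frame and its values are assumptions for illustration, not the real coffee data):

```r
# Toy daily counts (made-up values): one product sold per day over two years
daily <- data.frame(day = seq(as.Date("2018-01-01"), as.Date("2019-12-31"), by = "day"))
daily$coffee.count <- 1L

# Key on year AND month so the same month of different years stays separate
daily$ym <- format(daily$day, "%Y-%m")
monthly <- aggregate(coffee.count ~ ym, data = daily, FUN = sum)

# aggregate() returns the groups sorted by ym, i.e. in chronological order
demand <- ts(monthly$coffee.count, start = c(2018, 1), frequency = 12)
```

With this toy input, `monthly` has 24 rows, one per year-month, and each total equals the number of days in that month.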

Other packages can simplify your work. I recommend installing the tidyverse, which bundles a number of useful packages and makes it much more straightforward to manipulate, group and visualize data.

library(readr)     # for reading data
library(dplyr)     # for manipulating data
library(lubridate) # for date and time manipulation
library(ggplot2)   # for graphics

# load data
coffee <- read_csv2("https://raw.githubusercontent.com/Skruff80/FFHS/master/KantineWII/ProductList-B1-134564_w3_original.csv")

# make a date
coffee$Date <- parse_date_time(coffee$Date, "d.m.Y H:M")

# make a plot from month data:
# 1. group data by month
# 2. count # of coffees (in that month)
# 3. create a plot
coffee %>%
  group_by(Month = floor_date(Date, "month")) %>%
  count() %>%
  ggplot(aes(Month, n)) + geom_col()

And if you want to, e.g., color the bars by product category, just adjust the code like this:

coffee %>%
  group_by(Month = floor_date(Date, "month"), Product_Category) %>%
  count() %>%
  ggplot(aes(Month, n, fill = Product_Category)) + geom_col()

Note that for simplicity I haven't labeled the axes etc., but I hope these pieces of code help you.
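The plots above show the monthly counts, but the question asks for a ts object in the end. As a rough base R sketch (not the approach used in this answer), rows can be counted per month with `cut()` and `table()` and then passed to `ts()`. The `stamps` vector below is made up for illustration:

```r
# Made-up purchase timestamps standing in for the Date column
stamps <- as.POSIXct(c("2018-01-03 08:15", "2018-01-20 12:40",
                       "2018-02-01 09:05",
                       "2018-03-15 10:00", "2018-03-15 10:02", "2018-03-30 16:30"),
                     tz = "UTC")

# Assign every timestamp to the first day of its month, then count rows per month
month_start <- cut(as.Date(stamps), breaks = "month")
counts <- table(month_start)

# Monthly series starting in January 2018
demand <- ts(as.vector(counts), start = c(2018, 1), frequency = 12)
```

The factor levels produced by `cut()` are the month start dates, so the counts come out in chronological order.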

Since you are already reading your data into a data.table using fread(), why not stick with data.table?

code

#set Date as POSIXct
coffeedata[, Date := as.POSIXct( Date, format = "%d.%m.%Y %H:%M" ) ]
#summarise to monthly total of lines
coffeedata[, .( total = .N ), by = .( format( Date, "%Y-%m" ) )]

output

#     format total
# 1: 2020-05  1223
# 2: 2020-04  1666
# 3: 2020-03  2174
# 4: 2020-02  2660
# 5: 2020-01  2948
# 6: 2019-12  2127
# 7: 2019-11  2921
# 8: 2019-10  2954
# 9: 2019-09  2827
# 10: 2019-08  2310
# 11: 2019-07  2940
# 12: 2019-06  2544
# 13: 2019-05  2984
# 14: 2019-04  2676
# 15: 2019-03  2810
# 16: 2019-02  2660
# 17: 2019-01  2948
# 18: 2018-12  2127
# 19: 2018-11  2921
# 20: 2018-10  3291
# 21: 2018-09  2020
# 22: 2018-08  1771
# 23: 2018-07  2576
# 24: 2018-06  2569
# 25: 2018-05  2553
# 26: 2018-04  1990
# 27: 2018-03  2472
# 28: 2018-02  2091
# 29: 2018-01   892
#      format total
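The monthly totals above are listed in the order the months first appear in the file, here newest first, so they need to be sorted chronologically before being handed to `ts()`. A minimal sketch in base R, using only the three oldest rows of the output above and assuming that no month is missing from the result:

```r
# Three rows copied from the output above, in the same (newest first) order
monthly <- data.frame(format = c("2018-03", "2018-02", "2018-01"),
                      total  = c(2472, 2091, 892),
                      stringsAsFactors = FALSE)

# "YYYY-MM" strings sort chronologically when sorted as text
monthly <- monthly[order(monthly$format), ]

# Derive the ts start (year, month) from the first label
start_ym <- as.integer(strsplit(monthly$format[1], "-")[[1]])
demand <- ts(monthly$total, start = start_ym, frequency = 12)
```

If a month had no rows at all, it would be absent from the table and the series would silently shift, so it is worth checking that the number of rows matches the number of months in the date range.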
