R time series from day to month
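I am new to R and need help with time series in R. I have daily data that I want to summarise by month. How do I aggregate the days into months? In the end, I want to use the ts with a monthly frequency.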
library(data.table)
#read csv "ProductList-B1-134564"
coffeedata = fread("https://raw.githubusercontent.com/Skruff80/FFHS/master/KantineWII/ProductList-B1-134564_w3_original.csv")
#Formating date
coffeedata$Date = as.Date(coffeedata$Date, "%d.%m.%Y")
#dailycounter
countcoffee = function(timeStamps) {
  Dates = as.Date(strftime(coffeedata$Date, "%Y-%m-%d"))
  allDates = seq(from = min(Dates), to = max(Dates), by = "day")
  coffee.count = sapply(allDates, FUN = function(X) sum(Dates == X))
  data.frame(day = allDates, coffee.count = coffee.count)
}
dailycounter = countcoffee(df$coffee.date)
#time series
demand <- ts(dailycounter$coffee.count, start = c(2018, 1), frequency = 365)
plot(demand, type="l", xlab = "Year", col = "orange", lwd = "1", ylab = "Products a day")
Thank you for your help.
You can group by month and sum the daily counts for each month, then model the result as a time series, as shown below:
library(data.table)
library(tidyverse)
library(lubridate) # date helpers; not attached by tidyverse before version 2.0.0
# read csv "ProductList-B1-134564"
coffeedata = fread("https://raw.githubusercontent.com/Skruff80/FFHS/master/KantineWII/ProductList-B1-134564_w3_original.csv")
# format the date column
coffeedata$Date = as.Date(coffeedata$Date, "%d.%m.%Y")
# daily counter: rows per calendar day, with 0 for days that have no rows
countcoffee = function(timeStamps) {
  Dates = as.Date(strftime(timeStamps, "%Y-%m-%d"))
  allDates = seq(from = min(Dates), to = max(Dates), by = "day")
  coffee.count = sapply(allDates, FUN = function(X) sum(Dates == X))
  data.frame(day = allDates, coffee.count = coffee.count)
}
dailycounter = countcoffee(coffeedata$Date)
# group by calendar month (year and month), so that the same month
# of different years is not merged into one group
monthlycounter <- dailycounter %>%
  group_by(month = floor_date(day, "month")) %>%
  summarise(coffee.count_month = sum(coffee.count))
# monthly time series
demand <- ts(monthlycounter$coffee.count_month, start = c(2018, 1), frequency = 12)
plot(demand, type = "l", xlab = "Year", col = "orange", lwd = 1, ylab = "Products per month")
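The grouping step can be checked without downloading the CSV. Below is a minimal sketch on synthetic data, using only base R; the `daily` data frame and its values are made up for illustration. It keys every day by a year-month string, so January 2018 and January 2019 stay separate observations, and then builds a monthly `ts`.

```r
# Synthetic daily counts covering two full years (values are invented)
days  <- seq(as.Date("2018-01-01"), as.Date("2019-12-31"), by = "day")
daily <- data.frame(day = days, coffee.count = rep(2L, length(days)))

# Key each day by year AND month, e.g. "2018-01"
daily$ym <- format(daily$day, "%Y-%m")

# Sum the daily counts within each calendar month
monthly <- aggregate(coffee.count ~ ym, data = daily, FUN = sum)

# aggregate() returns the groups sorted by ym, which is chronological
# for "YYYY-MM" strings, so the values can go straight into ts()
demand <- ts(monthly$coffee.count, start = c(2018, 1), frequency = 12)

nrow(monthly)     # 24 months, one row per calendar month
frequency(demand) # 12
```

Grouping by the month number alone (1 to 12) would collapse these 24 rows into 12, which is usually not what a monthly series over several years should contain.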
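You can use a few other packages to simplify your work. I suggest installing tidyverse for this, as it comes with many useful packages. Manipulating, grouping and visualising data then becomes much easier and more straightforward.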
library(readr) # for reading data
library(dplyr) # for manipulating data
library(lubridate) # for date and time manipulation
library(ggplot2) # for graphics
# load data
coffee <- read_csv2("https://raw.githubusercontent.com/Skruff80/FFHS/master/KantineWII/ProductList-B1-134564_w3_original.csv")
# make a date
coffee$Date <- parse_date_time(coffee$Date, "d.m.Y H:M")
# make a plot from month data:
# 1. group data by month
# 2. count # of coffees (in that month)
# 3. create a plot
coffee %>%
  group_by(Month = floor_date(Date, "month")) %>%
  count() %>%
  ggplot(aes(Month, n)) + geom_col()
If you want the data coloured by product category, you only need to adjust the code like this:
coffee %>%
  group_by(Month = floor_date(Date, "month"), Product_Category) %>%
  count() %>%
  ggplot(aes(Month, n, fill = Product_Category)) + geom_col()
Note that, for simplicity, I did not label the axes etc., but I hope this code helps.
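Since the question asks for a `ts` with monthly frequency, the monthly counts from this kind of pipeline can also be passed to `ts()`. The following is a minimal sketch under stated assumptions: the small `coffee` data frame is a synthetic stand-in for the real file (timestamps invented), and every month in the range is assumed to have at least one row.

```r
library(dplyr)
library(lubridate)

# Synthetic stand-in for the coffee data: one row per sale
coffee <- data.frame(
  Date = ymd_hm(c("2018-01-03 08:15", "2018-01-20 09:30",
                  "2018-02-11 10:00",
                  "2018-03-05 07:45", "2018-03-06 12:10", "2018-03-30 16:20"))
)

# Count sales per calendar month and keep the rows in chronological order
monthly <- coffee %>%
  count(Month = floor_date(Date, "month")) %>%
  arrange(Month)

# Start the series at the year and month of the first row
first  <- min(monthly$Month)
demand <- ts(monthly$n, start = c(year(first), month(first)), frequency = 12)
demand
```

A month without any sales would simply be missing from `monthly`, and `ts()` would then shift all later values by one position, so such gaps need to be filled with zeros before this step.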
Since you are already using fread() to read the data in data.table format, why not stick with data.table?
Code
#set Date as posixCT
coffeedata[, Date := as.POSIXct( Date, format = "%d.%m.%Y %H:%M" ) ]
#summarise to monthly total of lines
coffeedata[, .( total = .N ), by = .( format( Date, "%Y-%m" ) )]
Output
# format total
# 1: 2020-05 1223
# 2: 2020-04 1666
# 3: 2020-03 2174
# 4: 2020-02 2660
# 5: 2020-01 2948
# 6: 2019-12 2127
# 7: 2019-11 2921
# 8: 2019-10 2954
# 9: 2019-09 2827
# 10: 2019-08 2310
# 11: 2019-07 2940
# 12: 2019-06 2544
# 13: 2019-05 2984
# 14: 2019-04 2676
# 15: 2019-03 2810
# 16: 2019-02 2660
# 17: 2019-01 2948
# 18: 2018-12 2127
# 19: 2018-11 2921
# 20: 2018-10 3291
# 21: 2018-09 2020
# 22: 2018-08 1771
# 23: 2018-07 2576
# 24: 2018-06 2569
# 25: 2018-05 2553
# 26: 2018-04 1990
# 27: 2018-03 2472
# 28: 2018-02 2091
# 29: 2018-01 892
# format total
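The output above lists the months newest first, while `ts()` expects the observations oldest first. A minimal sketch of that last step follows, using a small made-up table; the column name `month` and the values are assumptions for illustration (in the output above the grouping column is printed as `format`).

```r
library(data.table)

# Synthetic stand-in for the monthly totals, newest month first (values invented)
monthly <- data.table(
  month = c("2018-03", "2018-02", "2018-01"),
  total = c(30L, 20L, 10L)
)

# Sort ascending by the "YYYY-MM" key so the series runs oldest to newest
setorder(monthly, month)

# Derive the start year and month from the first key, e.g. c(2018, 1)
first_ym <- as.integer(strsplit(monthly$month[1], "-")[[1]])

# Monthly time series
demand <- ts(monthly$total, start = first_ym, frequency = 12)
demand
```

Sorting on the "YYYY-MM" string works here because that format orders lexicographically in the same way as chronologically.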