
R time series from day to month

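I am new to R and need help with time series in R. I have daily data that I want to aggregate by month. How can I sum the days up into months? In the end, I want to use a ts with monthly frequency.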

library(data.table)

#read csv "ProductList-B1-134564"
coffeedata = fread("https://raw.githubusercontent.com/Skruff80/FFHS/master/KantineWII/ProductList-B1-134564_w3_original.csv")

# Formatting date
coffeedata$Date = as.Date(coffeedata$Date, "%d.%m.%Y")

# daily counter: number of rows per day, including days without any rows
countcoffee = function(timeStamps) {
  Dates = as.Date(timeStamps)
  allDates = seq(from = min(Dates), to = max(Dates), by = "day")
  coffee.count = sapply(allDates, FUN = function(X) sum(Dates == X))
  data.frame(day = allDates, coffee.count = coffee.count)
}
dailycounter = countcoffee(coffeedata$Date)

# time series
demand <- ts(dailycounter$coffee.count, start = c(2018, 1), frequency = 365)
plot(demand, type = "l", xlab = "Year", col = "orange", lwd = 1, ylab = "Products a day")

Thank you for your help.

You can group by month and sum the daily counts within each month, then model the result as a time series, like this:

library(data.table)
library(tidyverse)
library(lubridate) # for floor_date(); not attached by older tidyverse versions

# read csv "ProductList-B1-134564"
coffeedata = fread("https://raw.githubusercontent.com/Skruff80/FFHS/master/KantineWII/ProductList-B1-134564_w3_original.csv")

# Formatting date
coffeedata$Date = as.Date(coffeedata$Date, "%d.%m.%Y")

# daily counter: number of rows per day, including days without any rows
countcoffee = function(timeStamps) {
  Dates = as.Date(timeStamps)
  allDates = seq(from = min(Dates), to = max(Dates), by = "day")
  coffee.count = sapply(allDates, FUN = function(X) sum(Dates == X))
  data.frame(day = allDates, coffee.count = coffee.count)
}
dailycounter = countcoffee(coffeedata$Date)

# group by the first day of each calendar month (year and month together),
# so that e.g. January 2018 and January 2019 are not added up
monthlycounter <- dailycounter %>%
  group_by(month = floor_date(day, "month")) %>%
  summarise(coffee.count_month = sum(coffee.count))

demand <- ts(monthlycounter$coffee.count_month, start = c(2018, 1), frequency = 12)
plot(demand, type = "l", xlab = "Year", col = "orange", lwd = 1, ylab = "Products per month")
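
If you would rather not load any packages, the same monthly aggregation can be sketched in base R. The example below uses made-up daily counts (one sale per day, an assumption purely for illustration) instead of the real CSV, and the names daily, ym and monthly are hypothetical. It also derives the start of the series from the first month in the data rather than hard-coding it.

```r
# Made-up daily counts: one sale on every day of 2018 and 2019 (assumption)
day <- seq(as.Date("2018-01-01"), as.Date("2019-12-31"), by = "day")
daily <- data.frame(day = day, coffee.count = 1L)

# Year-month key; "YYYY-MM" strings sort chronologically
daily$ym <- format(daily$day, "%Y-%m")

# Sum the daily counts within each year-month
monthly <- aggregate(coffee.count ~ ym, data = daily, FUN = sum)

# Take year and month of the first row as the start of the series
first <- as.integer(strsplit(as.character(monthly$ym[1]), "-")[[1]])
demand <- ts(monthly$coffee.count, start = first, frequency = 12)
```

Because the key combines year and month, January 2018 and January 2019 end up in separate rows, which is what a monthly ts needs.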

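You can use a few other packages to make your work easier. I recommend installing the tidyverse for this, which comes with many useful packages. Manipulating, grouping and visualizing data then becomes much easier and more straightforward.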

library(readr)     # for reading data
library(dplyr)     # for manipulating data
library(lubridate) # for date and time manipulation
library(ggplot2)   # for graphics

# load data
coffee <- read_csv2("https://raw.githubusercontent.com/Skruff80/FFHS/master/KantineWII/ProductList-B1-134564_w3_original.csv")

# make a date
coffee$Date <- parse_date_time(coffee$Date, "d.m.Y H:M")

# make a plot from month data:
# 1. group data by month
# 2. count # of coffees (in that month)
# 3. create a plot
coffee %>%
  group_by(Month = floor_date(Date, "month")) %>%
  count() %>%
  ggplot(aes(Month, n)) + geom_col()

If you want the bars colored by product category, just adjust the code like this:

coffee %>%
  group_by(Month = floor_date(Date, "month"), Product_Category) %>%
  count() %>%
  ggplot(aes(Month, n, fill = Product_Category)) + geom_col()

Note that, for simplicity, I did not label the axes and so on, but I hope this code helps.
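
Since the question asks for a ts with monthly frequency, here is a rough sketch of how the monthly counts from the pipeline above might be turned into one. It runs on a tiny made-up data frame (an assumption, one row per sale) rather than the real CSV, and the names monthly and demand are only illustrative. One caveat: count() drops months without any rows, so this only gives a correct series if every month in the range has at least one sale.

```r
library(dplyr)
library(lubridate)

# Made-up stand-in for the coffee data (assumption): one row per sale
coffee <- data.frame(
  Date = as.POSIXct(c("2018-01-05 08:15", "2018-01-20 09:00",
                      "2018-02-03 10:30", "2018-03-11 07:45",
                      "2018-03-12 12:00", "2018-03-30 16:20"), tz = "UTC")
)

# Number of rows per calendar month, in chronological order
monthly <- coffee %>%
  count(Month = floor_date(Date, "month")) %>%
  arrange(Month)

# Monthly time series starting at the first month found in the data
demand <- ts(monthly$n,
             start = c(year(min(monthly$Month)), month(min(monthly$Month))),
             frequency = 12)
```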

Since you are already reading the data into a data.table with fread(), why not stick with data.table?

Code

#set Date as posixCT
coffeedata[, Date := as.POSIXct( Date, format = "%d.%m.%Y %H:%M" ) ]
#summarise to monthly total of lines
coffeedata[, .( total = .N ), by = .( format( Date, "%Y-%m" ) )]

output

#     format total
# 1: 2020-05  1223
# 2: 2020-04  1666
# 3: 2020-03  2174
# 4: 2020-02  2660
# 5: 2020-01  2948
# 6: 2019-12  2127
# 7: 2019-11  2921
# 8: 2019-10  2954
# 9: 2019-09  2827
# 10: 2019-08  2310
# 11: 2019-07  2940
# 12: 2019-06  2544
# 13: 2019-05  2984
# 14: 2019-04  2676
# 15: 2019-03  2810
# 16: 2019-02  2660
# 17: 2019-01  2948
# 18: 2018-12  2127
# 19: 2018-11  2921
# 20: 2018-10  3291
# 21: 2018-09  2020
# 22: 2018-08  1771
# 23: 2018-07  2576
# 24: 2018-06  2569
# 25: 2018-05  2553
# 26: 2018-04  1990
# 27: 2018-03  2472
# 28: 2018-02  2091
# 29: 2018-01   892
#      format total
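
The table above is in the order the months first appear in the file (newest first), so it needs sorting before it goes into ts(). A minimal sketch of that last step, again on made-up rows (an assumption) and with the hypothetical column name ym in place of the default format:

```r
library(data.table)

# Made-up stand-in for coffeedata (assumption): one row per sale, unsorted
coffeedata <- data.table(
  Date = as.POSIXct(c("2018-03-11 07:45", "2018-01-05 08:15",
                      "2018-02-03 10:30", "2018-01-20 09:00"), tz = "UTC")
)

# Monthly totals, keyed by a "YYYY-MM" string
monthly <- coffeedata[, .(total = .N), by = .(ym = format(Date, "%Y-%m"))]

# Sort chronologically (the strings sort in date order)
setorder(monthly, ym)

# Year and month of the first row give the start of the series
first <- as.integer(strsplit(monthly$ym[1], "-")[[1]])
demand <- ts(monthly$total, start = first, frequency = 12)
```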

