Plot time series of different years together
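I am trying to compare a variable across different years, but I cannot plot them together. The time series is a temperature series, available at https://github.com/gonzalodqa/timeseries as temp.csv. I would like to plot something like the image, but I find it difficult to subset the months between years and then combine the lines in the same plot under the same months.
If anyone can give me some advice or point me in the right direction, I would really appreciate it.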
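You can try this approach.
The first chart shows all the available temperatures; the second is aggregated by month.
In the first chart we force every observation into the same year so that ggplot plots them aligned, and we separate the lines by colour.
For the second, we simply use month as the x variable and year as the colour variable.
Note that:
with scale_x_datetime we can hide the year, so nobody can see that we forced the year 2020 onto every observation;
with scale_x_continuous we can show the month names instead of the numbers
[try running the charts with and without scale_x_... to see what I mean].
month.abb is a handy built-in constant holding the abbreviated month names.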
# read data
df <- readr::read_csv2("https://raw.githubusercontent.com/gonzalodqa/timeseries/main/temp.csv")

# libraries
library(ggplot2)
library(dplyr)

# line chart by datetime
df %>%
  # make datetime: force unique year
  mutate(datetime = lubridate::make_datetime(2020, month, day, hour, minute, second)) %>%
  ggplot() +
  geom_line(aes(x = datetime, y = T42, colour = factor(year))) +
  scale_x_datetime(breaks = lubridate::make_datetime(2020, 1:12), labels = month.abb) +
  labs(title = "Temperature by Datetime", colour = "Year")

# line chart by month
df %>%
  # average by year-month
  group_by(year, month) %>%
  summarise(T42 = mean(T42, na.rm = TRUE), .groups = "drop") %>%
  ggplot() +
  geom_line(aes(x = month, y = T42, colour = factor(year))) +
  scale_x_continuous(breaks = 1:12, labels = month.abb, minor_breaks = NULL) +
  labs(title = "Average Temperature by Month", colour = "Year")
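To see what forcing the year does, here is a minimal sketch on made-up values. The toy data frame and its numbers are hypothetical and not taken from temp.csv; it only assumes lubridate is installed.

```r
library(lubridate)

# Toy data (hypothetical values): the same two calendar dates in two years
toy <- data.frame(
  year  = c(2018, 2018, 2019, 2019),
  month = c(1, 7, 1, 7),
  day   = c(15, 15, 15, 15),
  T42   = c(3.1, 18.4, 2.7, 19.0)
)

# Replace the real year with a fixed one.
# 2020 is a leap year, so 29 February remains a valid date.
toy$datetime <- make_datetime(2020, toy$month, toy$day)

# Rows from different years now share the same x position
toy$datetime[1] == toy$datetime[3]  # TRUE
```

The real year is still in the year column, which is why it can be mapped to colour while datetime is used only for the position on the x axis.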
If you want the chart to start from July, you can use the following code:
months_order <- c(7:12, 1:6)

# line chart by month
df %>%
  # average by year-month
  group_by(year, month) %>%
  summarise(T42 = mean(T42, na.rm = TRUE), .groups = "drop") %>%
  # create new groups starting from each July
  group_by(neworder = cumsum(month == 7)) %>%
  # keep only complete years
  filter(n() == 12) %>%
  # give new names to groups
  mutate(years = paste(unique(year), collapse = " / ")) %>%
  ungroup() %>%
  # reorder months
  mutate(month = factor(month, levels = months_order, labels = month.abb[months_order], ordered = TRUE)) %>%
  # plot
  ggplot() +
  geom_line(aes(x = month, y = T42, colour = years, group = years)) +
  labs(title = "Average Temperature by Month", colour = "Year")
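The cumsum(month == 7) step is the part that is easy to miss, so here is a small sketch of it in isolation. The monthly table is a hypothetical stand-in for the summarised data and assumes rows are sorted by year and month, which is how summarise returns them.

```r
library(dplyr)

# Toy monthly table (hypothetical): Jan 2018 to Dec 2019, sorted by year and month
monthly <- data.frame(
  year  = rep(2018:2019, each = 12),
  month = rep(1:12, times = 2)
)

seasons <- monthly %>%
  # the counter increases by one at every July, so each July opens a new group
  group_by(neworder = cumsum(month == 7)) %>%
  # drop partial groups (Jan-Jun 2018 and Jul-Dec 2019 in this toy table)
  filter(n() == 12) %>%
  # label each group with the calendar years it spans
  mutate(years = paste(unique(year), collapse = " / ")) %>%
  ungroup()

unique(seasons$years)  # "2018 / 2019"
```

Only the one complete July-to-June block survives the filter, and its label is built from the two calendar years it covers.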
To order the months differently and summarise the values over several years, you have to work on the data a bit before plotting:
library(dplyr) # work data
library(ggplot2) # plots
library(lubridate) # date
library(readr) # fetch data
# your data
df <- read_csv2("https://raw.githubusercontent.com/gonzalodqa/timeseries/main/temp.csv")
df %>%
  mutate(date = make_date(year, month, day)) %>%
  # reorder months to start in July; month.abb gives locale-independent labels
  group_by(month_2 = factor(month.abb[month(date)],
                            levels = month.abb[c(7:12, 1:6)]),
           # group years as you like
           year_2 = ifelse(year(date) %in% 2018:2019, '2018/2019', '2020/2021')) %>%
  # you can put whatever aggregation function you need
  summarise(val = mean(T42, na.rm = TRUE), .groups = "drop") %>%
  # plot it!
  ggplot(aes(x = month_2, y = val, color = year_2, group = year_2)) +
  geom_line() +
  ylab('T42') +
  xlab('month') +
  theme_light()
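The year_2 grouping above pools whole calendar years. If you would rather have each line run from July to the following June, one possible variant is to derive the label from year and month. This is a sketch on made-up rows and my own assumption about what "group years as you like" might look like, not part of the answer's code.

```r
# Toy rows (hypothetical): June and July of two consecutive years
toy <- data.frame(year = c(2018, 2018, 2019, 2019), month = c(6, 7, 6, 7))

# Months from July onwards belong to the season that starts in that year;
# January to June belong to the season that started the year before
start <- ifelse(toy$month >= 7, toy$year, toy$year - 1)
toy$year_2 <- paste(start, start + 1, sep = "/")

# Month factor whose levels run Jul..Jun, built from the built-in month.abb
toy$month_2 <- factor(month.abb[toy$month], levels = month.abb[c(7:12, 1:6)])

toy
```

With this labelling, July 2018 and June 2019 end up in the same group ("2018/2019"), so the line is continuous across the new year.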
A slightly different solution, without the trick of forcing all dates to 2020.
library(tidyverse)
library(lubridate)
df <- read_csv2("https://raw.githubusercontent.com/gonzalodqa/timeseries/main/temp.csv")
df <- df |>
  filter(year %in% c(2018, 2019, 2020)) |>
  mutate(year = factor(year),
         # zero-pad month and day to two digits so the labels sort chronologically
         month_day = sprintf("%02d-%02d", as.integer(month), as.integer(day)))

df |> ggplot(aes(x = month_day, y = T42, group = year, col = year)) +
  geom_line() +
  scale_x_discrete(breaks = c("01-01", "02-01", "03-01", "04-01", "05-01", "06-01", "07-01", "08-01", "09-01", "10-01", "11-01", "12-01"))
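The month_day labels are character strings, so the discrete axis orders them as text. They only sort chronologically when every month and every day has exactly two digits. A minimal sketch on made-up values, using base R sprintf for the padding:

```r
# Unpadded labels sort as text, so "1-10" lands before "1-2"
unpadded <- sort(c("1-2", "1-10", "12-1"), method = "radix")
unpadded

# Zero-padded labels sort in calendar order
month <- c(1, 1, 12)
day   <- c(2, 10, 1)
month_day <- sprintf("%02d-%02d", as.integer(month), as.integer(day))
padded <- sort(month_day, method = "radix")
padded
```

The same padded format is what makes breaks such as "01-01" or "07-01" match the first day of each month on the axis.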