
Plot time series of different years together

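I am trying to compare a variable across different years, but I cannot plot the years together. The time series is a temperature series, available at https://github.com/gonzalodqa/timeseries as temp.csv. I would like to plot something like the image, but I find it difficult to subset the months that span two years and then draw those lines in the same plot, aligned on the same months.

If anyone can give me some advice or point me in the right direction, I would really appreciate it.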

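You can try this approach.

The first chart shows all available temperatures; the second chart is aggregated by month.

In the first chart, we force every observation into the same year so that ggplot draws the series aligned, but we separate the lines by colour.

For the second, we simply use month as the x variable and year as the colour variable.

Notes:

  • With scale_x_datetime we can hide the year, so nobody can see that we forced 2020 onto every observation
  • With scale_x_continuous we can show month names instead of numbers

[Try running the charts with and without scale_x_... to see what I mean.]

month.abb is a handy built-in constant holding the abbreviated month names.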
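The year-forcing step described above can be checked on a tiny table before touching the real file. The sketch below is only an illustration: the `obs` data frame and its values are made up, and it uses base R's `ISOdatetime()` in place of `lubridate::make_datetime()` so it runs without extra packages.

```r
# Made-up observations from two different years (not the real temp.csv data)
obs <- data.frame(
  year   = c(2018, 2019, 2019),
  month  = c(3, 3, 12),
  day    = c(15, 15, 31),
  hour   = c(6, 6, 23),
  minute = c(0, 0, 30),
  second = c(0, 0, 0),
  T42    = c(11.2, 12.8, 4.1)
)

# Rebuild the timestamp with a fixed year so every series shares one x axis
obs$datetime <- ISOdatetime(2020, obs$month, obs$day,
                            obs$hour, obs$minute, obs$second, tz = "UTC")

# Month-start positions paired with month.abb, i.e. the breaks/labels
# that scale_x_datetime() would be given
month_breaks <- ISOdatetime(2020, 1:12, 1, 0, 0, 0, tz = "UTC")
axis_guide <- data.frame(label = month.abb, at = month_breaks)

# Rows 1 and 2 come from different years but now share the same datetime
print(obs[, c("year", "datetime", "T42")])
print(head(axis_guide, 3))
```

Because 2020 is a leap year, a 29 February observation from any other leap year still maps to a valid date after the year is replaced.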

# read data
df <- readr::read_csv2("https://raw.githubusercontent.com/gonzalodqa/timeseries/main/temp.csv")


# libraries
library(ggplot2)
library(dplyr)


# line chart by datetime
df %>% 
  # make datetime: force unique year
  mutate(datetime = lubridate::make_datetime(2020, month, day, hour, minute, second)) %>% 
  
  ggplot() +
  geom_line(aes(x = datetime, y = T42, colour = factor(year))) +
  scale_x_datetime(breaks = lubridate::make_datetime(2020,1:12), labels = month.abb) +
  labs(title = "Temperature by Datetime", colour = "Year")

# line chart by month
df %>% 
  
  # average by year-month
  group_by(year, month) %>% 
  summarise(T42 = mean(T42, na.rm = TRUE), .groups = "drop") %>% 
  
  ggplot() +
  geom_line(aes(x = month, y = T42, colour = factor(year))) +
  scale_x_continuous(breaks = 1:12, labels = month.abb, minor_breaks = NULL) +
  labs(title = "Average Temperature by Month", colour = "Year")


If you want the chart to start in July, you can use the following code:

months_order <- c(7:12,1:6)

# line chart by month
df %>% 
  
  # average by year-month
  group_by(year, month) %>% 
  summarise(T42 = mean(T42, na.rm = TRUE), .groups = "drop") %>% 
    
  # create new groups starting from each July
  group_by(neworder = cumsum(month == 7)) %>% 
    
  # keep only complete years
  filter(n() == 12) %>% 
    
  # give new names to groups
  mutate(years = paste(unique(year), collapse = " / ")) %>% 
  ungroup() %>% 
  
  # reorder months
  mutate(month = factor(month, levels = months_order, labels = month.abb[months_order], ordered = TRUE)) %>% 
      
  # plot
  ggplot() +
  geom_line(aes(x = month, y = T42, colour = years, group = years)) +
  labs(title = "Average Temperature by Month", colour = "Year")
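The grouping step in the code above, `cumsum(month == 7)`, may be easier to follow on a small grid. This is a minimal base R sketch on a made-up two-year table (the `ym` data frame is hypothetical, not the real data); it mirrors the `group_by()` and `filter(n() == 12)` steps without dplyr.

```r
# Made-up year-month grid: Jan 2018 to Dec 2019 (24 rows, month varies fastest)
ym <- expand.grid(month = 1:12, year = 2018:2019)

# The counter increases by one at every July, so each July starts a new group
ym$neworder <- cumsum(ym$month == 7)

# Number of months in each group (the base R analogue of n() per group)
ym$n_months <- ave(ym$month, ym$neworder, FUN = length)
print(table(ym$neworder))

# Keep only complete July-to-June groups, mirroring filter(n() == 12)
complete <- ym[ym$n_months == 12, ]
print(complete[, c("year", "month", "neworder")])
```

In this grid the months before the first July and after the last June form incomplete groups and are dropped, which is why the filtered chart only shows full July-to-June seasons.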

To order the months differently and summarise the values across several years, you have to process the data a little before plotting it:

library(dplyr)     # work data
library(ggplot2)   # plots
library(lubridate) # date
library(readr)     # fetch data

# your data
df <- read_csv2("https://raw.githubusercontent.com/gonzalodqa/timeseries/main/temp.csv")


df %>%
  mutate(date = make_date(year, month, day)) %>%
  # reorder months so the axis starts in July;
  # month.abb gives English abbreviations regardless of the system locale
  group_by(month_2 = factor(month.abb[month(date)], levels = month.abb[c(7:12, 1:6)]),
           # group years as you like
           year_2  = ifelse(year(date) %in% 2018:2019, '2018/2019', '2020/2021')) %>%
  # you can put whatever aggregation function you need
  summarise(val = mean(T42, na.rm = TRUE), .groups = "drop") %>%
  # plot it!
  ggplot(aes(x = month_2, y = val, color = year_2, group = year_2)) +
  geom_line() +
  ylab('T42') +
  xlab('month') +
  theme_light()
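The month order on the x axis comes entirely from the factor levels, since ggplot draws a discrete axis in level order. As a small sketch (the vector `m` is made up), the levels can be built from base R's `month.abb`, which holds English abbreviations whatever the system locale is:

```r
# Month numbers as they might appear in a `month` column (made-up values)
m <- c(1, 7, 12, 6, 8)

# July-first ordering of the English abbreviations
jul_first <- month.abb[c(7:12, 1:6)]

# Index month.abb by month number, then fix the level order
month_2 <- factor(month.abb[m], levels = jul_first)

print(levels(month_2))
# Sorting a factor follows the level order, not the alphabet
print(sort(month_2))
```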


A slightly different solution, without the trick of forcing every date into 2020.

library(tidyverse)
library(lubridate)
df <- read_csv2("https://raw.githubusercontent.com/gonzalodqa/timeseries/main/temp.csv")
df <- df |>
  filter(year %in% c(2018, 2019, 2020)) |>
  mutate(year = factor(year),
         # zero-pad month and day to two digits so the labels sort in calendar order
         month_day = sprintf("%02d-%02d", as.integer(month), as.integer(day)))
df |> ggplot(aes(x = month_day, y = T42, group = year, col = year)) +
  geom_line() +
  # label only the first day of each month
  scale_x_discrete(breaks = sprintf("%02d-01", 1:12))
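Because the month-day label is a character string on a discrete axis, its order is alphabetical rather than chronological, so consistent two-digit padding matters. The sketch below uses made-up `month` and `day` vectors to compare unpadded and padded labels; `method = "radix"` is used only to make the unpadded sort independent of the system locale.

```r
# Made-up month/day pairs covering one- and two-digit values
month <- c(1, 1, 1, 10)
day   <- c(2, 10, 1, 5)

# Unpadded labels: alphabetical order puts "1-10" before "1-2"
unpadded <- paste(month, day, sep = "-")
print(sort(unpadded, method = "radix"))

# Two-digit zero padding makes alphabetical order match calendar order
padded <- sprintf("%02d-%02d", as.integer(month), as.integer(day))
print(sort(padded))
```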
 
