
R: plot a time series of multiple variables summed by year

This is my first attempt at time-series plotting. I have a dataset of about 50,000 rows spanning multiple years, structured as follows.

Year    expense_1   expense_2   expense_3   expense_4
1999    5           NA          NA          31.82
2000    2           NA          NA          4.75
1999    10.49       NA          NA          NA
2000    39.69       NA          NA          NA
2000    NA          NA          10.61       NA
1999    8.08        NA          NA          NA
2000    16          NA          NA          NA
1999    9.32        NA          NA          NA
1999    9.35        NA          NA          NA

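Now I want to plot a time series of this data with Year on the x-axis and expense on the y-axis, with a separate line for each of expense_1, expense_2, expense_3 and expense_4. The expenses in each category should be summed by year, and NA values should be ignored.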

You can compute the sums with summarise_all, then convert your data to long format, which makes it easier to plot with ggplot.

library(tidyverse)
library(scales)

df <- read.table(text = "Year    expense_1   expense_2   expense_3   expense_4
1999    5           NA          NA          31.82
                 2000    2           NA          NA          4.75
                 1999    10.49       NA          NA          NA
                 2000    39.69       NA          NA          NA
                 2000    NA          NA          10.61       NA
                 1999    8.08        NA          NA          NA
                 2000    16          NA          NA          NA
                 1999    9.32        NA          NA          NA
                 1999    9.35        NA          NA          NA",
                 header = TRUE, stringsAsFactors = FALSE)

# define a summation function that returns NA if all values are NA
# (sum(x, na.rm = TRUE) on its own returns 0 when every value is NA)
sum_NA <- function(x) {
  if (all(is.na(x))) NA_real_ else sum(x, na.rm = TRUE)
}

df_long <- df %>% 
  group_by(Year) %>% 
  summarise_all(sum_NA) %>%   # funs() is deprecated since dplyr 0.8.0
  gather(key = "type", value = "expense", -Year)
df_long

#> # A tibble: 8 x 3
#>    Year type      expense
#>   <int> <chr>       <dbl>
#> 1  1999 expense_1   42.2 
#> 2  2000 expense_1   57.7 
#> 3  1999 expense_2   NA   
#> 4  2000 expense_2   NA   
#> 5  1999 expense_3   NA   
#> 6  2000 expense_3   10.6 
#> 7  1999 expense_4   31.8 
#> 8  2000 expense_4    4.75

ggplot(df_long, aes(x = Year, y = expense, color = type, group = type)) +
  geom_point() +
  geom_line() +
  scale_x_continuous(breaks = scales::pretty_breaks(n = 1)) +
  theme_bw()

Created on 2018-05-21 by the reprex package (v0.2.0).
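For readers on newer package versions, the same summarise-then-reshape step can be sketched with `across()` and `pivot_longer()`. This is only an illustration, assuming dplyr >= 1.0.0 and tidyr >= 1.0.0 are installed. The data frame is rebuilt by hand from the sample rows in the question, and `sum_NA` is the same helper idea as above.

```r
library(dplyr)
library(tidyr)

# Sample rows from the question, rebuilt by hand (assumption: same 9 rows)
df <- data.frame(
  Year      = c(1999, 2000, 1999, 2000, 2000, 1999, 2000, 1999, 1999),
  expense_1 = c(5, 2, 10.49, 39.69, NA, 8.08, 16, 9.32, 9.35),
  expense_2 = rep(NA_real_, 9),
  expense_3 = c(NA, NA, NA, NA, 10.61, NA, NA, NA, NA),
  expense_4 = c(31.82, 4.75, NA, NA, NA, NA, NA, NA, NA)
)

# Sum that stays NA when a group has no observed values at all
sum_NA <- function(x) {
  if (all(is.na(x))) NA_real_ else sum(x, na.rm = TRUE)
}

df_long <- df %>%
  group_by(Year) %>%
  summarise(across(everything(), sum_NA)) %>%   # one row per Year
  pivot_longer(-Year, names_to = "type", values_to = "expense")

df_long
```

The resulting `df_long` has the same columns as before (`Year`, `type`, `expense`), so the ggplot call above should work on it unchanged. Only the row order differs from the `gather` version.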

You can let ggplot do most of the work for you: just gather, then start plotting:

df %>%
  gather(expense, value, -Year) %>%
  ggplot(aes(x=Year, y=value, color=expense)) +
  geom_line(stat = "summary", fun = "sum")  # `fun` needs ggplot2 >= 3.3.0; older versions used `fun.y`
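Since the summing here happens inside the plot, it can help to check the yearly totals numerically as well. The following is a minimal base R sketch of such a check. It rebuilds the sample rows from the question by hand, and the name `totals` is just an illustrative choice.

```r
# Sample rows from the question, rebuilt by hand (assumption: same 9 rows)
df <- data.frame(
  Year      = c(1999, 2000, 1999, 2000, 2000, 1999, 2000, 1999, 1999),
  expense_1 = c(5, 2, 10.49, 39.69, NA, 8.08, 16, 9.32, 9.35),
  expense_2 = rep(NA_real_, 9),
  expense_3 = c(NA, NA, NA, NA, 10.61, NA, NA, NA, NA),
  expense_4 = c(31.82, 4.75, NA, NA, NA, NA, NA, NA, NA)
)

# Sum every expense column by Year, ignoring NA values.
# na.action = na.pass keeps rows that contain NA, so that
# na.rm = TRUE (passed on to sum) can drop the NAs per column.
totals <- aggregate(. ~ Year, data = df, FUN = sum,
                    na.rm = TRUE, na.action = na.pass)
totals
```

Note that in this check a category with no observed values in a year shows up as 0 rather than NA, because `sum(..., na.rm = TRUE)` of an empty set is 0.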
