Split time series data by month or quarter to compare mean and variance
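This is a follow-up to the question resolved above.
I have a dataset from Yahoo Finance covering roughly two years of daily Apple stock data. I now want to split the dataset by month or by quarter so that I can compare the mean and variance across periods (as a check for stationarity).
Here is the dataset: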
Date Adj.Close
1 2018-01-02 41.38024
2 2018-01-03 41.37303
3 2018-01-04 41.56522
4 2018-01-05 42.03845
5 2018-01-08 41.88231
6 2018-01-09 41.87751
7 2018-01-10 41.86789
8 2018-01-11 42.10571
9 2018-01-12 42.54050
10 2018-01-16 42.32431
11 2018-01-17 43.02335
12 2018-01-18 43.06179
13 2018-01-19 42.86961
14 2018-01-22 42.51889
15 2018-01-23 42.52850
16 2018-01-24 41.85107
17 2018-01-25 41.10399
18 2018-01-26 41.20008
19 2018-01-29 40.34730
20 2018-01-30 40.10948
21 2018-01-31 40.21999
22 2018-02-01 40.30407
23 2018-02-02 38.55526
24 2018-02-05 37.59198
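How can I do this? Thanks!
We can use as.yearmon/as.yearqtr from zoo to group the rows by month or quarter and compute summary statistics for each group. Note that the code reports the standard deviation (sd), which is the square root of the variance; use var(Adj.Close) if the variance itself is wanted.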
library(dplyr)
library(zoo)
# Group by calendar month and summarise each group
df %>%
    group_by(yearmonth = as.yearmon(Date)) %>%
    summarise(startDate = min(Date),
              avg = mean(Adj.Close), sd = sd(Adj.Close), .groups = 'drop') %>%
    select(-yearmonth)
Or by quarter, using 'yearqtr':
# Group by calendar quarter and summarise each group
df %>%
    group_by(yearqtr = as.yearqtr(Date)) %>%
    summarise(startDate = min(Date),
              avg = mean(Adj.Close), sd = sd(Adj.Close), .groups = 'drop') %>%
    select(-yearqtr)
# A tibble: 1 x 3
#  startDate    avg    sd
#  <date>     <dbl> <dbl>
#1 2018-01-02  41.4  1.35
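Because the question asks about variance, and the calls above drop the period label with select(), a variant that keeps the label and reports the variance directly may be easier to compare across periods. The following is a minimal sketch under the same dplyr and zoo assumptions; the column names period, n, avg and variance are illustrative choices, and the data frame is only a six-row subset of the sample data.

```r
library(dplyr)
library(zoo)

# Illustrative subset of the sample data (three January rows, three February rows)
df <- data.frame(
  Date = as.Date(c("2018-01-02", "2018-01-03", "2018-01-04",
                   "2018-02-01", "2018-02-02", "2018-02-05")),
  Adj.Close = c(41.38024, 41.37303, 41.56522,
                40.30407, 38.55526, 37.59198)
)

monthly <- df %>%
  group_by(period = as.yearmon(Date)) %>%   # swap in as.yearqtr(Date) for quarters
  summarise(n = n(),                        # number of trading days in the period
            avg = mean(Adj.Close),
            variance = var(Adj.Close),      # sample variance, i.e. sd^2
            .groups = "drop")

monthly
```

Keeping n in the result helps when judging the comparison, since a period with only a few observations gives a noisy variance estimate.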
# Data used in the examples above (dput output)
df <- structure(list(Date = structure(c(17533, 17534, 17535, 17536,
17539, 17540, 17541, 17542, 17543, 17547, 17548, 17549, 17550,
17553, 17554, 17555, 17556, 17557, 17560, 17561, 17562, 17563,
17564, 17567), class = "Date"), Adj.Close = c(41.38024, 41.37303,
41.56522, 42.03845, 41.88231, 41.87751, 41.86789, 42.10571, 42.5405,
42.32431, 43.02335, 43.06179, 42.86961, 42.51889, 42.5285, 41.85107,
41.10399, 41.20008, 40.3473, 40.10948, 40.21999, 40.30407, 38.55526,
37.59198)), row.names = c("1", "2", "3", "4", "5", "6", "7",
"8", "9", "10", "11", "12", "13", "14", "15", "16", "17", "18",
"19", "20", "21", "22", "23", "24"), class = "data.frame")
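If dplyr and zoo are not available, a similar grouping can be sketched in base R. This is an alternative technique, not the method from the answer: it swaps as.yearmon/as.yearqtr for format() and quarters(), and the column names month, quarter, avg and variance are assumptions made for illustration. The data frame is again a six-row subset of the sample data.

```r
# Illustrative subset of the sample data
df <- data.frame(
  Date = as.Date(c("2018-01-02", "2018-01-03", "2018-01-04",
                   "2018-02-01", "2018-02-02", "2018-02-05")),
  Adj.Close = c(41.38024, 41.37303, 41.56522,
                40.30407, 38.55526, 37.59198)
)

# Period keys: month as "YYYY-MM", quarter as "YYYY Qn"
df$month   <- format(df$Date, "%Y-%m")
df$quarter <- paste(format(df$Date, "%Y"), quarters(df$Date))

# Mean and sample variance per month
month_mean <- tapply(df$Adj.Close, df$month, mean)
month_var  <- tapply(df$Adj.Close, df$month, var)

by_month <- data.frame(month    = names(month_mean),
                       avg      = as.vector(month_mean),
                       variance = as.vector(month_var))
by_month
```

Replacing df$month with df$quarter in the two tapply() calls gives the per-quarter figures.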