Applying function to a subset of xts quantmod

I'm trying to get the standard deviation of a stock price by year, but I'm getting the same value for every year.

I tried dplyr (group_by, summarise) and also a custom function, but neither worked: both return the same value of 67.0.

The whole data frame is probably being passed without being subset. How can this be fixed?

library(quantmod)
library(tidyr)
library(dplyr)

#initial parameters
initialDate = as.Date('2010-01-01')
finalDate = Sys.Date()

ybeg = format(initialDate,"%Y")
yend = format(finalDate,"%Y")

ticker = "AAPL"

#getting stock prices
stock = getSymbols.yahoo(ticker, from=initialDate, auto.assign = FALSE)
stock = stock[,4] #working only with closing prices

With dplyr:

#Attempt 1 with dplyr - not working, all values by year return the same

stock = stock %>% zoo::fortify.zoo()
stock$Date = stock$Index
separate(stock, Date, c("year","month","day"), sep="-") %>% 
   group_by(year) %>%
   summarise(stdev= sd(stock[,2]))

# A tibble: 11 x 2
#   year  stdev
#   <chr> <dbl>
# 1 2010   67.0
# 2 2011   67.0
#....
#10 2019   67.0
#11 2020   67.0

And with a function:

#Attempt 2 with function - not working - returns only one value instead of multiple

#getting stock prices
stock = getSymbols.yahoo(ticker, from=initialDate, auto.assign = FALSE)
stock = stock[,4] #working only with closing prices

#subsetting
years = as.character(seq(ybeg,yend,by=1))
years

calculate_stdev = function(series,years) {
  series[years] #subsetting by years, to be equivalent as stock["2010"], stock["2011"] e.g.
  sd(series[years][,1]) #calculate stdev on closing prices of the current subset
}

yearly.stdev = calculate_stdev(stock,years)

> yearly.stdev
[1] 67.04185

I don't know dplyr, but here's how to do it with data.table:

library(data.table)

# convert data.frame to data.table
setDT(stock)

# convert your Date column with content like "2020-06-17" from character to Date type
stock[,Date:=as.Date(Date)]

# calculate sd(price) grouped by year, assuming here your price column is named "price"
stock[,sd(price),year(Date)]
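
As a self-contained illustration of the same grouping idiom, here is a minimal sketch on made-up data. The table `prices`, its `price` column, and the output column names `year` and `stdev` are assumptions for the example, not names from the question.

```r
library(data.table)

# Synthetic prices (hypothetical values, not real quotes)
prices <- data.table(
  Date  = c("2010-01-04", "2010-01-05", "2010-01-06",
            "2011-01-03", "2011-01-04", "2011-01-05"),
  price = c(1, 2, 3, 10, 20, 30)
)

# Convert the character dates to Date so that year() can be applied
prices[, Date := as.Date(Date)]

# One row per year; naming the columns makes the result easier to read
yearly_sd <- prices[, .(stdev = sd(price)), by = .(year = year(Date))]
print(yearly_sd)
```

Naming the grouping and result columns inside `.()` avoids the default `V1` column name that the unnamed form produces.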

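Don't refer to the data frame by name inside summarise(). sd(stock[,2]) is computed on the whole column regardless of the grouping, which is why every year shows 67.0. Use the bare column name instead, so that it is evaluated within each group: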

separate(stock, Date, c("year","month","day"), sep="-") %>% 
  group_by(year) %>% 
  summarise(stdev = sd(AAPL.Close)) # <-- here
# A tibble: 11 x 2
#   year  stdev
#   <chr> <dbl>
# 1 2010   5.37
# 2 2011   3.70
# 3 2012   9.57
# 4 2013   6.41
# 5 2014  13.4 
# 6 2015   7.68
# 7 2016   7.64
# 8 2017  14.6 
# 9 2018  20.6 
#10 2019  34.5 
#11 2020  28.7 
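
The difference between the two forms is easy to see on a small made-up data frame. This is only a sketch: `prices`, `close`, `wrong` and `right` are hypothetical names chosen for the example.

```r
library(dplyr)

# Synthetic prices (hypothetical values, not real AAPL quotes)
prices <- data.frame(
  year  = rep(c("2010", "2011"), each = 3),
  close = c(1, 2, 3, 10, 20, 30)
)

by_year <- prices %>%
  group_by(year) %>%
  summarise(
    wrong = sd(prices$close),  # looks up the full data frame: same value in every group
    right = sd(close)          # bare column name: only the rows of the current group
  )
print(by_year)
```

The `wrong` column repeats a single number for both years, while `right` gives a separate standard deviation for each group.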

Use apply.yearly() (a convenience wrapper around the more general period.apply()) to call a function on yearly subsets of the xts object returned by getSymbols().

You can use the Cl() function to extract the close column from objects returned by getSymbols().

stock = getSymbols("AAPL", from = "2010-01-01", auto.assign = FALSE)
apply.yearly(Cl(stock), sd)
##            AAPL.Close
## 2010-12-31   5.365208
## 2011-12-30   3.703407
## 2012-12-31   9.568127
## 2013-12-31   6.412542
## 2014-12-31  13.371293
## 2015-12-31   7.683550
## 2016-12-30   7.640743
## 2017-12-29  14.621191
## 2018-12-31  20.593861
## 2019-12-31  34.538978
## 2020-06-19  29.577157
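
To make the relationship between these functions concrete without downloading data, here is a minimal sketch on a synthetic xts series. The object `prices` and its values are made up. It also includes a loop over year strings, which is one possible way to make the function-based attempt from the question return one value per year.

```r
library(xts)

# Synthetic daily series spanning two calendar years (made-up values)
dates  <- seq(as.Date("2010-12-29"), as.Date("2011-01-04"), by = "day")
prices <- xts(c(1, 2, 3, 10, 20, 30, 40), order.by = dates)
colnames(prices) <- "Close"

# Convenience wrapper
yearly_a <- apply.yearly(prices, sd)

# The more general form: endpoints() marks the last row of each year
yearly_b <- period.apply(prices, INDEX = endpoints(prices, on = "years"), FUN = sd)

# Manual alternative: subset by one year string at a time, e.g. prices["2010"]
years    <- unique(format(index(prices), "%Y"))
yearly_c <- sapply(years, function(yr) sd(as.numeric(prices[yr])))

print(yearly_a)
print(yearly_c)
```

Subsetting an xts object with a whole vector of year strings at once selects every row in those years, so a single sd() call on that subset covers the full series. Looping over the years one at a time avoids this.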
