I have a massive data set consisting of daily returns of 500 stocks over 34 years. I first ran ddply to create yearly median and return columns:
annual <- ddply(data, c("TICKER", "year"), summarize,
median_data = median(RETX),
return = prod(1 + RET))
The data currently looks like this:
TICKER year median_data return
1 A 2000 -0.0081645 0.6717770
2 A 2001 -0.0036845 0.5207290
3 A 2002 -0.0069040 0.6299523
4 A 2003 0.0036585 1.6280659
5 A 2004 0.0000120 0.8242153
6 A 2005 0.0004025 1.3813425
Now I would like to create a new column that contains the mean of median_data for each ticker for the past two years:
TICKER year median_data return avg_median
1 A 2000 -0.0081645 0.6717770 NA
2 A 2001 -0.0036845 0.5207290 -0.0036845
3 A 2002 -0.0069040 0.6299523 -0.0105885
4 A 2003 0.0036585 1.6280659 ...
5 A 2004 0.0000120 0.8242153
6 A 2005 0.0004025 1.3813425
Any help on this would be greatly appreciated!
dplyr solution: For completeness and correctness, here is the dplyr way, since the question has a dplyr tag. Unless I am missing something, dvdkamp's solution only works if you have one stock.
df <- expand.grid(
year = 1980:2014,
TICKER = paste0(expand.grid(letters,letters)[1:500,1],
expand.grid(letters,letters)[1:500,2])
)
df$median_data <- rnorm(nrow(df)) # one random draw per row, not 500 values recycled
df <- df[,c(2,1,3)]
The resulting data frame looks like this:
TICKER year median_data
1 aa 1980 0.5734215
2 aa 1981 1.2102109
3 aa 1982 0.8643419
4 aa 1983 0.7645975
5 aa 1984 0.4004396
6 aa 1985 -1.0195817
library(dplyr)
by_ticker <- df %>% group_by(TICKER)
Use lag() to generate the means. Mean of this year's and last year's values (last 2 years inclusive); note that lag() defaults to n = 1:
by_ticker %>%
mutate(mean_last2y_incl = ( median_data + lag(median_data) ) / 2 )
Mean of last year and the year before that (last 2 years exclusive):
by_ticker %>%
mutate(mean_last2y_excl = ( lag(median_data, n=1) + lag(median_data, n=2) ) / 2 )
see: http://cran.rstudio.com/web/packages/dplyr/vignettes/window-functions.html for more.
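As a quick sanity check, here is a minimal sketch on made-up data showing that the grouped lag() does not leak values from one ticker into the next. The toy data frame and its values are hypothetical, and the sketch assumes a dplyr version that supports arrange(.by_group = TRUE).

```r
library(dplyr)

# Hypothetical toy data: two tickers, three years each (values are made up)
toy <- data.frame(
  TICKER = rep(c("A", "B"), each = 3),
  year = rep(2000:2002, times = 2),
  median_data = c(1, 2, 3, 10, 20, 30)
)

res <- toy %>%
  group_by(TICKER) %>%
  arrange(year, .by_group = TRUE) %>%  # lag() depends on row order within each group
  mutate(
    # this year and last year
    mean_last2y_incl = (median_data + lag(median_data)) / 2,
    # last year and the year before that
    mean_last2y_excl = (lag(median_data, n = 1) + lag(median_data, n = 2)) / 2
  ) %>%
  ungroup()

print(res)
```

The first row of each ticker should come out as NA in mean_last2y_incl (and the first two rows as NA in mean_last2y_excl), because there is no earlier year within that group to average with.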
Try:
window_size <- 2 # number of years to average over
## call stats::filter explicitly, since dplyr masks filter() when it is loaded
data$avg_median <- stats::filter(data$median_data,
rep(1,window_size)/window_size, ## filter coefficients (1/2, 1/2)
sides = 1) ## average over this year and the preceding years only
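The filter() call above treats the whole column as a single series, so with several tickers the first year of one ticker would be averaged with the last year of the previous one. One possible way around this, sketched here in base R on hypothetical toy data, is to apply the same filter within each ticker via ave(). The helper name trailing_mean is an assumption made for illustration.

```r
# Hypothetical toy data: two tickers, three years each (values are made up)
toy <- data.frame(
  TICKER = rep(c("A", "B"), each = 3),
  year = rep(2000:2002, times = 2),
  median_data = c(1, 2, 3, 10, 20, 30)
)

window_size <- 2  # number of years to average over

# Trailing mean over the last `window_size` values; NA until the window is full.
# Note: stats::filter() stops with an error if a series is shorter than the
# filter, so every ticker needs at least `window_size` rows.
trailing_mean <- function(x) {
  as.numeric(stats::filter(x, rep(1 / window_size, window_size), sides = 1))
}

# Sort so that rows within each ticker are in chronological order
toy <- toy[order(toy$TICKER, toy$year), ]

# ave() applies the function separately to each ticker's values
toy$avg_median <- ave(toy$median_data, toy$TICKER, FUN = trailing_mean)

print(toy)
```

This keeps the window_size parameter from the answer above, so a three-year trailing mean only requires changing that one value.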