quantile & aggregate on zoo object in R

I have a zoo object that looks like this:

library(zoo)
library(lubridate)
TimeStamp=seq(dmy("01/01/2002"), dmy("17/12/2014"), by="day")
Dummy= rnorm(length(TimeStamp))
Temp=zoo(Dummy,TimeStamp)

I am trying to calculate the 5%, 33%, 67% and 95% quantiles for each day of the year to create a "norm". So for 01/01 I would like the four quantile values based on all the observations I have for the 1st of January in my dataset, and the same for every other day.

Right now I am using this:

aggregate(Temp ~ day(index(Temp)) + month(index(Temp)), FUN = 'quantile')

The problem is that, with this call, I am not sure which values the quantile function returns.

Any suggestions?

You need to learn how to read the help pages (and sometimes you need to learn which help page to look at, as @GGrothendeick just pointed out). I thought the first page below would be the relevant one, but I was wrong:

 ?aggregate.zoo     # 
 ?aggregate.formula # fortunately, they are the same w.r.t the dots-arg

The help page has a Usage section:

## S3 method for class 'zoo'
aggregate(x, by, FUN = sum, ..., regular = NULL, frequency = NULL)
## S3 method for class 'formula'
aggregate(formula, data, FUN, ...,
          subset, na.action = na.omit)

So the ... will pass any additional arguments on to the quantile function. (It wasn't clear from the page whether giving the function name as a character string was acceptable, but if you are not getting errors, then you've already tested that.) The quantile function:

  ?quantile

... has a Usage section:

## Default S3 method:
quantile(x, probs = seq(0, 1, 0.25), na.rm = FALSE,
         names = TRUE, type = 7, ...)

So you need to provide a 'probs' argument that matches your desired levels, since at the moment you are getting the default levels: min, 25th percentile, median, 75th percentile and max. So try:

aggregate(Temp ~ day(index(Temp)) + month(index(Temp)), 
          FUN = 'quantile', probs=c(5, 33, 67, 95)/100 )
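To see which values come back, here is a minimal sketch on toy data. The vector x and the data frame df are hypothetical and not from the question. With the default type = 7, the p-th quantile of 1:101 is 1 + 100*p, so the results are easy to check by eye. The names on the result tell you which level each value belongs to.

```r
# Hypothetical toy data: with type = 7, the p-th quantile of 1:101 is 1 + 100*p
x <- 1:101

# Default probs = seq(0, 1, 0.25): min, quartiles and max
q_default <- quantile(x)
names(q_default)

# The levels asked for in the question
q <- quantile(x, probs = c(5, 33, 67, 95) / 100)
q

# Extra arguments given to aggregate() are forwarded to FUN through '...'
df <- data.frame(g = rep(c("a", "b"), each = 101), v = c(1:101, 101:201))
res <- aggregate(v ~ g, data = df, FUN = quantile, probs = c(0.05, 0.95))
res
```

When FUN returns more than one value, the aggregated column (res$v here) is a matrix with one column per quantile level.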

In retrospect, it seems a bit of a programming miracle that the formula you offered succeeds. Following the examples on the 'zoo' help page for aggregate, I think we would use this instead:

 str( aggregate(Temp,  time(Dummy), 
            FUN = 'quantile', probs=c(5, 33, 67, 95)/100 ) )

‘zoo’ series from 1 to 4734
  Data: num [1:4734, 1:4] 0.235 -1.435 -0.922 -0.542 -1.151 ...
 - attr(*, "dimnames")=List of 2
  ..$ : NULL
  ..$ : chr [1:4] "5%" "33%" "67%" "95%"
  Index:  num [1:4734] 1 2 3 4 5 6 7 8 9 10 ...
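Note that the str() output has 4734 rows, which is the number of daily observations, so with that 'by' every observation ends up in its own group. To get one row per calendar day, as the question asks, the 'by' vector needs to repeat across years. A minimal sketch, assuming a numeric month*100 + day key (the names md and norm and the seed are my own choices, not from the question):

```r
library(zoo)

set.seed(1)  # hypothetical seed, only so the sketch is reproducible
TimeStamp <- seq(as.Date("2002-01-01"), as.Date("2014-12-17"), by = "day")
Temp <- zoo(rnorm(length(TimeStamp)), TimeStamp)

# Group key shared across years: 1 Jan -> 101, 29 Feb -> 229, 31 Dec -> 1231
md <- as.numeric(format(index(Temp), "%m%d"))

# One row per calendar day, one column per requested quantile
norm <- aggregate(Temp, md, FUN = quantile, probs = c(5, 33, 67, 95) / 100)

dim(norm)
head(norm)
```

Keep in mind that each day has at most 13 observations in this series (and 29 Feb only 3), so the 5% and 95% values are interpolated from very few points.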
