
R group by year

I read a CSV file into R and now have a data frame called data.

head(data)

       Date   Open   High    Low  Close  Volume
1 31-Dec-14 223.09 225.68 222.25 222.41 2402097
2 30-Dec-14 223.99 225.65 221.40 222.23 2903242
3 29-Dec-14 226.90 227.91 224.02 225.71 2811828
4 26-Dec-14 221.51 228.50 221.50 227.82 3327016
5 24-Dec-14 219.77 222.50 219.25 222.26 1333518
6 23-Dec-14 223.81 224.32 219.52 220.97 4513321

tail(data)
        Date  Open  High   Low Close  Volume
499 9-Jan-13 34.01 34.19 33.40 33.64  697979
500 8-Jan-13 34.50 34.50 33.11 33.68 1283985
501 7-Jan-13 34.80 34.80 33.90 34.34  441909
502 4-Jan-13 34.80 34.80 33.92 34.40  673993
503 3-Jan-13 35.18 35.45 34.75 34.77  741941
504 2-Jan-13 35.00 35.45 34.70 35.36 1194710

This is the price of a stock for each day over a two-year period, from January 1st 2013 to December 31st 2014. For now I just want to be able to group by year, for any function or formula.

So, let's say I want: median(data$Close)

returns: 177.515

Is there a way to tell R to return this number for each of the two years separately, rather than for all of the data at once?

For example, combining R with a familiar SQL statement:

median(data$Close)
GROUP BY YEAR(Date);

I'm hoping to get something returned like:

2013 167.5
2014 175

You could try the following, using the lubridate package:

library(lubridate)
years <- year(as.Date(data$Date, "%d-%b-%y"))
tapply(data$Close, years, median)

Or, using only built-in R functions:

dates <- as.Date(data$Date, "%d-%b-%y")
years <- format(dates, "%Y")
tapply(data$Close, years, median)
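If a data frame with one row per year is preferred (closer to what the SQL GROUP BY in the question would return), base R's aggregate() is one option. The following is a minimal sketch, not part of the original answer: it rebuilds a small sample from six of the rows shown in the question, and the Year column is a helper name introduced here for illustration. Note that %b matches abbreviated month names in the current locale, so this assumes an English locale.

```r
# Sample rows copied from the head() and tail() output in the question
data <- data.frame(
  Date  = c("31-Dec-14", "30-Dec-14", "29-Dec-14",
            "4-Jan-13", "3-Jan-13", "2-Jan-13"),
  Close = c(222.41, 222.23, 225.71, 34.40, 34.77, 35.36),
  stringsAsFactors = FALSE
)

# Year is a helper column added for this example.
# %b depends on the locale; an English locale is assumed here.
data$Year <- format(as.Date(data$Date, "%d-%b-%y"), "%Y")

# One row per year, similar to: SELECT Year, median(Close) ... GROUP BY Year
result <- aggregate(Close ~ Year, data = data, FUN = median)
print(result)
```

For these six sample rows the medians are 34.77 for 2013 and 222.41 for 2014. Replacing median with mean, max, or any other summary function works the same way.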
