简体   繁体   中英

Using R cut function on dates

I have a dataframe giving attendances at sports events

Crowd    matchDate
2345      1993-01-26
4567      1993-08-01
8888      1994-03-02
1298      1994-11-07
9876      1995-09-01 etc

1237      2011-09-09

The matchdate is a POSIXct class

I want to be able to create a season factor based on the date such that each season runs from, say, 1st August to 31 July eg factor 1992/3 would include dates 1992-08-01 to 1993-07-31

ideally it would be a function that I could apply for several analyses, not necessarily with same start and end dates in the year

An example of my comment.

x <- as.Date(1:1000, origin = "2000-01-01")
x <- cut(x, breaks = "quarter") 

And then relabel as you please, if necessary.

labs <- paste(substr(levels(x),1,4), "/", 1:4, sep="")
x <- factor(x, labels = labs)

?cut.POSIXct

breaks
a vector of cut points or number giving the number of intervals which x is to be cut into or an interval specification, one of "sec", "min", "hour", "day", "DSTday", "week", "month", "quarter" or "year", optionally preceded by an integer and a space, or followed by "s". (For "Date" objects only interval specifications using "day", "week", "month", "quarter" and "year" are allowed.)

If your question is more related to how you automatically generate the breaks and labels, maybe this will help

DF <- data.frame(matchDate = as.POSIXct(as.Date(sample(5000,100,replace=TRUE), origin="1993-01-01")))

years <- 1992:2011
DF$season <- cut(DF$matchDate, 
  breaks=as.POSIXct(paste(years,"-08-01",sep="")),
  labels=paste(years[-length(years)],years[-length(years)]+1,sep="/"))

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM