I have data like below:
Caller Date Duration Status
304 2/1/2016 756 ANSWERED
304 2/1/2016 61 ANSWERED
304 2/4/2016 60 ANSWERED
304 2/10/2016 61 ANSWERED
304 2/17/2016 60 ANSWERED
304 2/19/2016 30 ANSWERED
304 2/24/2016 27 ANSWERED
304 2/28/2016 55 ANSWERED
304 2/28/2016 63 ANSWERED
I want to group the data in R by week, i.e. if the date lies between 2/1/2016 and 2/7/2016, I add a new column called "Week" and set its value to Week 1 for those rows, and similarly for all other weeks in the month.
The output would look like this:
Caller Date Duration Status Week
304 2/1/2016 756 ANSWERED Week 1
304 2/1/2016 61 ANSWERED Week 1
304 2/4/2016 60 ANSWERED Week 1
304 2/10/2016 61 ANSWERED Week 2
304 2/17/2016 60 ANSWERED Week 3
304 2/19/2016 30 ANSWERED Week 3
304 2/24/2016 27 ANSWERED Week 4
304 2/28/2016 55 ANSWERED Week 4
304 2/28/2016 63 ANSWERED Week 4
Please suggest a method in R. Thanks.
One way to do this would be to use lubridate and dplyr. Suppose your data is in a data frame called dat:
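library(lubridate)
library(dplyr)
# Parse the month/day/year strings into Date objects
dat$Date <- mdy(dat$Date)
# Use the first date as the start of week 1
t0 <- dat[1, 2]
# Whole weeks elapsed since t0, numbered from 1
dat %>% mutate(Week = paste('Week', as.integer((Date - t0) / 7) + 1))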
Result:
Caller Date Duration Status Week
1 304 2016-02-01 756 ANSWERED Week 1
2 304 2016-02-01 61 ANSWERED Week 1
3 304 2016-02-04 60 ANSWERED Week 1
4 304 2016-02-10 61 ANSWERED Week 2
5 304 2016-02-17 60 ANSWERED Week 3
6 304 2016-02-19 30 ANSWERED Week 3
7 304 2016-02-24 27 ANSWERED Week 4
8 304 2016-02-28 55 ANSWERED Week 4
9 304 2016-02-28 63 ANSWERED Week 4
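For anyone who wants to try this without installing packages, here is a minimal base R sketch of the same idea. It rebuilds the sample data from the question and, as an assumption, treats the earliest date (rather than the first row) as the start of week 1.

```r
# Sample data copied from the question (dates as month/day/year strings)
dat <- data.frame(
  Caller   = 304,
  Date     = c("2/1/2016", "2/1/2016", "2/4/2016", "2/10/2016", "2/17/2016",
               "2/19/2016", "2/24/2016", "2/28/2016", "2/28/2016"),
  Duration = c(756, 61, 60, 61, 60, 30, 27, 55, 63),
  Status   = "ANSWERED",
  stringsAsFactors = FALSE
)

# Parse the strings into Date objects
dat$Date <- as.Date(dat$Date, format = "%m/%d/%Y")

# Whole days elapsed since the earliest date, grouped into 7-day blocks
days_elapsed <- as.integer(dat$Date - min(dat$Date))
dat$Week <- paste("Week", days_elapsed %/% 7 + 1)
```

Using min(dat$Date) means the rows do not have to be sorted by date first.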
You can pull the week of the year directly with:
format(as.Date("2016-07-01"), format = "Week %U")
See the help for strptime for more details on the formatting. Note that this only gives the week of the year, so 2017-01-01 gets a lower week number than almost any date in 2016. You could write a wrapper similar to @ManishGoel's answer that would set your starting point as week 1.
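A rough sketch of what such a wrapper might look like is below. The function name week_from_start and its start argument are made up for illustration, and it assumes all dates fall in the same calendar year, since %U restarts every January.

```r
# Hypothetical helper: %U week number relative to the week of a start date.
# Assumes all dates are in the same year as `start`.
week_from_start <- function(dates, start = min(dates)) {
  wk  <- as.integer(format(dates, "%U"))  # week of year, weeks begin on Sunday
  wk0 <- as.integer(format(start, "%U"))  # week of year of the starting point
  paste("Week", wk - wk0 + 1)
}

dates <- as.Date(c("2016-02-01", "2016-02-10", "2016-02-28"))
week_from_start(dates)
```

Because %U counts calendar weeks that begin on Sunday, the labels can differ from fixed 7-day blocks counted from the first date. For example, 2016-02-28 is a Sunday, so it starts a new calendar week.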
A more generic solution is to use cut:
mycuts <- seq(as.Date("2016-01-01"), as.Date("2017-12-30"), 7 )
cut(as.Date("2016-07-01"), mycuts, labels = 1:(length(mycuts)-1))
That may be easier to scale for your needs, and applies more broadly to other classes of problems. If you really need the "Week" in there, you can do that directly too:
cut(as.Date("2016-07-01"), mycuts, labels = paste("Week", 1:(length(mycuts)-1)))
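To tie this back to the question, here is a small sketch that builds the break points from the data itself instead of hard-coding them. The variable names are illustrative, and the choice to start the breaks at the earliest date is an assumption about what "Week 1" should mean.

```r
# A few of the dates from the question, parsed from month/day/year strings
dates <- as.Date(c("2/1/2016", "2/4/2016", "2/10/2016", "2/17/2016",
                   "2/24/2016", "2/28/2016"), format = "%m/%d/%Y")

# Weekly break points starting at the earliest date; going 7 days past the
# latest date guarantees that every value falls inside some interval
breaks <- seq(min(dates), max(dates) + 7, by = 7)

# Intervals are closed on the left, so a date equal to a break starts a new week
week <- cut(dates, breaks, labels = paste("Week", seq_len(length(breaks) - 1)))
```

The result is a factor, which is convenient if you go on to aggregate by week with tapply or aggregate.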
You can extract the day using strsplit and then calculate the week from the date.
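Week <- sapply(df$Date, FUN = function(x) {
  # Dates are month/day/year strings, so the day is the second field
  day <- as.numeric(strsplit(as.character(x), "/")[[1]][2])
  # Days 1-7 map to week 1, days 8-14 to week 2, and so on
  (day - 1) %/% 7 + 1
})
df$Week <- Week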
However, you need to give more information about how the dates are distributed, because the calculation of the week number depends on that.
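If the dates can be parsed into Date objects, one possible way to get a week-within-month number without splitting strings is sketched below. The helper name week_of_month is hypothetical, and it assumes "week" means a fixed 7-day block counted from the 1st of each month, which may not be what you need if the data spans several months.

```r
# Hypothetical helper: week within the calendar month
# (days 1-7 -> 1, days 8-14 -> 2, ..., days 29-31 -> 5)
week_of_month <- function(dates) {
  day <- as.integer(format(dates, "%d"))  # day of month as an integer
  (day - 1) %/% 7 + 1
}

dates <- as.Date(c("2/1/2016", "2/7/2016", "2/8/2016", "2/28/2016", "2/29/2016"),
                 format = "%m/%d/%Y")
week_of_month(dates)
```

With this definition the numbering restarts every month, so it only agrees with counting from the first date in the data when that first date is the 1st of the month.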