Group data into new column value based on condition

I have data like below:

Caller  Date       Duration  Status
304     2/1/2016   756       ANSWERED
304     2/1/2016   61        ANSWERED
304     2/4/2016   60        ANSWERED
304     2/10/2016  61        ANSWERED
304     2/17/2016  60        ANSWERED
304     2/19/2016  30        ANSWERED
304     2/24/2016  27        ANSWERED
304     2/28/2016  55        ANSWERED
304     2/28/2016  63        ANSWERED

I want to group the data in R by week: if the date lies between 2/1/2016 and 2/7/2016, I add a new column called "Week" and set its value to Week 1 for those rows, and similarly for all other weeks in the month.

The output would look like this:

Caller  Date       Duration  Status    Week
304     2/1/2016   756       ANSWERED  Week 1
304     2/1/2016   61        ANSWERED  Week 1
304     2/4/2016   60        ANSWERED  Week 1
304     2/10/2016  61        ANSWERED  Week 2
304     2/17/2016  60        ANSWERED  Week 3
304     2/19/2016  30        ANSWERED  Week 3
304     2/24/2016  27        ANSWERED  Week 4
304     2/28/2016  55        ANSWERED  Week 4
304     2/28/2016  63        ANSWERED  Week 4

Please suggest a method in R. Thanks.

One way to do this would be to use lubridate and dplyr.

Suppose your data is in a data frame called dat:

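library(lubridate)
library(dplyr)

# Parse the month/day/year strings into Date objects
dat$Date <- mdy(dat$Date)
# Reference date: the first row's date marks the start of Week 1
t0 <- dat$Date[1]
# Whole 7-day blocks elapsed since t0, counted from 1
dat %>% mutate(Week = paste('Week', as.integer(Date - t0) %/% 7 + 1))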

Result:

  Caller       Date Duration   Status   Week
1    304 2016-02-01      756 ANSWERED Week 1
2    304 2016-02-01       61 ANSWERED Week 1
3    304 2016-02-04       60 ANSWERED Week 1
4    304 2016-02-10       61 ANSWERED Week 2
5    304 2016-02-17       60 ANSWERED Week 3
6    304 2016-02-19       30 ANSWERED Week 3
7    304 2016-02-24       27 ANSWERED Week 4
8    304 2016-02-28       55 ANSWERED Week 4
9    304 2016-02-28       63 ANSWERED Week 4
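
For readers without lubridate or dplyr installed, the same idea can be sketched in base R. This is only an illustration, not the answer's own code: it rebuilds the question's sample data by hand and assumes that Week 1 starts at the earliest date in the column (min) rather than at the first row.

```r
# Rebuild the sample data from the question (Date kept as text, as in the input)
dat <- data.frame(
  Caller   = 304,
  Date     = c("2/1/2016", "2/1/2016", "2/4/2016", "2/10/2016", "2/17/2016",
               "2/19/2016", "2/24/2016", "2/28/2016", "2/28/2016"),
  Duration = c(756, 61, 60, 61, 60, 30, 27, 55, 63),
  Status   = "ANSWERED",
  stringsAsFactors = FALSE
)

# Parse month/day/year text into Date objects
dat$Date <- as.Date(dat$Date, format = "%m/%d/%Y")

# Whole 7-day blocks elapsed since the earliest date, counted from 1
days_since_start <- as.integer(dat$Date - min(dat$Date))
dat$Week <- paste("Week", days_since_start %/% 7 + 1)

dat
```

Using min(dat$Date) instead of the first row means the rows do not have to be sorted by date beforehand.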

You can pull the week of the year directly with:

format(as.Date("2016-07-01"), format = "Week %U")

See the help for strptime for more details on the formatting. Note, for example, that it only gives the week of the year, so the numbering restarts each January: 2017-01-01 gets a lower week number than almost every date in 2016. You could write a wrapper similar to @ManishGoel's answer that would set your starting point as week 1.
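
One possible shape for such a wrapper is sketched below. The function name week_since and its start argument are made up for this illustration; it simply counts 7-day blocks from a chosen starting date, so the numbering keeps increasing across a year boundary instead of resetting.

```r
# Hypothetical helper: week number counted in 7-day blocks from `start`
week_since <- function(dates, start = min(dates)) {
  dates <- as.Date(dates)
  start <- as.Date(start)
  as.integer(dates - start) %/% 7 + 1
}

x <- as.Date(c("2016-12-25", "2017-01-01", "2017-01-08"))

format(x, "%U")                      # week of year: restarts in January
week_since(x, start = "2016-12-25")  # keeps counting from the chosen start
```

If start is omitted, the earliest date in the input is treated as the first day of week 1.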

A more generic solution is to use cut:

mycuts <- seq(as.Date("2016-01-01"), as.Date("2017-12-30"), 7 )
cut(as.Date("2016-07-01"), mycuts, labels = 1:(length(mycuts)-1))

That may be easier to scale for your needs, and applies more broadly to other classes of problems. If you really need the "Week" in there, you can do that directly too:

cut(as.Date("2016-07-01"), mycuts, labels = paste("Week", 1:(length(mycuts)-1)))
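
Applied to the dates from the question, the cut approach might look like the sketch below. The choice to start the breaks at the earliest date and to extend them one week past the latest date is an assumption made for this example, not something the answer specifies.

```r
dates <- as.Date(c("2/1/2016", "2/4/2016", "2/10/2016", "2/17/2016",
                   "2/19/2016", "2/24/2016", "2/28/2016"),
                 format = "%m/%d/%Y")

# Breaks every 7 days from the first date; the last break must lie past max(dates)
breaks <- seq(min(dates), max(dates) + 7, by = 7)

# Each interval is closed on the left, so a date on a break starts a new week
week <- cut(dates, breaks,
            labels = paste("Week", seq_len(length(breaks) - 1)))

as.character(week)
```

The result is a factor, which is convenient for grouping; as.character converts it back to plain labels if needed.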

You can extract the day using strsplit and then calculate the week from the day of the month.

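# Assumes df$Date still holds month/day/year text such as "2/10/2016"
Week <- sapply(df$Date, FUN = function(x) {
  # The second "/"-separated field is the day of the month
  day <- as.numeric(strsplit(as.character(x), "/")[[1]][2])
  # Days 1-7 -> week 1, days 8-14 -> week 2, and so on
  (day - 1) %/% 7 + 1
})
df$Week <- Week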

However, you need to give more information about how the dates are distributed, because the calculation of the week number depends on that.
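
To illustrate why the distribution matters, here is a small sketch of a week-of-month calculation on dates that span more than one month. The helper name week_of_month is hypothetical, and it assumes the dates are already Date objects (or ISO-formatted text) rather than the m/d/yyyy strings used above.

```r
# Hypothetical helper: week within the month (days 1-7 -> 1, 8-14 -> 2, ...)
week_of_month <- function(dates) {
  day <- as.integer(format(as.Date(dates), "%d"))
  (day - 1) %/% 7 + 1
}

d <- as.Date(c("2016-02-01", "2016-02-07", "2016-02-08",
               "2016-02-28", "2016-03-01", "2016-03-31"))

week_of_month(d)
```

With this definition the count restarts at 1 in every month, and days 29 to 31 land in a fifth week, so data covering several months would also need the month (or a fixed start date) to tell the groups apart.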
