简体   繁体   中英

Create new column in R based upon values in two columns

I have the following df, which summarizes values based on date:

df <- data_frame(
ID = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20),
Date = c("28/01/2019", "28/01/2019", "29/01/2019", "29/01/2019", 
         "30/01/2019", "30/01/2019", "31/01/2019", "31/01/2019", 
         "01/02/2019", "01/02/2019", "04/02/2019", "04/02/2019", 
         "05/02/2019", "05/02/2019", "06/02/2019", "06/02/2019", 
         "07/02/2019", "07/02/2019", "08/02/2019", "08/02/2019"),
WeekYear = c("2019-05", "2019-05", "2019-05", "2019-05", "2019-05", "2019- 
              05", "2019-05", "2019-05", "2019-05", "2019-05", "2019-06", 
             "2019-06", "2019-06", "2019-06", "2019-06", "2019-06", "2019- 
              06", "2019-06", "2019-06", "2019-06"),
DayFilter = c("Public Holiday", "Public Holiday", "Work-Day", "Work-Day", 
              "Work-Day", "Work-Day", "Work-Day", "Work-Day", "Work-Day", 
              "Work-Day", "Work-Day", "Work-Day", "Work-Day", "Work-Day", 
              "Work-Day", "Work-Day", "Work-Day", "Work-Day", "Work-Day", 
              "Work-Day"),
Value = c(0, 0, 7, 3, 8, 4, 1, 0, 9, 6, 4, 5, 7, 2, 4, 9, 5, 7, 4, 1))

df

      ID Date       WeekYear  DayFilter      Value
   <dbl> <chr>      <chr>    <chr>          <dbl>
 1     1 28/01/2019 2019-05  Public Holiday     0
 2     2 28/01/2019 2019-05  Public Holiday     0
 3     3 29/01/2019 2019-05  Work-Day           7
 4     4 29/01/2019 2019-05  Work-Day           3
 5     5 30/01/2019 2019-05  Work-Day           8
 6     6 30/01/2019 2019-05  Work-Day           4
 7     7 31/01/2019 2019-05  Work-Day           1
 8     8 31/01/2019 2019-05  Work-Day           0
 9     9 01/02/2019 2019-05  Work-Day           9
10    10 01/02/2019 2019-05  Work-Day           6
11    11 04/02/2019 2019-06  Work-Day           4
12    12 04/02/2019 2019-06  Work-Day           5
13    13 05/02/2019 2019-06  Work-Day           7
14    14 05/02/2019 2019-06  Work-Day           2
15    15 06/02/2019 2019-06  Work-Day           4
16    16 06/02/2019 2019-06  Work-Day           9
17    17 07/02/2019 2019-06  Work-Day           5
18    18 07/02/2019 2019-06  Work-Day           7
19    19 08/02/2019 2019-06  Work-Day           4
20    20 08/02/2019 2019-06  Work-Day           1

My goal is to create a new column that categorizes any week of the year that has a public holiday in it as incomplete , based upon the weekyear column. This way I can limit analyses to weeks where there were five business days

Desire output below

        ID    Date    WeekYear DayFilter      WeekFilter    Value
       <dbl> <chr>    <chr>    <chr>          <chr>         <dbl>
   1     1 28/01/2019 2019-05  Public Holiday Incomplete     0
   2     2 28/01/2019 2019-05  Public Holiday Incomplete     0
   3     3 29/01/2019 2019-05  Work-Day       Incomplete     7
   4     4 29/01/2019 2019-05  Work-Day       Incomplete     3
   5     5 30/01/2019 2019-05  Work-Day       Incomplete     8
   6     6 30/01/2019 2019-05  Work-Day       Incomplete     4
   7     7 31/01/2019 2019-05  Work-Day       Incomplete     1
   8     8 31/01/2019 2019-05  Work-Day       Incomplete     0
   9     9 01/02/2019 2019-05  Work-Day       Incomplete     9
  10    10 01/02/2019 2019-05  Work-Day       Incomplete     6
  11    11 04/02/2019 2019-06  Work-Day       Complete       4
  12    12 04/02/2019 2019-06  Work-Day       Complete       5
  13    13 05/02/2019 2019-06  Work-Day       Complete       7
  14    14 05/02/2019 2019-06  Work-Day       Complete       2
  15    15 06/02/2019 2019-06  Work-Day       Complete       4
  16    16 06/02/2019 2019-06  Work-Day       Complete       9
  17    17 07/02/2019 2019-06  Work-Day       Complete       5
  18    18 07/02/2019 2019-06  Work-Day       Complete       7
  19    19 08/02/2019 2019-06  Work-Day       Complete       4
  20    20 08/02/2019 2019-06  Work-Day       Complete       1

My attempts so far have successfully created a WeekFilter column with Incomplete as a value. However, it is limited to the specific date of the public holiday. The part I am struggling to figure out is how to generalize it to the rest of the week, which I figure needs to incorporate the WeekYear column into this statement

df$WeekFilter = df$WeekYear
df$WeekFilter[df$DayFilter == "Public Holiday"] <- "Incomplete"

Any assistance would be greatly appreciated

We may use group_by and mutate :

df %>% group_by(WeekYear) %>% 
  mutate(WeekFilter = if("Public Holiday" %in% DayFilter) "Incomplete" else "Complete")

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM