
R dplyr conditional sum with mutate

I currently have a dataset in the following format:

id, date,       category, city
1,  2016-01-01, A,        CityA
2,  2016-01-01, B,        CityA

etc.

I'm trying to use mutate to get a conditional running count over the last 30 days (or some other time frame x).

To start, I tried the following to see whether it works, planning to extend it from there:

  mutate(df, last_thirty_day_count = sum(df$id < id & df$city == city))

But it just gives me zeroes.

Any help is appreciated.

First, here is a slightly longer example dataset:

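library(dplyr)  # provides tibble(), mutate(), group_by() and %>%

set.seed(8675309)  # fix the seed so the sampled columns are reproducible
sampleData <-
  tibble(id = 1:20  # tibble() replaces the deprecated data_frame()
         , date = seq(as.Date("2017-01-01")
                      , as.Date("2017-01-20")
                      , by = "day")
         , category = sample(LETTERS[1:3], 20, TRUE)
         , city = sample(letters[1:3], 20, TRUE)
         )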

Then, decide what counts as a qualifying observation. It is unclear from your question what cutoff(s) you want to use. Here I am using January 4th as the cutoff, but you could use whatever is appropriate for your case. Then group_by the variable you want to count within, and take the cumulative sum of the qualifying flag with cumsum. This assumes the rows are already in order; if they are not, make sure to arrange them first.

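sampleData %>%
  # flag rows dated after the cutoff (explicit Date rather than a string)
  mutate(QualifiyingObs = date > as.Date("2017-01-04")) %>%
  group_by(city) %>%
  # running count of flagged rows within each city
  mutate(CountOfQual = cumsum(QualifiyingObs))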

Gives

      id       date category  city QualifiyingObs CountOfQual
   <int>     <date>    <chr> <chr>          <lgl>       <int>
1      1 2017-01-01        A     a          FALSE           0
2      2 2017-01-02        B     c          FALSE           0
3      3 2017-01-03        C     c          FALSE           0
4      4 2017-01-04        C     a          FALSE           0
5      5 2017-01-05        A     b           TRUE           1
6      6 2017-01-06        C     c           TRUE           1
7      7 2017-01-07        C     a           TRUE           1
8      8 2017-01-08        C     a           TRUE           2
9      9 2017-01-09        C     a           TRUE           3
10    10 2017-01-10        B     c           TRUE           2
11    11 2017-01-11        C     c           TRUE           3
12    12 2017-01-12        B     c           TRUE           4
13    13 2017-01-13        B     a           TRUE           4
14    14 2017-01-14        A     b           TRUE           2
15    15 2017-01-15        C     a           TRUE           5
16    16 2017-01-16        C     b           TRUE           3
17    17 2017-01-17        C     b           TRUE           4
18    18 2017-01-18        A     b           TRUE           5
19    19 2017-01-19        C     a           TRUE           6
20    20 2017-01-20        C     c           TRUE           5
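
If what you actually want is a trailing window rather than a fixed cutoff, one possible approach is sketched below. Note that in the original attempt, df$id and id refer to the same column inside mutate, so df$id < id compares each value with itself and is always FALSE, which is why the sum is zero. The sketch instead compares each row's date against the other dates in the same city. The names events, window_days and last_n_day_count are made up for illustration, and the toy data is hypothetical; treat this as a minimal sketch, not the only way to do it.

```r
library(dplyr)

# Hypothetical toy data: four events in city "a", one in city "b"
events <- tibble(
  id   = 1:5,
  date = as.Date(c("2017-01-01", "2017-01-10", "2017-01-20",
                   "2017-02-15", "2017-01-05")),
  city = c("a", "a", "a", "a", "b")
)

window_days <- 30  # assumed window length; change to whatever x you need

rolling <- events %>%
  group_by(city) %>%
  mutate(
    # For row i, count rows in the same city dated within the
    # window_days before date[i] (strictly earlier than date[i]).
    last_n_day_count = vapply(
      seq_along(date),
      function(i) sum(date >= date[i] - window_days & date < date[i]),
      integer(1)
    )
  ) %>%
  ungroup()

print(rolling)
```

Because each row is compared against every other row in its group, this does not depend on the rows being sorted, but the work grows quadratically with group size, so it is best suited to small or moderately sized groups.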
