
Counting observations per unique year in group in data.table

I want to number the distinct years within each club in a data.table, counting from that club's first observed year. Example below:

data <- structure(list(club = c("Ajax", "Ajax", "Ajax", "Ajax", "Ajax", 
"Ajax", "Barcelona", "Barcelona", "Barcelona"), years = c(1994, 
1994, 1994, 1995, 1995, 1995, 2014, 2014, 2015)), row.names = c(NA, 
-9L), class = c("data.table", "data.frame"))
        club years
1:      Ajax  1994
2:      Ajax  1994
3:      Ajax  1994
4:      Ajax  1995
5:      Ajax  1995
6:      Ajax  1995
7: Barcelona  2014
8: Barcelona  2014
9: Barcelona  2015

I want this:

        club years count
1:      Ajax  1994     1
2:      Ajax  1994     1
3:      Ajax  1994     1
4:      Ajax  1995     2
5:      Ajax  1995     2
6:      Ajax  1995     2
7: Barcelona  2014     1
8: Barcelona  2014     1
9: Barcelona  2015     2

Use match() against the unique 'years' within each 'club' and assign (:=) the result to a new column:

data[, count := match(years, unique(years)), by = club]
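To see why this works, here is a minimal sketch on a plain vector holding the years of one club (the values are made up for illustration). unique() keeps values in order of first appearance, and match() returns each element's position in that vector:

# Hypothetical years for a single club, in the order they appear
years <- c(1994, 1994, 1995, 1995, 1997)

# unique() preserves first-appearance order
first_seen <- unique(years)          # 1994 1995 1997

# match() gives, for each element, its index in first_seen
count <- match(years, first_seen)    # 1 1 2 2 3

Note that the result numbers distinct years in order of appearance, so a gap such as 1995 to 1997 still advances the count by one.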

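Alternatively, use rleid(), which numbers runs of consecutive equal values. It gives the same result here because equal years are adjacent within each club: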

data[, count := rleid(years), club]
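The two approaches differ when a year reappears after a different year within the same club. A small sketch with made-up data (the table dt and the column names by_match and by_rleid are only for illustration):

library(data.table)

# Hypothetical data: 1994 shows up again after 1995
dt <- data.table(club = "Ajax", years = c(1994, 1995, 1994))

# match() reuses the index of the first appearance
dt[, by_match := match(years, unique(years)), by = club]

# rleid() starts a new id every time the value changes
dt[, by_rleid := rleid(years), by = club]

dt$by_match  # 1 2 1
dt$by_rleid  # 1 2 3

If the data may be unordered, sorting by club and years first (for example with setorder()) should make both approaches agree.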

Base R solution:

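# df is the example data as a plain data.frame
df <- as.data.frame(data)

# tapply() returns one vector per club, ordered by the sorted club names,
# so the result lines up with the rows only if df is already ordered by club.
df$count <- with(
  df,
  unlist(
    tapply(
      years,
      club,
      # Number the years in order of first appearance within the club
      FUN = function(x) as.integer(factor(x, levels = unique(x)))
    ),
    use.names = FALSE
  )
)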
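The tapply() approach above depends on the rows being grouped by club in sorted order. As a hedged alternative, not part of the original answer, base R's ave() applies a function per group and returns the values in the original row order. The data frame below is made up to show interleaved clubs:

# Hypothetical data with clubs interleaved rather than grouped
df2 <- data.frame(
  club  = c("Barcelona", "Ajax", "Barcelona", "Ajax"),
  years = c(2014, 1994, 2015, 1994)
)

# ave() splits years by club, applies FUN, and puts results back in row order
df2$count <- ave(df2$years, df2$club, FUN = function(x) match(x, unique(x)))

df2$count  # 1 1 2 1

Because ave() writes the results back into a copy of the numeric years vector, the new column is stored as double rather than integer; wrap it in as.integer() if the type matters.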

