简体   繁体   中英

Number of specific column values in consecutive rows of dataframe in R

Hello I have a quite large dataframe (~20k) with something like

id | GroupID | action
1 | 1 | run
2 | 1 | walk
3 | 1 | rest
4 | 1 | run
5 | 1 | walk
6 | 2 | run
7 | 2 | walk

I want to count the number off consecutive occurrences of action values ( for example (run,walk) which would be 3 in the above example. Also, would be good to be grouped by the groupID ( so 2 for group 1 and 1 for group 2 )

Is there any more nice way to do it except iterating row by row ?

Here's a possible solution using stringi package

Select your vars

vars <- toString(c("run", "walk"))

Then

library(stringi)
aggregate(action ~ GroupID, df, function(x) stri_count_fixed(toString(x), vars))
#   GroupID action
# 1       1      2
# 2       2      1

Or similarly, combined with the data.table package

library(data.table)
setDT(df)[, stri_count_fixed(toString(action), vars), by = GroupID]
#    GroupID V1
# 1:       1  2
# 2:       2  1

Basically we are converting action to a one long string separated by , and counting the occurrences of "run, walk" within each group.

You could use the diff function in base . This is assuming you only care about the run-walk combination, which is why rest was made an NA value. If errors arise in your actual data.frame, leave a comment so I may either adjust my answer or remove it.

> df<-data.frame(id=seq(1,7,1),
               group=c(1,1,1,1,1,2,2),
               action=c('run','walk','rest','run','walk','run','walk'))
> df$value<-ifelse(df$action=='rest',NA,
                  ifelse(df$action=='run',1,2)

> df
  id group action value
1  1     1    run     1
2  2     1   walk     2
3  3     1   rest    NA
4  4     1    run     1
5  5     1   walk     2
6  6     2    run     1
7  7     2   walk     2

> setNames(aggregate(df$value,by=list(df$group),function(x) length(which(diff(x)==1))),c('group','run_walk'))
  group run_walk
1     1        2
2     2        1

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM