Suppose I have an R dataframe that looks like this, where end.group
signifies the end of a unique group of observations:
x <- data.frame(end.group=c(0,0,1,0,0,1,1,0,0,0,1,1,1,0,1))
I want to return the following, where group.count is a running count of the number of observations in a group, and group is a unique identifier for each group, in numerical order. Can anyone help me with a piece of R code to do this?
end.group group.count group
0 1 1
0 2 1
1 3 1
0 1 2
0 2 2
1 3 2
1 1 3
0 1 4
0 2 4
0 3 4
1 4 4
1 1 5
1 1 6
0 1 7
1 2 7
You can create group by using cumsum and rev. You need rev because end.group marks the end point of each group rather than its start.
x <- data.frame(end.group=c(0,0,1,0,0,1,1,0,0,0,1,1,1,0,1))
# create groups
x$group <- rev(cumsum(rev(x$end.group)))
# re-number groups from smallest to largest
x$group <- abs(x$group-max(x$group)-1)
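To see why the reversal works, it may help to look at the intermediate values. The following is only an illustrative sketch on the same data; the names from.end and group are hypothetical helpers, and max(from.end) - from.end + 1 is just another way of writing the abs(...) expression used above.

```r
end.group <- c(0,0,1,0,0,1,1,0,0,0,1,1,1,0,1)

# Count the end markers at or after each row. All rows of one group
# share a value, but the numbering runs backwards (last group is 1).
from.end <- rev(cumsum(rev(end.group)))
print(from.end)
# 7 7 7 6 6 6 5 4 4 4 4 3 2 1 1

# Flip the numbering so that the first group becomes 1.
group <- max(from.end) - from.end + 1
print(group)
# 1 1 1 2 2 2 3 4 4 4 4 5 6 7 7
```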
Now you can use ave to create group.count.
x$group.count <- ave(x$end.group, x$group, FUN=seq_along)
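As a possible alternative to the rev approach (this variant is not part of the answer above, just a sketch of the same idea run forwards): a new group starts on the first row and on every row that follows an end marker, so a cumulative sum over those start flags gives the group number directly. The names starts and y are hypothetical.

```r
y <- data.frame(end.group = c(0,0,1,0,0,1,1,0,0,0,1,1,1,0,1))

# Row 1 starts a group; any row after an end marker starts a new one.
starts <- c(1, head(y$end.group, -1))

# Cumulative sum of the start flags numbers the groups 1, 2, 3, ...
y$group <- cumsum(starts)

# Running count within each group, as before.
y$group.count <- ave(y$end.group, y$group, FUN = seq_along)

print(y)
```

This skips the renumbering step, at the cost of shifting the vector by one position.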
x <- data.frame(end.group=c(0,0,1,0,0,1,1,0,0,0,1,1,1,0,1))
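# positions of the rows that end a group
ends <- which(as.logical(x$end.group))
# size of each group: the gap between consecutive end positions
ends2 <- c(ends[1], diff(ends))
# sequence() gives 1..n for each size; unlist(sapply(ends2, seq)) breaks when all groups have the same size
transform(x, group.count = sequence(ends2), group = rep(seq_along(ends), times = ends2))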
end.group group.count group
1 0 1 1
2 0 2 1
3 1 3 1
4 0 1 2
5 0 2 2
6 1 3 2
7 1 1 3
8 0 1 4
9 0 2 4
10 0 3 4
11 1 4 4
12 1 1 5
13 1 1 6
14 0 1 7
15 1 2 7
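For readers who want to follow the which/diff approach step by step, here is a small sketch that prints the intermediate vectors for the same data. The names sizes, group.count and group are hypothetical. It assumes the last row is an end marker, since otherwise the sizes would not add up to the number of rows.

```r
end.group <- c(0,0,1,0,0,1,1,0,0,0,1,1,1,0,1)

# Row positions where a group ends.
ends <- which(end.group == 1)
print(ends)
# 3 6 7 11 12 13 15

# Group sizes: the first end position, then the gaps between ends.
sizes <- c(ends[1], diff(ends))
print(sizes)
# 3 3 1 4 1 1 2

# Running count within each group and the group identifier.
group.count <- sequence(sizes)
group <- rep(seq_along(sizes), times = sizes)
print(data.frame(end.group, group.count, group))
```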