
Sum up numbers by blocks in R

I want to sum up numbers by blocks, where a block is a run of consecutive rows with the same values of low and high.

Here is some sample data:

 data=matrix(c(0,0,0,1,1,0,1,1,1,1,1,0,0,1,0,0,1.2,2.3,1.3,1.5,2.5,2.1,2.3,1.2),
             ncol=3,dimnames=list(c(),c("low","high","time")))

     low high time
 [1,]   0    1  1.2
 [2,]   0    1  2.3
 [3,]   0    1  1.3
 [4,]   1    0  1.5
 [5,]   1    0  2.5
 [6,]   0    1  2.1
 [7,]   1    0  2.3
 [8,]   1    0  1.2

I want to get

       n  sum
 [1,]  3  4.8
 [2,]  2  4
 [3,]  1  2.1
 [4,]  2  3.5

without using any packages. How can I do that in R?

Alternatively, it would also be fine to get

       n/low n/high sum
 [1,]  0       3    4.8
 [2,]  2       0    4
 [3,]  0       1    2.1
 [4,]  2       0    3.5

I'm not sure why packages are ruled out, since they can make this much easier. In base R, we can build an index that numbers each run of consecutive rows sharing the same combination of the first two columns, then call aggregate with that index as the grouping variable. A final line sets the column names and flattens the result into a plain data frame:

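df1 <- as.data.frame(data)  # the matrix from the question, as a data frame
# number each run of consecutive identical (low, high) rows
ind <- with(rle(do.call(paste, df1[1:2])), rep(seq_along(values), lengths))
# per run: row count and total time (aggregate returns them as a matrix column)
a <- aggregate(df1$time, list(ind), function(x) c(length(x), sum(x)))[-1]
setNames(do.call(data.frame, a), c("n", "sum"))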

  n sum
1 3 4.8
2 2 4.0
3 1 2.1
4 2 3.5
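
To see what the index looks like, the first step can be run on its own. This is only a small sketch: it rebuilds the sample data under the name `df1` (an assumption, since the answer does not show how `df1` was created), and `key` and `runs` are names introduced here for illustration.

```r
# Rebuild the sample data as a data frame; the name df1 is assumed.
df1 <- data.frame(low  = c(0, 0, 0, 1, 1, 0, 1, 1),
                  high = c(1, 1, 1, 0, 0, 1, 0, 0),
                  time = c(1.2, 2.3, 1.3, 1.5, 2.5, 2.1, 2.3, 1.2))

# One string per row (e.g. "0 1"), so rle() can compare whole rows.
key <- do.call(paste, df1[1:2])

# rle() collapses consecutive equal keys into runs.
runs <- rle(key)
runs$values   # the key of each run
runs$lengths  # the number of rows in each run

# Expand the run number back to one entry per row.
ind <- rep(seq_along(runs$values), runs$lengths)
ind
```

Note that `runs$lengths` already holds the `n` column of the desired result, so the aggregate step is only needed for the sums.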

To illustrate how simple this becomes with data.table:

library(data.table)
setDT(df1)[, .(.N, sum(time)), by=rleid(low, high)]

Update

For the follow-up question, see @bgoldst's answer in the comments.

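A similar option, also using aggregate (here df is the sample data as a data frame). A constant column n=1 is summed to give the row count, and the grouping vector increases by one each time low changes: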

aggregate(cbind(n=1,sum=df$time), 
          by=list(c(0, cumsum(abs(diff(df$low))))), 
          FUN=sum)[-1]
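
The grouping vector in this answer looks only at `low`. That is enough for the sample data, where `high` always changes together with `low`. As a hedged sketch, assuming `df` holds the sample data as a data frame, the same idea could be extended to watch both columns; `g_low`, `changed` and `g_both` are names made up for this illustration.

```r
# Sample data from the question; the name df is assumed.
df <- data.frame(low  = c(0, 0, 0, 1, 1, 0, 1, 1),
                 high = c(1, 1, 1, 0, 0, 1, 0, 0),
                 time = c(1.2, 2.3, 1.3, 1.5, 2.5, 2.1, 2.3, 1.2))

# Grouping as in the answer: the counter steps up whenever low changes.
g_low <- c(0, cumsum(abs(diff(df$low))))

# Variant that starts a new block when either low or high changes.
# diff() on a matrix takes differences between consecutive rows.
changed <- rowSums(abs(diff(as.matrix(df[c("low", "high")])))) > 0
g_both  <- c(0, cumsum(changed))

res <- aggregate(cbind(n = 1, sum = df$time), by = list(g_both), FUN = sum)[-1]
res
```

For this data the two grouping vectors coincide, so either one gives the result the question asks for.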

I have solved the problem. I think my solution is a little complicated, but it works!

I generated each column using loops.

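1) First I count every change, so that each row gets the number of the block it belongs to:

 data<-data.frame(data)
 ind1<-vector(mode="numeric", length=0)
 ind1[1]<-1
 # start a new block whenever low or high differs from the previous row
 for(i in 2:nrow(data))
   ind1[i]<-if(all(data[i,1:2]==data[i-1,1:2])) ind1[i-1] else ind1[i-1]+1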

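2) Then I generated the sums, also with a loop:

ind2<-c(1.2,0,0,0)   # one slot per block, seeded with the first time value
k<-1

for(i in 2:nrow(data)){
  if(all(data[i,1:2]==data[i-1,1:2])){
     ind2[k]<-ind2[k]+data[i,3]   # same block: keep adding
  }else{
     k<-k+1                       # new block: move to the next slot
     ind2[k]<-ind2[k]+data[i,3]
  }
}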


  result<-cbind(data.frame(table(ind1))$Freq,ind2)

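I got some warnings at first. They arise because data[i,1:2]==data[i-1,1:2] returns two values where if and ifelse expect a single one. Wrapping the comparison in all() avoids them, and is required from R 4.2.0 on, where if treats a condition of length greater than one as an error.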
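
For comparison, the two loops above can be replaced by vectorised base R calls. The following is only a sketch of one way to do it, not the method used in this answer; `m`, `block` and `result2` are names introduced for illustration.

```r
# Sample data from the question as a data frame.
data <- data.frame(low  = c(0, 0, 0, 1, 1, 0, 1, 1),
                   high = c(1, 1, 1, 0, 0, 1, 0, 0),
                   time = c(1.2, 2.3, 1.3, 1.5, 2.5, 2.1, 2.3, 1.2))

m <- as.matrix(data[, 1:2])
n <- nrow(m)

# TRUE where a row differs from the previous row in low or high.
changed <- rowSums(m[-1, , drop = FALSE] != m[-n, , drop = FALSE]) > 0

# Block number for every row (plays the role of ind1 in the loop version).
block <- cumsum(c(TRUE, changed))

# rowsum() adds up each column within each block; the constant column
# n = 1 therefore becomes the row count.
result2 <- rowsum(cbind(n = 1, sum = data$time), block)
result2
```

`rowsum()` returns a matrix with one row per block, which is close to the layout asked for in the question.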

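I also found a similar option, where df is the sample data as a data frame. It sums all three columns within each block, which gives the second layout from the question (n/low, n/high, sum):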

 aggregate(df,list(c(0,cumsum(abs(diff(df$low))))),sum)[-1]

I find it more straightforward to understand.

