
(Using a custom function to) Sum above N rows in a datatable (dataframe) by groups

I need a function that, within each group of a data frame (data table), sums each row together with the N rows above it (N+1 rows in total).

An equivalent function for a vector would be something like the one below. (Please forgive me if it is inefficient.)

Function1 <- function(x, N) {
  y <- numeric(length(x))
  for (i in seq_along(x)) {
    # Window is the current element plus up to N elements before it;
    # near the start of x the window is truncated at position 1
    y[i] <- sum(x[max(1, i - N):i])
  }
  y
}

Function1(c(1,2,3,4,5,6),3)
#[1] 1 3 6 10 14 18 # Sums previous (above) 4 values (rows)
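
On the efficiency concern, the same windowed sum can be sketched without an explicit loop by subtracting a shifted cumulative sum from the cumulative sum. This is only an illustration: `roll_sum_cumsum` is a hypothetical name, not something from this post, and for non-integer input the subtraction may introduce small floating-point differences compared with summing each window directly.

```r
# Hypothetical helper: sum of the current element and up to N elements before it
roll_sum_cumsum <- function(x, N) {
  cs <- cumsum(x)
  # Cumulative sum shifted down by N + 1 positions, padded with zeros at the top
  lagged <- c(rep(0, N + 1), cs)[seq_along(cs)]
  cs - lagged
}

roll_sum_cumsum(c(1, 2, 3, 4, 5, 6), 3)
#[1]  1  3  6 10 14 18
```
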

I wanted to use this function with sapply, like below:

sapply(X=DF<-data.frame(A=c(1:10), B=2), FUN=Function1(N=3))

but I couldn't, because I could not figure out how to set a default for the x in my function. So I built another function for data frames.

Function2 <- function(x, N) {
  if (is.data.frame(x)) {
    y <- data.frame()
    for (j in seq_len(ncol(x))) {
      for (i in seq_len(nrow(x))) {
        # Sum row i and up to N rows above it in column j
        y[i, j] <- sum(x[max(1, i - N):i, j])
      }
    }
    return(y)
  }
}

DF<-data.frame(A=c(1:10), B=2)
Function2(DF, 2)
#    V1 V2
# 1   1  2
# 2   3  4
# 3   6  6
# 4   9  6
# 5  12  6
# 6  15  6
# 7  18  6
# 8  21  6
# 9  24  6
# 10 27  6

However, I still need to do this by group. For example, take the following data frame, which has a character column:

DF<-data.frame(Name=rep(c("A","B"),each=5), A=c(1:10), B=2)

I would like to apply my function by the group "Name", which would result in:

A   1  2
A   3  4
A   6  6
A   9  6
A  12  6
B   6  2
B  13  4
B  21  6
B  24  6
B  27  6


# Perform Function2 separately for groups A and B.

I was hoping to use my function with the data.table package (by=Groups), but couldn't figure out how.

What would be the best way to do this? (Also, it would be really nice if I could learn how to make my Function1 work with sapply.)

With data.table, we group by 'Name', loop through the columns of interest with lapply over .SD, and apply Function1 to each (here with N = 2). The columns of interest can be specified in .SDcols; here all the non-grouping columns are of interest, so we do not specify it.

library(data.table)
setDT(DF)[, lapply(.SD, Function1, 2), Name]
#    Name  A B
# 1:    A  1 2
# 2:    A  3 4
# 3:    A  6 6
# 4:    A  9 6
# 5:    A 12 6
# 6:    B  6 2
# 7:    B 13 4
# 8:    B 21 6
# 9:    B 24 6
#10:    B 27 6
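
On the side question about sapply: sapply (like lapply) passes any arguments given after FUN on to FUN, so N can be supplied by name and each column becomes x. The sketch below shows that, along with the same grouped call written with an explicit .SDcols. It assumes the DF from the question and repeats a compact Function1 so that it runs on its own; the names `res_sapply` and `res_grouped` are only for illustration.

```r
library(data.table)

# Compact version of Function1 from the question
Function1 <- function(x, N) {
  y <- numeric(length(x))
  for (i in seq_along(x)) {
    y[i] <- sum(x[max(1, i - N):i])
  }
  y
}

DF <- data.frame(Name = rep(c("A", "B"), each = 5), A = 1:10, B = 2)

# Extra arguments after FUN are forwarded to FUN, so N is passed by name;
# only the numeric columns are selected, since sum() fails on character data
res_sapply <- sapply(DF[c("A", "B")], Function1, N = 3)

# Grouped version with the columns of interest named explicitly in .SDcols
DT <- as.data.table(DF)
res_grouped <- DT[, lapply(.SD, Function1, N = 2), by = Name, .SDcols = c("A", "B")]
```

Here `res_sapply` is a numeric matrix with one column per input column, and `res_grouped` holds the same values as the grouped output shown above. Using `as.data.table` makes a copy, whereas `setDT(DF)` converts DF in place.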

