
Aggregate a data frame with more than one function in R

I am aggregating a dataframe using the aggregate function in R. I can get the means of each column aggregated on date and id easily like this:

aggregate(dataframe, by=list(dataframe$date, dataframe$id), FUN=mean, na.rm=TRUE)

How can I aggregate some columns as means and others as sums?

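The summaryBy function from the doBy package accepts several summary functions at once. It expects a formula with the columns to summarise on the left and the grouping columns on the right (a dot stands for all remaining columns):

require(doBy)
summaryBy(. ~ date + id, data = dataframe, FUN = c(mean, sum), na.rm = TRUE)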

If you want specific columns to have a specific function, the data.table package probably makes it the easiest.

require(data.table)
dt <- as.data.table(dataframe)

# set "V1" and "V2" ... "VX" to whichever columns you are interested in
dt.out <- dt[, list(s.v1=sum(V1), m.v2=mean(V2)),
             by=c("date", "id")]
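As a minimal sketch of the same idea on toy data: the columns value_a and value_b and their values are made up for illustration, and na.rm = TRUE is passed to each function to match the question.

```r
library(data.table)

# Toy data; value_a and value_b are hypothetical column names
dataframe <- data.frame(
  date    = as.Date(c("2020-01-01", "2020-01-01", "2020-01-02", "2020-01-02")),
  id      = c(1, 1, 1, 2),
  value_a = c(1, NA, 3, 4),
  value_b = c(10, 20, 30, 40)
)

dt <- as.data.table(dataframe)

# Sum value_a and average value_b within each date/id group
dt.out <- dt[, list(s.value_a = sum(value_a, na.rm = TRUE),
                    m.value_b = mean(value_b, na.rm = TRUE)),
             by = c("date", "id")]
print(dt.out)
```

Each entry in list() is evaluated once per group, so any mix of summary functions can be combined in a single call.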

Using your code, one straightforward way is to compute the means and the sums separately:

res1 <- aggregate(dataframe, by=list(dataframe$date, dataframe$id), FUN=mean, na.rm=TRUE)

and

res2 <- aggregate(dataframe, by=list(dataframe$date, dataframe$id), FUN=sum, na.rm=TRUE)

and then

res <- cbind(res1,res2)

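res now contains both the mean and the sum results, so you can pick whichever columns you need. Note that cbind keeps the identical column names from both results, so select by position or rename the columns first.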
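If only some columns need each function, a variation in base R is to aggregate just those columns and merge the two results on the grouping keys. This is a sketch on toy data; value_a and value_b are hypothetical column names.

```r
# Toy data; value_a and value_b are hypothetical column names
dataframe <- data.frame(
  date    = as.Date(c("2020-01-01", "2020-01-01", "2020-01-02", "2020-01-02")),
  id      = c(1, 1, 1, 2),
  value_a = c(1, NA, 3, 4),
  value_b = c(10, 20, 30, 40)
)

# Naming the list elements gives the grouping columns readable names
groups <- list(date = dataframe$date, id = dataframe$id)

# Mean of value_a and sum of value_b, each per date/id group
means <- aggregate(dataframe["value_a"], by = groups, FUN = mean, na.rm = TRUE)
sums  <- aggregate(dataframe["value_b"], by = groups, FUN = sum, na.rm = TRUE)

# Join on the grouping keys instead of binding columns side by side
res <- merge(means, sums, by = c("date", "id"))
print(res)
```

Merging on date and id avoids the duplicated column names that cbind produces and does not rely on both results having the same row order.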

