简体   繁体   中英

Collapse a matrix to sum values in one column by values in another

I have a matrix with three columns: county, date, and number of ED visits. The dates repeat for each county, like this (just an example):

  County A  1/1/2012  2
  County A  1/2/2012  0
  County A  1/3/2012  5
  ... etc.
  County B  1/1/2012  3
  County B  1/2/2012  4
  ... etc.

I would like to collapse this matrix to sum the visits from all counties for each date. So it would look like this:

1/1/2012  5
1/2/2012  4
etc.

I am trying to use the "table()" function in R but can't seem to get it to operate on visits by date in this manner. When I do "table(dt$date, dt$Visits)" it gives me a table of frequencies like this:

             0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
  2011-01-01 3 1 2 0 1 1 0 2 0 0  0  0  0  0  0  0
  2011-01-02 2 3 1 0 0 1 0 0 1 0  2  0  0  0  0  0
  2011-01-03 3 1 1 2 1 0 0 0 0 1  0  0  0  0  1  0

Any suggestions? Is there a better function to use, perhaps a "sum" of some sort?

Thanks!

table() is not for summation of values, it is for record counts. If you want to use tapply you get a table output and can apply the sum function. Or you can use ave to get a sum vector of equal length as the dataframe. Perhaps:

  with( EDcounts, tapply(EDcounts[[3]], EDcounts[[2]], sum, na.rm=TRUE) )

You maybe able to coax xtabs into summation of visit counts as well if you put the "visits" column name on the LHS of the formula.

As @DWin states, table() is not for summation, but for record counts.

I give three examples of approaches, using plyr , data.table and aggregate

all_data <- expand.grid(country = paste('Country', LETTERS[1:3]), 
  date = seq(as.Date('2012/01/01'), as.Date('2012/12/31'), by = 1) )

all_data[['ed_visits']] <- rpois(nrow(all_data), lambda = 5)



# using plyr

library(plyr)

by_date_plyr <- ddply(all_data, .(date), summarize, visits = sum(ed_visits))


# using data.table
library(data.table)
all_DT <- data.table(all_data)
by_date_dt <- all_DT[, list(visits = sum(ed_visits)), by = 'date' ]

# using aggregate
by_date_base <- aggregate(ed_visits ~ date, data = all_data, sum)

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM