rowwise tabulation in data.table

I have a data.table as follows:

   station       w_1       w_2
1:    1757     ar_2d lm_h_step
2:    2171 lm_h_step lm_h_step
3:    2812 lm_h_step lm_h_step
4:    4501 lm_h_step lm_h_step
5:    4642     ar_2d lm_h_step
6:    5029     ar_2d lm_h_step
7:    5480 lm_h_step lm_h_step
8:    5779     ar_2d     ar_2d
9:    5792     ar_1d     ar_2d

I'd like to tabulate the frequency of the methods per station.
The expected result would be:

          1757  2171  2812 ...
lm_h_step    1     2     2
ar_2d        1     0     0 
ar_1d        0     0     0 ...

What I have tried so far:

apply(dat,1,table)

produces the right counts, but the output is not properly formatted.

Any ideas?

Dput of the data:

structure(list(
  station = c(1757L, 2171L, 2812L, 4501L, 4642L, 5029L, 5480L, 5779L, 5792L),
  w_1 = c("ar_2d", "lm_h_step", "lm_h_step", "lm_h_step", "ar_2d",
          "ar_2d", "lm_h_step", "ar_2d", "ar_1d"),
  w_2 = c("lm_h_step", "lm_h_step", "lm_h_step", "lm_h_step", "lm_h_step",
          "lm_h_step", "lm_h_step", "ar_2d", "ar_2d")),
  .Names = c("station", "w_1", "w_2"),
  class = c("data.table", "data.frame"),
  row.names = c(NA, -9L))

Try a dcast/melt combination.

For data.table v >= 1.9.5, use this:

dcast(melt(dat, "station"), value ~ station, length)
#        value 1757 2171 2812 4501 4642 5029 5480 5779 5792
# 1:     ar_1d    0    0    0    0    0    0    0    0    1
# 2:     ar_2d    1    0    0    0    1    1    0    2    1
# 3: lm_h_step    1    2    2    2    1    1    2    0    0
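
To see what the one-liner above is doing, it can help to run the two steps separately. This is only a sketch: the data are rebuilt from the table printed in the question (assuming row 9 has w_1 = "ar_1d", as shown there), and the names long and wide are illustrative.

```r
library(data.table)

# Data rebuilt from the table printed in the question
# (assumption: row 9 has w_1 = "ar_1d", as in that printout)
dat <- data.table(
  station = c(1757L, 2171L, 2812L, 4501L, 4642L, 5029L, 5480L, 5779L, 5792L),
  w_1 = c("ar_2d", "lm_h_step", "lm_h_step", "lm_h_step", "ar_2d",
          "ar_2d", "lm_h_step", "ar_2d", "ar_1d"),
  w_2 = c("lm_h_step", "lm_h_step", "lm_h_step", "lm_h_step", "lm_h_step",
          "lm_h_step", "lm_h_step", "ar_2d", "ar_2d")
)

# Step 1: reshape to long format, one row per (station, w_* column) pair.
# The result has the columns station, variable and value.
long <- melt(dat, id.vars = "station")

# Step 2: spread the stations into columns and count how often
# each method (the "value" column) occurs for each station.
wide <- dcast(long, value ~ station, fun.aggregate = length)
```

Here long has 18 rows (9 stations times 2 method columns), and wide has one row per method and one column per station, as in the output shown above.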

For data.table v < 1.9.5 you will also need to load reshape2 and explicitly use dcast.data.table (because reshape2::dcast isn't generic and doesn't have a dcast.data.table method).

reshape2::melt, on the other hand, is generic (see methods(melt)) and has a melt.data.table method, so you won't need to tell it anything. It will know which method to use depending on the class of dat.

require(reshape2)
dcast.data.table(melt(dat, "station"), value ~ station, length)
#        value 1757 2171 2812 4501 4642 5029 5480 5779 5792
# 1:     ar_1d    0    0    0    0    0    0    0    0    1
# 2:     ar_2d    1    0    0    0    1    1    0    2    1
# 3: lm_h_step    1    2    2    2    1    1    2    0    0
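
The point about generics can be illustrated with a toy S3 generic. This is a sketch of how S3 dispatch picks a method from the class of its argument; describe and its methods are hypothetical names made up for the illustration and are not part of reshape2 or data.table.

```r
library(data.table)

# Hypothetical generic: UseMethod() dispatches on class(x)
describe <- function(x, ...) UseMethod("describe")
describe.data.frame <- function(x, ...) "data.frame method"
describe.data.table <- function(x, ...) "data.table method"

df <- data.frame(a = 1)
dt <- as.data.table(df)   # class is c("data.table", "data.frame")

describe(df)  # dispatches to describe.data.frame
describe(dt)  # dispatches to describe.data.table, the first matching class
```

A function that is not generic has no such lookup step, which is why a specific function has to be called by name in that case.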

If you are not strict about using only data.table methods, you can also use reshape2::recast (see @shadow's comment). It is a wrapper for the solution above, but it uses reshape2::dcast instead of dcast.data.table and therefore returns a data.frame instead of a data.table:

recast(dat, value ~ station, id.var = "station", length)
#       value 1757 2171 2812 4501 4642 5029 5480 5779 5792
# 1     ar_1d    0    0    0    0    0    0    0    0    1
# 2     ar_2d    1    0    0    0    1    1    0    2    1
# 3 lm_h_step    1    2    2    2    1    1    2    0    0
