R, DataFrame - group by multiple rows

I have a data.frame of posts that looks like this:

 post_id   group_id hour(when posted) likes
 1         1        13                  5
 2         1        13                  6
 3         1        23                  3
 4         2        12                  30
 5         2        13                  34
 6         2        22                  10

I want to plot the distribution of likes by hour within each group, so I need a data.frame like this one:

          0 ... 12 13 ... 22 23   <- hours
group#1          0 11      0  3   <- sum of likes in group #i in hour xx
group#2         30 34     10  0

How can I group posts by group and by hour?

Assuming your data.frame is called "mydf" and its hour column is named "hour", you can try xtabs, since you only need sums:

> xtabs(likes ~ group_id + hour, mydf)
        hour
group_id 12 13 22 23
       1  0 11  0  3
       2 30 34 10  0

To get a column for every hour, including hours whose likes sum to 0 in every group, convert the "hour" column to a factor with all levels (0 to 23) first.
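
A minimal sketch of the factor approach, assuming the sample data from the question and that the hour column is named "hour" (the question's header reads "hour(when posted)", so the name is an assumption):

```r
# Sample data from the question; the column name "hour" is assumed.
mydf <- data.frame(
  post_id  = 1:6,
  group_id = c(1, 1, 1, 2, 2, 2),
  hour     = c(13, 13, 23, 12, 13, 22),
  likes    = c(5, 6, 3, 30, 34, 10)
)

# Declare all 24 hours as levels so that hours without posts keep a column.
mydf$hour <- factor(mydf$hour, levels = 0:23)

# Sum of likes per group and hour; empty group/hour cells come out as 0.
tab <- xtabs(likes ~ group_id + hour, data = mydf)
dim(tab)  # one row per group, one column per hour level
```

If a plain data.frame is needed rather than a table object, as.data.frame.matrix(tab) should convert it while keeping the wide layout.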

Another convenient alternative is to use dcast from the "reshape2" package.
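
A possible dcast call, sketched under the same assumptions (sample data from the question, hour column named "hour") and assuming the "reshape2" package is installed:

```r
library(reshape2)  # assumed to be installed

# Sample data from the question; the column name "hour" is assumed.
mydf <- data.frame(
  post_id  = 1:6,
  group_id = c(1, 1, 1, 2, 2, 2),
  hour     = c(13, 13, 23, 12, 13, 22),
  likes    = c(5, 6, 3, 30, 34, 10)
)

# One row per group, one column per observed hour, cells hold summed likes.
# fill = 0 makes the value for missing group/hour combinations explicit.
wide <- dcast(mydf, group_id ~ hour, value.var = "likes",
              fun.aggregate = sum, fill = 0)
wide
```

Unlike xtabs, this returns a data.frame directly, with "group_id" as an ordinary column.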
