
Aggregating over 2 columns of a dataframe in R

My dataframe is as follows

TreeID    Species    PlotNo    Basalarea
12345       A          1         120
13242       B          7         310
14567       D          8         250
13245       B          1         305
13426       B          1         307
13289       A          3         118

I used

newdata<- aggregate(Basalarea~PlotNo+Species, data, sum, na.rm=TRUE)

to aggregate all the values such that

 newdata
     Species    PlotNo    Basalarea
       A          1         120
       A          3         118
       B          1         some value
       B          7         310
       D          8         250

This is great but I would like a dataframe such that

PlotNo    A       B            D
 1        120    some value    0
 3        118    0             0
 7        0      310           0
 8        0      0            250

How do I obtain the above dataframe?

We can use dcast from reshape2 to convert from long to wide format, with sum as the fun.aggregate. Here df1 is the original data frame from the question, not the aggregated newdata.

library(reshape2)
dcast(df1, PlotNo ~ Species, value.var = 'Basalarea', fun.aggregate = sum)
#  PlotNo   A   B   D
#1      1 120 612   0
#2      3 118   0   0
#3      7   0 310   0
#4      8   0   0 250

Or a base R option is xtabs. By default it sums 'Basalarea' for each combination of 'PlotNo' and 'Species'.

xtabs(Basalarea~PlotNo+Species, df1)
#     Species
#PlotNo   A   B   D
#     1 120 612   0
#     3 118   0   0
#     7   0 310   0
#     8   0   0 250
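
Since the question asks for a dataframe and xtabs returns a table, here is a minimal sketch of one way to convert the result. The df1 below is rebuilt from the rows shown in the question, and the names tab and wide are only illustrative.

```r
# Data copied from the question; df1 is the name used in the answer's code.
df1 <- data.frame(
  TreeID    = c(12345, 13242, 14567, 13245, 13426, 13289),
  Species   = c("A", "B", "D", "B", "B", "A"),
  PlotNo    = c(1, 7, 8, 1, 1, 3),
  Basalarea = c(120, 310, 250, 305, 307, 118)
)

# xtabs returns a contingency table (class "xtabs"/"table"), not a data frame.
tab <- xtabs(Basalarea ~ PlotNo + Species, data = df1)

# as.data.frame.matrix keeps the wide layout; plain as.data.frame()
# on a table would give the long format back.
wide <- as.data.frame.matrix(tab)

# Move the plot numbers from the row names into a regular column.
wide <- cbind(PlotNo = as.numeric(rownames(wide)), wide)
rownames(wide) <- NULL
wide
```

Combinations that do not occur in the data (for example Species D in PlotNo 1) stay at 0, as in the desired output.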

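Or another base R option is tapply. Unlike dcast and xtabs, it returns NA rather than 0 for combinations that do not occur in the data.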

with(df1, tapply(Basalarea, list(PlotNo, Species), FUN=sum))
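
If zeros are wanted in the tapply result, as in the desired output, the empty cells can be filled afterwards. This is a sketch under the assumption that df1 holds the rows from the question; the name m is only illustrative.

```r
# Data copied from the question; df1 is the name used in the answer's code.
df1 <- data.frame(
  TreeID    = c(12345, 13242, 14567, 13245, 13426, 13289),
  Species   = c("A", "B", "D", "B", "B", "A"),
  PlotNo    = c(1, 7, 8, 1, 1, 3),
  Basalarea = c(120, 310, 250, 305, 307, 118)
)

# tapply returns a matrix with PlotNo as rows and Species as columns;
# cells with no observations are NA.
m <- with(df1, tapply(Basalarea, list(PlotNo, Species), FUN = sum))

# Replace the empty cells with 0 to match the layout asked for.
m[is.na(m)] <- 0
m
```

Note that this also turns a genuinely missing sum into 0, so it only fits when Basalarea itself has no NA values, as in the sample data.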
