I have a large database which I want to simplify by grouping observations into transects. I have used the following code:
library(dplyr)
AGGDATA<-DATA %>%
select(Habitat,Transect,Number,Abundance) %>%
group_by(Transect) %>%
mutate(TotalNum = sum(Number),TotalAbund = sum(Abundance))
Sample output for DATA$Abundance looks like this:
[1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[24] 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 16 9 6 1 21 5
[47] 83 32 10 1 24 2 16 85 7 4 0 21 1 7 7 9 4 76 0 1 2 2 1
[70] 9 2 0 3 6 41 4 3 5 0 0 0 0 0 0 0 0 0 0 0 0 1 0
[93] 0 0 0 0 0 0 0 0 0 78 14 3 1 10 44 5 0 2 2 31 1 3 18
And sample output for AGGDATA$TotalAbund looks like this:
[1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
[19] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
[37] 1 1 1 1 351 351 351 351 351 351 351 351 351 351 351 351 351 351
[55] 351 351 351 351 351 351 175 175 175 175 175 175 175 175 175 175 175 175
[73] 175 175 175 175 175 175 175 175 1 1 1 1 1 1 1 1 1 1
The code has summed the DATA$Abundance values for each transect. However, I would like one value per transect rather than the same value repeated for every observation in that transect. I'm still new to this so I hope that makes sense.
Can anyone help? Thanks!!
I would recommend using the data.table library, which is much faster on large data. Since you didn't provide a data set, I could not test this, but your solution could look like:
library(data.table)
DATA <- data.table(DATA)
AGGDATA <- DATA[, .(TotalNum = sum(Number),TotalAbund = sum(Abundance)), by = Transect]
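If you would rather stay with dplyr, the repeated values come from mutate(), which keeps every row and adds the group total to each one. summarise() collapses each group to a single row instead. Below is a minimal sketch on made-up toy data (the values are hypothetical; only the column names come from the question). It assumes each transect belongs to exactly one habitat, so grouping by Habitat as well as Transect simply carries Habitat through to the result.

```r
library(dplyr)

# Hypothetical toy data standing in for DATA; values are invented for illustration
DATA <- data.frame(
  Habitat   = c("A", "A", "A", "B", "B"),
  Transect  = c(1, 1, 2, 3, 3),
  Number    = c(1, 2, 3, 4, 5),
  Abundance = c(0, 1, 16, 9, 6)
)

AGGDATA <- DATA %>%
  group_by(Habitat, Transect) %>%          # one group per transect (Habitat kept as a key)
  summarise(TotalNum   = sum(Number),      # summarise() returns one row per group,
            TotalAbund = sum(Abundance)) %>%  # unlike mutate(), which keeps every row
  ungroup()                                # drop any leftover grouping

print(AGGDATA)
```

If a transect can span several habitats, group by Transect alone; Habitat will then be dropped from the output.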