I have a species distribution dataset based on museum collections. What I want to do is list the collection towns (factor) where more than 2 species (factor) have been collected.
Thank you!
Generate 30 observations from three cities, with species drawn from 20 possibilities (labelled as numbers for easy generation):
df <- data.frame(
  city    = as.factor(rep(c('NY', 'CH', 'LA'), 10)),
  species = as.factor(sample(1:20, 30, replace = TRUE))
)
Peek at the data:
table(df$city, df$species)
Using plyr:
Count the observations of each species in each city using ddply from the plyr package, and return the species with more than one observation:
library(plyr)
ddply(df, .(city), .fun = function(x) {
  counts <- count(x$species)   # plyr::count returns columns x and freq
  counts[counts$freq > 1, ]
})
resulting in
  city  x freq
1   CH 10    3
2   CH 12    2
3   LA  9    2
4   NY  1    2
5   NY 13    3
where x is the species, and freq is the number of observations of the species in the city.
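The same per-city species counts can also be obtained without any extra packages. The following is only a sketch in base R on a small hand-made data frame (the city and species values are made up for illustration, not taken from the question's data):

```r
# Hypothetical toy data: one row per collected specimen.
df <- data.frame(
  city    = c("NY", "NY", "NY", "CH", "CH", "LA"),
  species = c("a", "a", "b", "c", "c", "d")
)

# Cross-tabulate city by species, then flatten to one row per pair.
counts <- as.data.frame(
  table(city = df$city, species = df$species),
  responseName = "freq",
  stringsAsFactors = FALSE
)

# Keep the city/species pairs observed more than once.
repeated <- counts[counts$freq > 1, ]
print(repeated)
```

Unlike the ddply call, the cross-tabulation also creates rows for pairs that never occur (with freq 0); the filter drops them here.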
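Using dplyr:
library(dplyr)
df %>%
  dplyr::count(city, species) %>%   # explicit namespace avoids a clash with plyr::count if both are loaded
  filter(n > 1)                     # dplyr names the count column n, not freq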
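Both snippets above count observations per species within a city. If the goal is literally the list of towns where more than 2 distinct species were collected, one possible approach is to count unique species per town. This is a minimal base R sketch on made-up data (the values and the names species_per_city and towns are illustrative assumptions):

```r
# Hypothetical toy data: one row per collected specimen.
df <- data.frame(
  city    = c("NY", "NY", "NY", "NY", "CH", "CH", "CH", "LA", "LA"),
  species = c("a",  "b",  "c",  "a",  "a",  "a",  "b",  "x",  "y")
)

# Number of distinct species recorded in each city.
species_per_city <- tapply(df$species, df$city, function(s) length(unique(s)))

# Cities where more than 2 distinct species were collected.
towns <- names(species_per_city)[species_per_city > 2]
print(towns)
```

With dplyr loaded, a similar result should be obtainable by grouping on city and summarising with n_distinct(species) before filtering.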