I am trying to figure out a way to display only the top three bars of a data set. To keep things simple, I'm using the diamonds data set from ggplot2 to illustrate what I'd like to do. First, I reordered the levels of cut from most to least frequent.
library(ggplot2) ## provides the diamonds data set and ggplot()
library(data.table)
diamonds <- data.table(diamonds)
diamonds1 <- within(diamonds, cut <- factor(cut, levels=names(sort(table(cut), decreasing=TRUE))))
Then, I plotted.
ggplot(diamonds1, aes(cut, fill=cut)) + geom_bar(position="dodge") + guides(fill=FALSE) + ylab("Count") + xlab("Cut")
And I got this:
But instead of seeing all of the bars, I just want to see the top three. Additionally, I want this to be repeatable, so if the data set changes and there is a different top three, I can use the same code to create the correct top three. Is there any way to do this?
Sure, you can define xlim(). Add:
+ xlim('Ideal', 'Premium', 'Very Good')
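The hard-coded limits above will not follow the data if the top three change. A minimal sketch of one way to compute them instead, assuming the counts come from table() on the cut column (top3 and p are illustrative names, not from the original post):

```r
library(ggplot2)  # provides the diamonds data set and ggplot()

# Names of the three most frequent cut levels, most frequent first
top3 <- names(sort(table(diamonds$cut), decreasing = TRUE))[1:3]

# Restrict the discrete x axis to those levels; rows with other
# cut values are dropped from the plot (ggplot2 warns about this)
p <- ggplot(diamonds, aes(cut, fill = cut)) +
  geom_bar() +
  scale_x_discrete(limits = top3) +
  guides(fill = "none") +
  xlab("Cut") + ylab("Count")
print(p)
```

Because top3 is recomputed from the data each time, the same code picks up a different top three if the data set changes.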
Edit after @Arun's comments below: A more direct approach would be to subset the data before you feed it to ggplot(). You can use data.table's features to make this very fast:
setkey(diamonds, cut) ## needed for fast subsetting and grouping
tt <- diamonds[, list(count=.N), by=cut] ## same as table(diamonds$cut) but faster
cut.values <- tt[order(-count), cut][1:3] ## sort by descending count, then select the top 3 cut values
ggplot(diamonds[J(cut.values)], ... ## run the same plot commands on subset of data
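For reference, here is a self-contained sketch of the subset-first approach, written out end to end. It assumes the plot should match the one in the question; the names dt, counts, top_cuts, and top_dt are illustrative, and it filters with %in% rather than a keyed join, which is a simpler but not necessarily faster alternative.

```r
library(ggplot2)     # provides the diamonds data set and ggplot()
library(data.table)

dt <- as.data.table(diamonds)

# Count rows per cut and take the three most frequent values
counts <- dt[, .(count = .N), by = cut]
top_cuts <- as.character(counts[order(-count)][1:3, cut])

# Keep only rows in the top three, and order the factor levels
# by frequency so the bars appear from largest to smallest
top_dt <- dt[cut %in% top_cuts]
top_dt[, cut := factor(as.character(cut), levels = top_cuts)]

p <- ggplot(top_dt, aes(cut, fill = cut)) +
  geom_bar() +
  guides(fill = "none") +
  xlab("Cut") + ylab("Count")
print(p)
```

Since top_cuts is derived from the counts, rerunning this on changed data yields whatever the new top three are.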