
Using ggplot2, how can I create a histogram or bar plot where the last bar is the count of all values greater than some number?

I would like to plot a histogram of my data to show its distribution, but I have a few outliers that are really high compared to most of the values, which are < 1.00. Rather than having one or two bars scrunched up at the far left and then nothing until the very far right side of the graph, I'd like to have a histogram with everything except the outliers and then add a bar at the end where the label underneath it is ">100%". I can do that with ggplot2 using geom_bar() like this:

 X <- c(rnorm(1000, mean = 0.5, sd = 0.2), 
   rnorm(10, mean = 10, sd = 0.5))
 Data <- data.frame(table(cut(X, breaks=c(seq(0,1, by=0.05), max(X)))))

 library(ggplot2)
 ggplot(Data, aes(x = Var1, y = Freq)) + geom_bar(stat = "identity") +
  scale_x_discrete(labels = paste0(c(seq(5,100, by = 5), ">100"), "%"))

The problem is that, for the size I need this to be, the labels end up overlapping or needing to be plotted at an angle for readability. I don't really need all of the bars labeled. Is there some way to either

  • A) plot this in a different manner other than geom_bar() so that I don't need to manually add that last bar or
  • B) only label some of the bars?

I will try to answer B.

I don't know of a built-in parameter that does B), but you can define a small helper function that does it for you, e.g.:

library(ggplot2)
X <- c(rnorm(1000, mean = 0.5, sd = 0.2), 
       rnorm(10, mean = 10, sd = 0.5))
Data <- data.frame(table(cut(X, breaks=c(seq(0,1, by=0.05), max(X)))))

# Blank out every n-th label (positions n, 2n, 3n, ...) so that it is not drawn
remove_elem <- function(x, n) {
  for (i in seq_along(x)) {
    if (i %% n == 0) x[i] <- ""
  }
  x
}

# Build the initial labels outside ggplot (same way as before)
labels <- paste0(c(seq(5, 100, by = 5), ">100"), "%")
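As a side note, the loop can probably be replaced by a single vectorized assignment. The sketch below is only an illustration; the name `blank_every_nth` is hypothetical and not part of the original answer.

```r
# Hypothetical vectorized equivalent of remove_elem(); the name is illustrative
blank_every_nth <- function(x, n) {
  # The logical index is TRUE at positions n, 2n, 3n, ...
  x[seq_along(x) %% n == 0] <- ""
  x
}

labels <- paste0(c(seq(5, 100, by = 5), ">100"), "%")
short_labels <- blank_every_nth(labels, 2)
short_labels
```

With 21 labels and `n = 2`, the odd positions survive, so the final ">100%" label is kept.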

Now use remove_elem() inside the ggplot call:

ggplot(Data, aes(x = Var1, y = Freq)) + geom_bar(stat = "identity") +
  scale_x_discrete(labels = remove_elem(labels,2))

This outputs the same bar chart, with every second x-axis label left blank.
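Another way to label only some bars, instead of blanking labels, is the `breaks` argument of `scale_x_discrete()`: only the listed factor levels get a tick and a label. The following is a minimal sketch under that assumption; the `keep` index and the fixed seed are illustrative additions, not part of the original answer.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed so the sketch is reproducible
X <- c(rnorm(1000, mean = 0.5, sd = 0.2),
       rnorm(10, mean = 10, sd = 0.5))
Data <- data.frame(table(cut(X, breaks = c(seq(0, 1, by = 0.05), max(X)))))

labels <- paste0(c(seq(5, 100, by = 5), ">100"), "%")

# Keep every second level, starting with the first, so the last bar stays labeled
keep <- seq(1, nlevels(Data$Var1), by = 2)

p <- ggplot(Data, aes(x = Var1, y = Freq)) +
  geom_bar(stat = "identity") +
  scale_x_discrete(breaks = levels(Data$Var1)[keep], labels = labels[keep])
p
```

Unlike the blank-label approach, this also drops the tick marks of the unlabeled bars, which may or may not be what you want.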

I don't know if this is what you are looking for but it does the trick!
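For A), which this answer does not cover, one possible approach is to move every outlier into an extra overflow bin and let `geom_histogram()` do the counting. This is only a sketch under stated assumptions: the overflow position 1.025, the column name `x`, the chosen axis breaks, and the fixed seed are all illustrative choices.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed so the sketch is reproducible
X <- c(rnorm(1000, mean = 0.5, sd = 0.2),
       rnorm(10, mean = 10, sd = 0.5))

# Move every value above 1 to the middle of an extra overflow bin (1, 1.05]
capped <- data.frame(x = ifelse(X > 1, 1.025, X))

p <- ggplot(capped, aes(x = x)) +
  # 20 regular bins on [0, 1] plus one overflow bin; values below 0 fall
  # outside the bins and are not counted, much like the NAs produced by cut()
  geom_histogram(breaks = seq(0, 1.05, by = 0.05)) +
  scale_x_continuous(
    breaks = c(0, 0.25, 0.5, 0.75, 1.025),
    labels = c("0%", "25%", "50%", "75%", ">100%")
  )
p
```

Because the x axis is continuous here, only a handful of breaks need labels, which also sidesteps the overlapping-label problem.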

