geom_vline vertical line on x-axis with categorical data: ggplot2

I have data that is binned into classes, as described in this article: https://www.r-bloggers.com/from-continuous-to-categorical/ This makes it easier to see which values are common. After creating those classes I want to draw a bar chart of the class frequencies, which I do with the following example code:

library(ggplot2)

set.seed(1)
df.v <- data.frame(val = rnorm(1000, mean = 4))  # sd left at its default of 1
df.v$val.clss <- cut(df.v$val, seq(min(df.v$val), max(df.v$val), 1))
p1 <- ggplot(data = df.v)+
  geom_bar(aes(val.clss))
plot(p1)

What I cannot figure out is how to add a vertical line exactly between the two bars around 4, so that the line sits precisely at that x-axis value. I found this question, but it did not help me: How to get a vertical geom_vline to an x-axis of class date? Any help is appreciated. Maybe I am too new to adapt that solution to my data.frame; if so, please excuse the question.

Do you want something like this?

p1 <- ggplot(data = df.v)+
  geom_bar(aes(val.clss)) + geom_vline(xintercept = 3.5, col='red', lwd=2)
plot(p1)
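
The value 3.5 works because ggplot2 places the levels of a discrete x variable at the integer positions 1, 2, 3, ... in level order, so a position halfway between two integers falls between two bars. A minimal base-R sketch of that mapping, using made-up class labels rather than the ones from the question:

```r
# Hypothetical class labels, for illustration only
cls <- factor(c("(1,2]", "(2,3]", "(3,4]", "(4,5]", "(3,4]"),
              levels = c("(1,2]", "(2,3]", "(3,4]", "(4,5]"))

# Each level's integer code is also its position on a discrete x-axis
pos <- setNames(seq_along(levels(cls)), levels(cls))

# Halfway between the third and the fourth bar
between_3_and_4 <- (pos[["(3,4]"]] + pos[["(4,5]"]]) / 2
between_3_and_4
```

Passing that number as xintercept to geom_vline should put the line in the gap between the two bars.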


A more generic solution could be this:

df.v <- data.frame(val = rnorm(1000, mean=15, sd=4))
df.v$val.clss <- cut(df.v$val, seq(min(df.v$val), max(df.v$val), 1))

lvls <- levels(df.v$val.clss)
lvls
[1] "(2.97,3.97]" "(3.97,4.97]" "(4.97,5.97]" "(5.97,6.97]" "(6.97,7.97]" "(7.97,8.97]" "(8.97,9.97]" "(9.97,11]"   "(11,12]"     "(12,13]"    
[11] "(13,14]"     "(14,15]"     "(15,16]"     "(16,17]"     "(17,18]"     "(18,19]"     "(19,20]"     "(20,21]"     "(21,22]"     "(22,23]"    
[21] "(23,24]"     "(24,25]"     "(25,26]"     "(26,27]"     "(27,28]"     "(28,29]"     "(29,30]"    

vline.level <- '(18,19]' # you want to draw line here, right before 18

p1 <- ggplot(data = df.v)+
  geom_bar(aes(val.clss)) + geom_vline(xintercept = which(lvls == vline.level) - 0.5, col='red', lwd=2) +
  theme(axis.text.x = element_text(angle=90, vjust = 0.5))
plot(p1)
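
When the classes come from cut(), the position can also be derived from the numeric break instead of the label text. This is only a sketch: it assumes the breaks vector passed to cut() is kept around, and brks and vline_at are names introduced here for illustration. Level k covers (brks[k], brks[k+1]], so the boundary at brks[k] sits at k - 0.5 on the discrete axis.

```r
set.seed(1)
val  <- rnorm(1000, mean = 15, sd = 4)
brks <- seq(floor(min(val)), ceiling(max(val)), 1)  # integer breaks covering all values
cls  <- cut(val, brks)

# Hypothetical helper: discrete-axis position of the break closest to `value`
vline_at <- function(value, breaks) {
  which.min(abs(breaks - value)) - 0.5
}

x18 <- vline_at(18, brks)

# The level immediately to the right of the line starts at 18
right_level <- levels(cls)[x18 + 0.5]
right_level
```

The result can be passed as xintercept = x18. It only lines up with the bars if no empty class is dropped from the axis, for example by adding scale_x_discrete(drop = FALSE).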


If you want to choose the middlemost level,

length(lvls)
#[1] 27
# choose the middlemost level, since length(lvls) is odd in this case, the midpoint will be ceiling(length(lvls)/2)
vline.level <- lvls[ceiling(length(lvls)/2)] 

p1 <- ggplot(data = df.v)+
  geom_bar(aes(val.clss)) + geom_vline(xintercept = which(lvls == vline.level) - 0.5, col='red', lwd=2) +
  theme(axis.text.x = element_text(angle=90, vjust = 0.5))
plot(p1)


If you know the labels of the two bars you want the line to go between, you can convert them to their numeric positions (the integer codes of the factor levels) and pass the midpoint:

myLoc <- 
  (which(levels(df.v$val.clss) == "(2.99,3.99]") +
     which(levels(df.v$val.clss) == "(3.99,4.99]")) / 
  2


p1 +
  geom_vline(xintercept = myLoc)

If the plot skips classes that have no observations, the line will land in the wrong place, so you should probably make sure that all levels of the factor are plotted. When you have binned continuous data, it is best not to drop intermediate levels.

p1 +
  geom_vline(xintercept = myLoc) +
  scale_x_discrete(drop = FALSE)

Alternatively, you could drop the missing levels from the data altogether (before plotting and before calculating myLoc):

df.v <- droplevels(df.v)

Then the factor will only include the levels that would be plotted.
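
Empty classes matter because positions computed from levels() count every level, while the axis by default only shows levels that occur in the data. A small base-R illustration with made-up labels:

```r
# Made-up factor in which the class "(2,3]" has no observations
cls <- factor(c("(1,2]", "(3,4]", "(3,4]", "(4,5]"),
              levels = c("(1,2]", "(2,3]", "(3,4]", "(4,5]"))

# Position counted over all levels (matches the axis with drop = FALSE)
pos_all <- which(levels(cls) == "(3,4]")

# Position counted over used levels only (matches the default axis)
pos_used <- which(levels(droplevels(cls)) == "(3,4]")

c(all_levels = pos_all, used_levels = pos_used)
```

The two positions differ by one here, which is the size of the shift the line would show if the levels used for the calculation and the levels on the axis do not agree.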

As a final option, you could just use geom_histogram, which does the binning automatically but keeps the x-axis on the original continuous scale, so the line can be placed at the actual value.

ggplot(df.v, aes(val)) +
  geom_histogram(binwidth = 1) +
  geom_vline(xintercept = 4)
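
With geom_histogram the bin edges do not necessarily fall on whole numbers, so a line at 4 may cut through a bar instead of sitting between two. A sketch that pins the edges with the boundary argument so that bins run from integer to integer; the data here mimics the question's example, and the plot is only built if ggplot2 is installed:

```r
set.seed(1)
df.v <- data.frame(val = rnorm(1000, mean = 4))

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(df.v, aes(val)) +
    geom_histogram(binwidth = 1, boundary = 0) +  # bin edges at ..., 3, 4, 5, ...
    geom_vline(xintercept = 4, colour = "red")
  # call print(p) to draw the plot
}
```

Whether boundary = 0 is the right choice depends on where the class limits should be; any value that is a valid bin edge for the data would do.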
