
How to map a continuous variable to the y-axis with geom_bar?

I have some genomic data that is already binned on a per-chromosome basis and each bin has a value between 0 and 1. Here is an example:

Chr Bin Value
1   1   0.40 
1   2   0.12
1   3   0.45
1   4   0.67
2   1   0.32
2   2   0.12
3   1   0.22
3   2   0.44
3   3   0.55
4   1   0.21
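To make the example reproducible, the table above can be read into a data frame. This is a minimal sketch that assumes the data frame is called dat, the name used in the plotting code in this thread.

```r
# Recreate the example table as a data frame named `dat`
# (the name `dat` is taken from the plotting calls in the question)
dat <- read.table(header = TRUE, text = "
Chr Bin Value
1   1   0.40
1   2   0.12
1   3   0.45
1   4   0.67
2   1   0.32
2   2   0.12
3   1   0.22
3   2   0.44
3   3   0.55
4   1   0.21
")

# One row per bin; chromosomes contribute different numbers of rows
str(dat)
```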

The important thing to note is that the chromosomes ("Chr" above) have different lengths, thus they have a different number of bins. I can represent this correctly with the following:

ggplot(dat, aes(Chr)) + coord_flip() + geom_bar()

This simply shows a barplot of the number of bins. What I would like to do is fill the bars with the continuous value, and this is where I'm finding trouble. My attempt:

ggplot(dat, aes(x=Chr, y=Value, color=Value)) + coord_flip() + theme_bw() + geom_bar(stat="identity") + scale_color_gradient("Value", low = "white", high = "black")

This produces the kind of plot I am going for, but the y-axis is now incorrect and is not showing the count of bins (I guess because I am mapping the value to y?). How can I plot the count of bins on the y-axis and fill with a continuous variable without messing up the scale?
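The guess about the axis can be checked numerically. With stat="identity" and y=Value, the bars for each chromosome are stacked, so each bar's total length is the sum of Value rather than the number of bins. A small sketch, assuming the example data above is stored in dat (bar_length and bin_count are illustrative names):

```r
# Example data from the question, assumed to be stored in `dat`
dat <- data.frame(
  Chr   = c(1, 1, 1, 1, 2, 2, 3, 3, 3, 4),
  Bin   = c(1, 2, 3, 4, 1, 2, 1, 2, 3, 1),
  Value = c(0.40, 0.12, 0.45, 0.67, 0.32, 0.12, 0.22, 0.44, 0.55, 0.21)
)

# Total bar length per chromosome when y = Value is stacked
bar_length <- tapply(dat$Value, dat$Chr, sum)

# Number of bins per chromosome, which is what the axis should show
bin_count <- table(dat$Chr)

print(bar_length)
print(bin_count)
```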

If I understand your question, I think you want a fill aesthetic rather than color. Also, based on your comment, it looks like Bin is a bin ID rather than a count of bins for each value of Chr. So, in the code below, we create a new variable bin.count equal to 1 for every row and use y=bin.count, so that every bin is counted once and the length of each bar equals the number of bins. A second variable, bin.pos, gives the midpoint of each unit-length segment; I've used it to add text labels within each bar in case you'd like to label them.

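library(ggplot2)
library(dplyr)

# bin.count = 1 per row, so the stacked bar for each Chr sums to its bin count
# bin.pos = midpoint of each unit-length segment, used to position the labels
ggplot(dat %>% mutate(bin.count=1) %>%
         group_by(Chr) %>%
         mutate(bin.pos = cumsum(bin.count) - 0.5*bin.count), 
       aes(x=Chr, y=bin.count, fill=Value)) + 
  coord_flip() + theme_bw() + 
  geom_bar(stat="identity") + 
  geom_text(aes(label=paste0("Bin ID: ", Bin), y=bin.pos), colour="white") +
  # fill (not color) shades the inside of each segment by Value
  scale_fill_gradient("Value", low = "white", high = "black", 
                      limits=c(0,max(dat$Value))) +
  scale_y_continuous(breaks=0:4) +
  labs(y="Bins")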
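To see what the two helper columns contain before they reach ggplot, the data-preparation step can be run on its own. This is a sketch on the example data from the question; plot_dat is an illustrative name that does not appear in the original code.

```r
library(dplyr)

# Example data from the question, assumed to be stored in `dat`
dat <- data.frame(
  Chr   = c(1, 1, 1, 1, 2, 2, 3, 3, 3, 4),
  Bin   = c(1, 2, 3, 4, 1, 2, 1, 2, 3, 1),
  Value = c(0.40, 0.12, 0.45, 0.67, 0.32, 0.12, 0.22, 0.44, 0.55, 0.21)
)

# Same preparation as in the plotting call, kept as a separate object
plot_dat <- dat %>%
  mutate(bin.count = 1) %>%                  # every bin contributes one unit
  group_by(Chr) %>%
  mutate(bin.pos = cumsum(bin.count) - 0.5 * bin.count) %>%  # segment midpoints
  ungroup()

print(plot_dat)
```

Within each chromosome bin.pos runs 0.5, 1.5, 2.5 and so on, and summing bin.count per Chr gives the number of bins, which is what the axis then shows.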

