
How can a line be overlaid on a bar plot using ggplot2?

I'm looking for a way to plot a bar chart containing two different series, hide the bars for one of the series, and instead have a line (smooth if possible) pass through the tops of where the hidden bars would have been (similar to how one might overlay a frequency polygon on a histogram). I've tried the example below but appear to be running into two problems.

First, I need to summarize (total) the data by group, and second, I'd like to convert one of the series (df2) to a line.

df <- data.frame(grp=c("A","A","B","B","C","C"),val=c(1,1,2,2,3,3))  
df2 <- data.frame(grp=c("A","A","B","B","C","C"),val=c(1,4,3,5,1,2))  
ggplot(df, aes(x=grp, y=val)) +   
    geom_bar(stat="identity", alpha=0.75) +  
    geom_bar(data=df2, aes(x=grp, y=val), stat="identity", position="dodge")

You can get group totals in many ways. One of them is

with(df, tapply(val, grp, sum))
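
As one more option among those "many ways", here is a minimal sketch using base R's aggregate(), which returns the totals as a data frame rather than a named array. The data frame is the df from the question; the name totals is only an illustrative choice.

```r
# Question's first series
df <- data.frame(grp = c("A", "A", "B", "B", "C", "C"),
                 val = c(1, 1, 2, 2, 3, 3))

# Sum val within each level of grp; the result has columns grp and val
totals <- aggregate(val ~ grp, data = df, FUN = sum)
totals
```

A data frame result can be passed straight to the data argument of a ggplot2 layer, which is convenient for the plotting steps that follow.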

For simplicity, you can combine bar and line data into a single dataset.

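# In R >= 4.0, data.frame() keeps strings as character, so levels(df$grp)
# would be NULL; build the factor from the sorted unique group names instead.
df_all <- data.frame(grp = factor(sort(unique(df$grp))))
df_all$bar_heights <- as.vector(with(df, tapply(val, grp, sum)))
df_all$line_y <- as.vector(with(df2, tapply(val, grp, sum)))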

Bar charts use a categorical x-axis, and by default geom_line() does not connect points across categories. To overlay a line, convert the x values to numeric in the line layer.

ggplot(df_all) +
   geom_bar(aes(x = grp, weight = bar_heights)) +
   geom_line(aes(x = as.numeric(grp), y = line_y))
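
An alternative to the numeric conversion is to keep the categorical x and set group = 1 in the line layer, which tells ggplot2 to treat all points as one series. The sketch below is self-contained and rebuilds the combined data from the question's df and df2; it assumes ggplot2 is installed, and the object name p is only illustrative.

```r
library(ggplot2)

df  <- data.frame(grp = c("A", "A", "B", "B", "C", "C"), val = c(1, 1, 2, 2, 3, 3))
df2 <- data.frame(grp = c("A", "A", "B", "B", "C", "C"), val = c(1, 4, 3, 5, 1, 2))

# One row per group: bar heights from df, line heights from df2
df_all <- data.frame(grp = factor(sort(unique(df$grp))))
df_all$bar_heights <- as.vector(tapply(df$val, df$grp, sum))
df_all$line_y <- as.vector(tapply(df2$val, df2$grp, sum))

p <- ggplot(df_all, aes(x = grp)) +
  geom_bar(aes(weight = bar_heights), alpha = 0.75) +
  # group = 1 joins the points across the discrete categories
  geom_line(aes(y = line_y, group = 1), colour = "red")
print(p)
```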


Perhaps your sample data aren't representative of the real data you are working with, but as given there are no lines to be drawn for df2. Here's a modified version of your df2 with enough data points to construct lines:

df <- data.frame(grp=c("A","A","B","B","C","C"),val=c(1,2,3,1,2,3))
df2 <- data.frame(grp=c("A","A","B","B","C","C"),val=c(1,4,3,5,0,2))

p <- ggplot(df, aes(x=grp, y=val)) 
p <- p + geom_bar(stat="identity", alpha=0.75) 

p + geom_line(data=df2, aes(x=grp, y=val), colour="blue")

Alternatively, if your example data above is correct, you can plot this information as points with geom_point(data = df2, aes(x = grp, y = val), colour = "red", size = 6). You can change the colour and size to your liking.

EDIT: In response to comment

I'm not entirely sure what a frequency polygon over a histogram is supposed to look like here. Are the x-values supposed to be connected to one another? Secondly, you keep referring to wanting lines, but your code uses geom_bar(), which I assume isn't what you want. If you want lines, use geom_line(). If the two assumptions above are correct, then here's an approach to do that:

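# First, summarise df2 by group. ddply() comes from the plyr package;
# the totals shown are for the df2 from the question.
library(plyr)
df3 <- ddply(df2, .(grp), summarise, total = sum(val))
df3
#   grp total
# 1   A     5
# 2   B     8
# 3   C     3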

# Second, plot df3 as a line while treating the grp variable as numeric

p <- ggplot(df, aes(x=grp, y=val))
p <- p + geom_bar(alpha=0.75, stat = "identity") 
# factor() first, so the conversion also works when grp is a character column
p + geom_line(data=df3, aes(x=as.numeric(factor(grp)), y=total), colour = "red")
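
The question also asked for a smooth line if possible. One way to sketch that, not part of the answer above, is to interpolate the group totals with base R's spline() and draw the interpolated points with geom_line(). This assumes ggplot2 is installed and uses the df and df2 from the question; totals, smooth_df, and p are illustrative names, and the choice of a natural spline with 100 points is arbitrary.

```r
library(ggplot2)

df  <- data.frame(grp = c("A", "A", "B", "B", "C", "C"), val = c(1, 1, 2, 2, 3, 3))
df2 <- data.frame(grp = c("A", "A", "B", "B", "C", "C"), val = c(1, 4, 3, 5, 1, 2))

# Group totals for the line series
totals <- tapply(df2$val, df2$grp, sum)

# Interpolate through the totals at x = 1, 2, 3, the positions ggplot2
# gives to the discrete levels A, B, C; spline() returns a list with x and y
smooth_df <- as.data.frame(
  spline(x = seq_along(totals), y = as.vector(totals), n = 100, method = "natural")
)

p <- ggplot(df, aes(x = grp, y = val)) +
  geom_bar(stat = "identity", alpha = 0.75) +
  geom_line(data = smooth_df, aes(x = x, y = y), colour = "red")
print(p)
```

With only three groups the curve is determined entirely by the interpolation method, so it can rise above or dip below the totals between categories; whether that is acceptable depends on how the chart will be read.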

