
R ggplot stat_summary: how to include a count of NAs in legend?

I am trying to plot a discrete variable on the x-axis against a continuous one on the y-axis. Imagine that, in mtcars, I am plotting cyl vs. disp. What if some of the values of disp were NA? I would like to know how many NAs there are for each value of cyl, and to display the counts in a simple table, possibly right below the legend (or within the legend itself). Is there a simple (or a complicated) way to do this?

Similar and related question I posed: R - looking at means by subgroup and overall on a line graph

Thanks!

This answer does not meet all question requirements, but since the details on how exactly the data should be presented are a little vague, I'm posting anyway.

So here's a way to add NA counts to the legend itself:

library(ggplot2)
library(datasets)

mycars <- mtcars
mycars$disp[c(1,2,3)] <- NA   # introduce a few NAs for the demo

# one legend label per cyl level, with the NA count of disp appended
lvls <- levels(as.factor(mycars$cyl))
nacounts <- by(mycars, mycars$cyl, function(x) sum(is.na(x$disp)))
labels <- paste(lvls, " (NA=", as.integer(nacounts), ")", sep="")

ggplot(data=mycars) +
   geom_boxplot(aes(x=cyl, y=disp, fill=as.factor(cyl))) +
   scale_fill_discrete(name="Cyl", labels=labels)

Result
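For the "simple table" part of the question, one possible variation is to compute the counts with tapply and print them in the plot caption via labs(caption=). This is only a sketch: it assumes ggplot2 is installed, na_table and caption_text are illustrative names, and the caption is drawn under the plot panel rather than directly below the legend.

```r
library(ggplot2)

mycars <- mtcars
mycars$disp[c(1, 2, 3)] <- NA   # same artificial NAs as above

# Number of missing disp values for each value of cyl
na_table <- tapply(is.na(mycars$disp), mycars$cyl, sum)

# One caption line per cyl value, e.g. "cyl 4: 1 NA"
caption_text <- paste0("cyl ", names(na_table), ": ", na_table, " NA",
                       collapse = "\n")

p <- ggplot(mycars, aes(x = factor(cyl), y = disp, fill = factor(cyl))) +
  geom_boxplot(na.rm = TRUE) +   # na.rm = TRUE silences the removed-rows warning
  labs(x = "cyl", fill = "Cyl", caption = caption_text)
p
```

Keeping the counts in na_table also makes it easy to print them to the console or reuse them for legend labels.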

EDIT

Regarding the stat_summary graph referred to in the question: labels describing the line types can be added using the scale_linetype_* functions.
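As a rough illustration of the scale_linetype_* idea, the sketch below labels the line types with NA counts per level of vs. It assumes ggplot2 3.3.0 or later (for the fun argument of stat_summary); vs_na and vs_labels are names made up for this example.

```r
library(ggplot2)

mycars <- mtcars
mycars$disp[c(1, 2, 3)] <- NA

# NA counts of disp per level of vs, used as line-type legend labels
vs_na <- tapply(is.na(mycars$disp), mycars$vs, sum)
vs_labels <- paste0(names(vs_na), " (NA=", vs_na, ")")

p <- ggplot(mycars, aes(cyl, disp)) +
  stat_summary(aes(linetype = factor(vs)), fun = mean, geom = "line",
               na.rm = TRUE) +
  scale_linetype_discrete(name = "vs", labels = vs_labels)
p
```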

If you'd like to have the same legend as in the image above, I think you'll have to add graph elements describing cyl, e.g.:

# fun.y was deprecated in ggplot2 3.3.0; fun is its replacement
# `labels` is the vector of NA-count labels built above
ggplot(mycars, aes(cyl, disp)) +
  stat_summary(fun=mean, geom="line", lwd=1.5) +
  stat_summary(aes(lty=factor(vs)), fun=mean, geom="line") +
  stat_summary(aes(color=factor(cyl)), fun=mean, geom="point", size=5) +
  scale_x_continuous(breaks=c(4,6,8), labels=c("four","6","8")) +
  scale_color_discrete(labels=labels)

Plot with a point geom overlaid
