
ggplot2 - referencing summary statistics / layers

I've picked up the ggplot2 book, but I'm struggling to understand how data persists through layers.

For example, let's take a dataset and calculate the mean of the dependent variable at each X:

thePlot = ggplot( myDF , aes_string( x = "IndepentVar" , y = "DependentVar" ) )
thePlot = thePlot + stat_summary( fun.y = mean , geom = "point" )

How do I "access" the summary statistics in the next layer? For example, let's say I want to plot a smooth line over the dataset. This seems to work:

thePlot = thePlot + stat_smooth( aes( group = 1 ) , method = "lm" , geom = "smooth" , se = FALSE )

But let's say I also want to ignore a particular X value when generating the line. How do I reference the summarized dataset in order to exclude that X?

More generally, how is data referenced as it flows through layers? Am I always limited to the last statistics? Can I reference the original dataset?

Here is an attempt at answering your question:

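  1. The aesthetics defined in the ggplot() call are used as defaults in all subsequent layers unless a layer defines them explicitly. That is why the smooth layer works without restating x and y.
  2. You can specify the data frame and aesthetics for each layer separately. For example, to exclude some values of x when fitting geom_smooth, you can specify subset = .(x != xvalues) inside the geom_smooth call. Here .() is the quoting helper from plyr, and this layer-level subset argument is only accepted by older ggplot2 releases; otherwise, pass an already filtered data frame through the layer's data argument.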
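A minimal sketch of point 2, under some assumptions: myDF below is a made-up stand-in for the question's data frame, and excludedX is a hypothetical value to leave out. Each layer computes its statistic from the data it is given (by default the data frame passed to ggplot()), not from the output of the previous layer, so the line can be fitted on fewer points by handing that one layer a filtered data frame. The sketch uses base subset() on the layer's data argument rather than a layer-level subset argument.

```r
library(ggplot2)

# Hypothetical stand-in for the question's data: three observations per x value.
set.seed(1)
myDF <- data.frame(
  IndepentVar  = rep(1:5, each = 3),
  DependentVar = rep(1:5, each = 3) * 2 + rnorm(15)
)

# Hypothetical x value to leave out of the fitted line.
excludedX <- 3

# Aesthetics set in ggplot() are the defaults for every layer added below.
thePlot <- ggplot(myDF, aes(x = IndepentVar, y = DependentVar)) +
  # Layer 1: per-x means, computed from the full data frame.
  # (fun = needs ggplot2 >= 3.3.0; older versions use fun.y = as in the question.)
  stat_summary(fun = mean, geom = "point") +
  # Layer 2: linear fit computed from its own, filtered copy of the data.
  stat_smooth(
    data = subset(myDF, IndepentVar != excludedX),
    aes(group = 1), method = "lm", formula = y ~ x, se = FALSE
  )
```

Printing thePlot should draw the five means as points, with a line fitted only to the rows where IndepentVar is not 3. Note that the line is fitted to the raw rows, not to the plotted means.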

I can provide more detailed examples, if you have specific questions.
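As one such example, here is a sketch of how the summary statistics themselves can be inspected. It relies on ggplot2's layer_data(), which returns the data frame a given layer ends up with after its stat has run; myDF is again made-up data, and the index 1 assumes stat_summary is the first layer added.

```r
library(ggplot2)

# Hypothetical data: two observations for each of three x values.
myDF <- data.frame(
  IndepentVar  = c(1, 1, 2, 2, 3, 3),
  DependentVar = c(2, 4, 5, 7, 8, 10)
)

thePlot <- ggplot(myDF, aes(x = IndepentVar, y = DependentVar)) +
  stat_summary(fun = mean, geom = "point")

# Data as computed by layer 1 (the stat_summary layer): one row per x value,
# with the per-x mean in the y column.
summaryDF <- layer_data(thePlot, 1)

# The original, unsummarised data frame stays available on the plot object.
originalDF <- thePlot$data
```

Here summaryDF is an ordinary data frame, so it can be filtered and passed as the data argument of another layer if the line should be fitted to the means rather than to the raw rows.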

Hope this helps
