R ggplot2: using stat_summary (mean) and logarithmic scale

I have a bunch of measurements over time and I want to plot them in R. Here is a sample of my data. I've got 6 measurements for each of 4 time points:

values <- c(1012.0, 1644.9,  837.0, 1200.9, 1652.0,  981.5,   # t = 0
            2236.9, 1697.5, 2087.7, 1500.8, 2789.3, 1502.9,   # t = 12
            2051.3, 3070.7, 3105.4, 2692.5, 1488.5, 1978.1,   # t = 24
            1925.4, 1524.3, 2772.0, 1355.3, 2632.4, 2600.1)   # t = 72
time <- factor(rep(c(0, 12, 24, 72), c(6, 6, 6, 6)))

The scale of these data is arbitrary, and in fact I'm going to normalize it so that the average of t=0 is 1.

norm <- values / mean (values[time == 0])

So far so good. Using ggplot2, I plot both the individual points and a line through the mean at each time point:

require (ggplot2)
p <- ggplot(data = data.frame(time, norm), mapping = aes (x = time, y = norm)) +
    stat_summary (fun.y = mean, geom="line", mapping = aes (group = 1)) +
    geom_point()

However, now I want to apply a logarithmic scale, and this is where my trouble starts. When I do:

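q <- ggplot(data = data.frame(time, norm), mapping = aes (x = time, y = norm)) +
    stat_summary (fun.y = mean, geom="line", mapping = aes (group = 1)) +
    geom_point() + 
    # scale_y_log2() exists only in old ggplot2 releases;
    # in later releases, use scale_y_continuous(trans = "log2")
    scale_y_log2()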

The line does NOT pass through 0 at t=0, as I would expect given that log (1) == 0. Instead, it crosses the y-axis slightly below 0. Apparently ggplot takes the mean after the log transformation, which gives a different result (the mean of the logs rather than the log of the mean). I want it to take the mean before the log transformation.
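The discrepancy can be checked numerically without ggplot. A minimal sketch in base R, using only the six t = 0 measurements from the sample data (the names t0, norm0, log_of_mean and mean_of_logs are introduced here for illustration):

```r
# The six t = 0 measurements from the sample data
t0 <- c(1012.0, 1644.9, 837.0, 1200.9, 1652.0, 981.5)
norm0 <- t0 / mean(t0)             # normalized so the arithmetic mean is 1

log_of_mean  <- log2(mean(norm0))  # mean first, then log: log2(1) = 0
mean_of_logs <- mean(log2(norm0))  # log first, then mean: what a log scale summarizes

print(c(log_of_mean = log_of_mean, mean_of_logs = mean_of_logs))
# mean_of_logs is negative: averaging logs gives the log of the geometric
# mean, which never exceeds the arithmetic mean
```

The gap between the two numbers is the offset visible in the plot at t=0.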

How can I tell ggplot to apply the mean first? Is there a better way to create this chart?

scale_y_log2() will do the transformation first and then calculate the geoms.

coord_trans() will do the opposite: calculate the geoms first, and then transform the axis.

So you need coord_trans(ytrans = "log2") instead of scale_y_log2(). In current ggplot2 releases the argument is named y, so the call is coord_trans(y = "log2").
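Put together with the question's data, the coord_trans() approach might look like the sketch below. It assumes ggplot2 3.3.0 or later, where the stat_summary() argument is called fun rather than fun.y; the data frame name d and the helper variable line_data are introduced here for illustration.

```r
library(ggplot2)

values <- c(1012.0, 1644.9, 837.0, 1200.9, 1652.0, 981.5,
            2236.9, 1697.5, 2087.7, 1500.8, 2789.3, 1502.9,
            2051.3, 3070.7, 3105.4, 2692.5, 1488.5, 1978.1,
            1925.4, 1524.3, 2772.0, 1355.3, 2632.4, 2600.1)
time <- factor(rep(c(0, 12, 24, 72), each = 6))
d <- data.frame(time, norm = values / mean(values[time == 0]))

p <- ggplot(d, aes(x = time, y = norm)) +
    stat_summary(fun = mean, geom = "line", mapping = aes(group = 1)) +
    geom_point() +
    coord_trans(y = "log2")   # only the axis is transformed, after the means are taken

# Inspect what the summary layer computed (values are on the original scale)
line_data <- layer_data(p, 1)
print(line_data[, c("x", "y")])
```

Because the mean is taken on the untransformed values, the line's y value at the first time point is 1, which the log2 axis draws at log2(1) = 0.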

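A workaround, if you don't want to use coord_trans() and still want to transform the data, is to write a summary function that back-transforms the values, takes the mean, and transforms the result again. The base has to match the scale: the version below assumes a log10 scale such as scale_y_log10(); for a log2 scale, use log2() and 2 ^ x instead.

f1 <- function(x) {
  # x arrives already log10-transformed by the scale
  log10(mean(10 ^ x))
}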

stat_summary (fun.y = f1, geom="line", mapping = aes (group = 1))
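A fuller sketch of this workaround with the question's data follows. It pairs f1 with scale_y_log10() so that the base matches, and it assumes ggplot2 3.3.0 or later (fun instead of fun.y); the names d and line_data are introduced here for illustration.

```r
library(ggplot2)

values <- c(1012.0, 1644.9, 837.0, 1200.9, 1652.0, 981.5,
            2236.9, 1697.5, 2087.7, 1500.8, 2789.3, 1502.9,
            2051.3, 3070.7, 3105.4, 2692.5, 1488.5, 1978.1,
            1925.4, 1524.3, 2772.0, 1355.3, 2632.4, 2600.1)
time <- factor(rep(c(0, 12, 24, 72), each = 6))
d <- data.frame(time, norm = values / mean(values[time == 0]))

# Undo the log10 applied by the scale, average on the original scale,
# then re-apply log10 so the result lives on the transformed scale again
f1 <- function(x) log10(mean(10 ^ x))

q <- ggplot(d, aes(x = time, y = norm)) +
    stat_summary(fun = f1, geom = "line", mapping = aes(group = 1)) +
    geom_point() +
    scale_y_log10()

# The summary layer's y values are in log10 units here
line_data <- layer_data(q, 1)
print(line_data[, c("x", "y")])
```

With this summary function the line's value at the first time point is log10(1) = 0 up to floating-point error, which is what the question asked for.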

The best solution I found for this issue was to combine coord_trans() with scale_y_continuous(breaks = breaks).

As previously suggested, coord_trans() scales the axis without transforming the data; however, it leaves you with an ugly axis.

Setting the limits in coord_trans() works for some things, but if you want the axis to show specific labels, add scale_y_continuous() with the breaks you'd like.

coord_trans(y = 'log10') +
scale_y_continuous(breaks = breaks)
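The answer leaves breaks undefined. A sketch with the question's data might look like the following; the break values are hypothetical, chosen only because they fall inside the range of the normalized data, and the names d and line_data are likewise introduced for illustration. It assumes ggplot2 3.3.0 or later (fun instead of fun.y).

```r
library(ggplot2)

values <- c(1012.0, 1644.9, 837.0, 1200.9, 1652.0, 981.5,
            2236.9, 1697.5, 2087.7, 1500.8, 2789.3, 1502.9,
            2051.3, 3070.7, 3105.4, 2692.5, 1488.5, 1978.1,
            1925.4, 1524.3, 2772.0, 1355.3, 2632.4, 2600.1)
time <- factor(rep(c(0, 12, 24, 72), each = 6))
d <- data.frame(time, norm = values / mean(values[time == 0]))

# Hypothetical break positions, given in original (untransformed) units
breaks <- c(0.75, 1, 1.5, 2, 2.5)

p <- ggplot(d, aes(x = time, y = norm)) +
    stat_summary(fun = mean, geom = "line", mapping = aes(group = 1)) +
    geom_point() +
    coord_trans(y = "log10") +              # log axis, data left untransformed
    scale_y_continuous(breaks = breaks)     # explicit tick positions

# The means are still computed on the original scale
line_data <- layer_data(p, 1)
print(line_data[, c("x", "y")])
```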
