Extracting origin and slope values from facet-gridded ggplot2 scatter plots [R]

I've been helped by @Ista to produce the following code. From this code, I now need to save:

  • Q1: the average 'm' value at nR = 0 for each of the 8 scatter plots;
  • Q2: the value of the slope of the regression (shown here with stat_smooth) for only a subset of the data (let's say, all data from nR = 0 to nR = 50). As you can see in the example, the regression is computed for the whole dataset;
  • the data should be saved so that it can be reused for another visualisation with ggplot2. I'm not sure if this point matters, but I'm mentioning it just in case.


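library(ggplot2)
library(reshape2)      # melt(), colsplit(), dcast()
library(RColorBrewer)  # brewer.pal()

md = read.csv(file="http://dl.dropboxusercontent.com/u/73950/rob-136.csv", sep=",", header=TRUE)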

dM = melt(md,c("id"))
# split variable out into its components
dM <- cbind(dM,
            colsplit(dM$variable,
                     pattern = "_",
                     names = c("Nm", "order", "category")))
# no longer need variable, as it is represented by the combination of Nm, order, and category
dM$variable <- NULL
# rearrange putting category in the columns
dM <- dcast(dM, ... ~ category, value.var = "value")

# plot
p = ggplot(dM, aes(x=nR ,y=m))
p = p + scale_y_continuous(name="m") + scale_x_continuous(name="nR", limits=c(0,136))
p = p + facet_grid(order~Nm)+ ggtitle("Title")
p = p + stat_bin2d(bins=50)
myPalette <- colorRampPalette(rev(brewer.pal(11, "Spectral")))
p = p + scale_fill_gradientn(colours = myPalette(100))
p = p + theme(legend.position="none")
p = p + stat_smooth(method = 'lm')


I hope this is clear enough, but please let me know if I'm not making sense...


I would do it like this.

myPalette <- colorRampPalette(rev(brewer.pal(11, "Spectral")))

# plot
ggplot(dM, aes(x = nR, y = m)) +
  scale_y_continuous(name = "m") + 
  scale_x_continuous(name = "nR", limits = c(0, 136)) +
  facet_grid(order ~ Nm) + 
  ggtitle("Title") +
  stat_bin2d(bins = 50) +
  scale_fill_gradientn(colours = myPalette(100)) +
  theme(legend.position="none") +
  stat_smooth(method = "lm")
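
If the fitted line in the plot should only use part of the data (Q2, nR from 0 to 50), one option is to hand stat_smooth its own subset. The following is a minimal sketch, not the original plot: `demo` is made-up data that merely reuses the column names of dM, and geom_point stands in for the binned layer.

```r
library(ggplot2)

# Synthetic stand-in for dM (hypothetical values; same column names as dM)
set.seed(1)
demo <- expand.grid(nR = 0:136, Nm = c("FS", "N2"), order = c("GED", "NAR"))
demo$m <- 0.1 + 0.002 * demo$nR + rnorm(nrow(demo), sd = 0.05)

p <- ggplot(demo, aes(x = nR, y = m)) +
  geom_point(alpha = 0.3) +
  facet_grid(order ~ Nm) +
  # the smoother only sees rows with nR <= 50, so the line is fitted to that range
  stat_smooth(data = subset(demo, nR <= 50), method = "lm")

# layer_data() returns the values computed for a layer; layer 2 is the smoother
smooth_vals <- layer_data(p, 2)
```

The points are still drawn for the full nR range; only the regression layer is restricted.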

out <- by(data = dM, INDICES = list(dM$order, dM$Nm), FUN = function(x) {
  model <- lm(m ~ nR, data = x)
  # one row per panel: intercept, slope, and a label saying which panel it is
  data.frame(
    intercept = coef(model)["(Intercept)"], 
    nR = coef(model)["nR"], 
    nM = paste(unique(x$Nm), "/", unique(x$order), sep = "")
  )
})
do.call("rbind", out)

              intercept            nR     nM
(Intercept)  0.07883183  0.0190254723 FS/GED
(Intercept)1 0.11683689  0.0007879976 FS/NAR
(Intercept)2 0.14945769  0.0036112481 N2/GED
(Intercept)3 0.16427558  0.0017356170 N2/NAR
(Intercept)4 0.48951709  0.0025201291 SW/GED
(Intercept)5 0.51569334  0.0011353692 SW/NAR
(Intercept)6 0.65299500  0.0012065830 VE/GED
(Intercept)7 0.64931290 -0.0001557808 VE/NAR
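
The table above comes from fits on the whole nR range, and the intercept is a model estimate rather than the observed mean at nR = 0. For the two quantities asked for in the question, a rough sketch in base R could look like the following. The `demo` data frame is synthetic (only the column names match dM), and the names `m_at_0` and `slope_0_50` are made up for illustration.

```r
# Synthetic stand-in for dM (hypothetical values; same column names as dM)
set.seed(42)
demo <- expand.grid(rep = 1:5, nR = 0:136,
                    Nm = c("FS", "N2", "SW", "VE"),
                    order = c("GED", "NAR"),
                    stringsAsFactors = FALSE)
demo$m <- 0.1 + 0.002 * demo$nR + rnorm(nrow(demo), sd = 0.05)

# Q1: mean of m at nR == 0, one row per panel
origin <- aggregate(m ~ Nm + order, data = demo[demo$nR == 0, ], FUN = mean)
names(origin)[names(origin) == "m"] <- "m_at_0"

# Q2: slope of m ~ nR fitted only on rows with 0 <= nR <= 50
sub <- demo[demo$nR >= 0 & demo$nR <= 50, ]
slopes <- do.call(rbind, lapply(
  split(sub, list(sub$Nm, sub$order), drop = TRUE),
  function(x) {
    data.frame(Nm = x$Nm[1], order = x$order[1],
               slope_0_50 = unname(coef(lm(m ~ nR, data = x))["nR"]))
  }))

# one row per panel, with both values side by side
result <- merge(origin, slopes, by = c("Nm", "order"))
```

Because `result` is an ordinary data frame with Nm and order columns, it can be written out with write.csv() or passed straight to another ggplot2 call that facets on the same variables.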


Or, you can add a second-order (quadratic) term to your model formula and produce "curvy" curves.

ggplot(dM, aes(x = nR, y = m)) +
  scale_y_continuous(name = "m") + 
  scale_x_continuous(name = "nR", limits = c(0, 136)) +
  facet_grid(order ~ Nm) + 
  ggtitle("Title") +
  stat_bin2d(bins = 50) +
  scale_fill_gradientn(colours = myPalette(100)) +
  theme(legend.position="none") +
  stat_smooth(method = "lm", formula = y ~ poly(x, 2))

out <- by(data = dM, INDICES = list(dM$order, dM$Nm), FUN = function(x) {
  x <- na.omit(x)
  model <- lm(m ~ poly(nR, 2), data = x)
  # coef() alone drops the panel labels; add Nm and order explicitly,
  # as in the first example above
  coef(model)
})
do.call("rbind", out)

     (Intercept) poly(nR, 2)1 poly(nR, 2)2
[1,]   0.2137094    1.9243960   0.36091764
[2,]   0.1585034    1.7404097  -0.10095676
[3,]   0.2945133    5.2416769   1.54143778
[4,]   0.2569599    3.8638288   0.95676753
[5,]   0.5992618    4.0801451   1.38542308
[6,]   0.5752421    2.4440650   0.08027183
[7,]   0.7034230    1.8407400  -0.20495917
[8,]   0.6411417   -0.3364163  -0.02258523
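
One thing to keep in mind when reading these numbers: poly(nR, 2) uses an orthogonal basis by default, so the intercept here is the mean of m, not the value at nR = 0, and the other two coefficients are not in units of nR. If coefficients on the original scale are wanted, poly(..., raw = TRUE) is one way to get them. A small sketch on synthetic data (the values in `demo` are invented for the illustration):

```r
# Synthetic data (hypothetical values) with a known quadratic:
# m = 0.2 + 0.01 * nR - 0.00005 * nR^2 + noise
set.seed(7)
demo <- data.frame(nR = rep(0:136, times = 3))
demo$m <- 0.2 + 0.01 * demo$nR - 0.00005 * demo$nR^2 +
  rnorm(nrow(demo), sd = 0.02)

fit_orth <- lm(m ~ poly(nR, 2), data = demo)              # orthogonal basis (default)
fit_raw  <- lm(m ~ poly(nR, 2, raw = TRUE), data = demo)  # plain nR and nR^2

coef(fit_orth)  # intercept equals mean(m); other terms are not in nR units
coef(fit_raw)   # intercept is the fitted value at nR = 0

# both parameterisations describe the same curve
same_curve <- all.equal(fitted(fit_orth), fitted(fit_raw))
```

Both fits give the same curve, so the plot does not change; only the meaning of the reported coefficients does.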


