简体   繁体   中英

How to plot R linear model lm() using scatterplot3d via $plane3d function

Stackoverflow community,

I am looking to plot the results of R's lm() as a plane in a 3d graph made with the scatterplot3d() command from the R package scatterplot3d. I keep getting multiple errors, depending on my method of trying to graph via the $plane3d() function.

First, some reproducible data - Step 1: making the data-frame

elem <- data.frame(pH = c(8.12, 8.19, 6.09, 5.99, 5.18, 6.09, 5.40, 5.50, 4.93, 5.16, 7.57, 7.21, 5.13, 6.23, 5.72), 
              water_Loss = c(0.010, 0.005, 0.065, 0.120, 0.250, 0.305, 0.100, 0.020, 0.430, 0.060, 0.065, 0.050, 0.025, 0.050, 0.020), 
              elev = c(2397, 2393, 2593, 2599, 2741, 2774, 2979, 2787, 3173, 3370, 2147, 2130, 2374, 2359, 2643), 
              co2 = c(1.8410, 1.9810, 2.0110, 1.8960, 1.3060, 2.0160, 1.7360, 1.5860, 1.6360, 1.9665, 1.6360, 1.7660, 1.9760, 2.7510, 1.3310))

Step 2 - fitting the linear model

lms <- lm(elem$co2 ~ elem$pH + elem$water_Loss + elem$elev + I(elem$pH * elem$water_Loss * elem$elev))
summary(lms)

To note: there aren't lms$model$x and lms$model$y parameters on the results of this linear model

Step 3 - making the 3d graph

library(scatterplot3d)

s3d <- scatterplot3d(elem[, -4], pch = 19, type = "p", grid = TRUE, box = FALSE, angle = 55)

To graph the scatterplot graph + lm() result, the s3d$plane3d() is run directly after plotting the graph first.

Like this:

s3d <- scatterplot3d(elem[, -4], pch = 19, type = "p", grid = TRUE, box = FALSE, angle = 55) s3d$plane3d()

However, moving forward, I will only indicate the s3d$plane3d() portion.

This is where the issue comes in. I will highlight 3 different ways I have tried to get the linear model to display on this graph

Attempt 1: Plotting the results of lms directly

s3d$plane3d(lms, draw_polygon = TRUE, draw_lines = TRUE)

Which produces the following error:

Error in xy.coords(x, y, setLab = FALSE) : 'x' and 'y' lengths differ

Looking into fixing this, I went here: 'x' and 'y' lengths differ ERROR when plotting

Like the post suggested, I used the flag lm(x = TRUE, y = TRUE) but I still had no such parameters in the lm() results to be able to check the length() and this did not work.

Attempt 2: Specifying the x,y,z coordinates with intercept in the scatterplot

I was following this suggestion to do so: R - Extending Linear Model beyond scatterplot3d

For the intercept flag, I used the following code: lms$coefficients and took the value under (Intercept).

s3d$plane3d(xyz.coords(x = as.numeric(lms$model$`elem$pH`), y = as.numeric(lms$model$`elem$water_Loss`), z = as.numeric(lms$model$`elem$elev`)), Intercept = 3.010033e+00  )

Which produces the following error:

Error in x.coef * x.scal : non-numeric argument to binary operator

Attempt 3: Plotting individual coefficients & attempting to draw polygons and lines

I tried it this way after reading the documentation in R for the planes3d() command from {rgl}

Getting the coefficients:

coefs <- coef(lms) ; coefs
s3d$plane3d(a = -5.901006e-02 , b = -1.546285e+01, c = -2.946729e-04, Intercept = 3.010033e+00)

Which produces the following error:

Error in x.coef * x.scal : non-numeric argument to binary operator

I also tried to add the flags draw_polygon = TRUE, draw_lines = TRUE to the above command, which only gave yet another error - bottom line - did not work.

At this point, I am at a complete loss (I've tried many other methods - which I can't post them all). I would like to ask help in trying to pin-point what exactly I am missing to plot this plane on this graph. Any and all help will be very much appreciated.

Thank you.

scatterplot3d() will not be able to plot models with larger dimensionality (than 2 input dimensions and 1 output dimension) in 3D. In fact, such a plot would not be valid since the values in the additional dimensions will presumably be different for the different observations. They would therefore influence how closely the model fits and a plot that neglects these would be misleading.

That said, s3d$plane3d does not handle malformed input very well. For instance, if the dimensionality of the model is not as expected, it will return confusing error messages (as you have seen). There is also no help for this function and in fact the function is nested in another function in the package and has no comments. As a result this will all be fairly difficult to understand, but if you want to go deeper you have to read the code of the package, which you can find here .

You can absolutely have your plot show a partial regression surface, but you have to tell plot3d, which dimensions you want. Essentially you'd be plotting a plane in 3d space where you should have a hyperplane in higher dimensional space.

Your attempt 2 was on the right track. But you do not hand over the right argument. The function wants x.coef and y.coef etc. but not xyz.coords and therefore it apparently tries to interpret the vectors you hand over as coefficient and fails. You could do this instead:

s3d$plane3d(Intercept=lms$coefficients["(Intercept)"][[1]],
            x.coef=lms$coefficients["elem$pH"][[1]],
            y.coef=lms$coefficients["elem$water_Loss"][[1]],
            draw_polygon = TRUE, 
            draw_lines = TRUE, 
            polygon_args = list(col = rgb(0.8, 0.8, 0.8, 0.8)))

However, it is unlikely, however, that you will even see the regression surface in your plot, because the influence of the dimensions you do not plot will shift it out of the visible area of your figure. If you want to pull it back by force, you have to modify your intercept:

average_intercept <- lms$coefficients["(Intercept)"][[1]] + lms$coefficients["elem$elev"][[1]] * mean(elem$elev)
s3d$plane3d(Intercept=average_intercept,
            x.coef=lms$coefficients["elem$pH"][[1]],
            y.coef=lms$coefficients["elem$water_Loss"][[1]],
            draw_polygon = TRUE, 
            draw_lines = TRUE, 
            polygon_args = list(col = rgb(0.8, 0.8, 0.8, 0.8)))

But the plane you see is actually only a 2d-slice through the 3d-surface that is your regression and accurately only represents the observations you that happen to have exactly the average value in that third dimension ( elev in your case).

In fact, this is exactly what you would get if you ran the regression without the additional dimension(s); so you might as well do and plot that.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM