
Orthogonal Linear Regression (total least squares) fit, get RMSE and R-squared in R

I am trying to fit a model that linearly relates two variables in R. I need an orthogonal linear regression (total least squares), so I'm trying the odregress() function from the pracma package, which performs orthogonal linear regression via PCA.

Here is some example data:

x <- c(1.0, 0.6, 1.2, 1.4, 0.2, 0.7, 1.0, 1.1, 0.8, 0.5, 0.6, 0.8, 1.1, 1.3, 0.9)
y <- c(0.5, 0.3, 0.7, 1.0, 0.2, 0.7, 0.7, 0.9, 1.2, 1.1, 0.8, 0.7, 0.6, 0.5, 0.8)

I'm able to fit the model and get the coefficient using:

odr <- odregress(y, x)
c <- odr$coeff

So, the model is defined by the following equation:

print(c)
[1]  0.65145762 -0.03328271

Y = 0.65145762*X - 0.03328271

Now I need to plot the fitted line and compute the RMSE and the R-squared. How can I do that?

plot(x, y)

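Here are two functions to compute the MSE and RMSE. Note that pracma's signature is odregress(x, y), with the predictor first, so with the call odregress(y, x) used below, odr$resid holds the residuals of x predicted from y. Swap the arguments to get the RMSE in y.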

library(pracma)

x <- c(1.0, 0.6, 1.2, 1.4, 0.2, 0.7, 1.0, 1.1, 0.8, 0.5, 0.6, 0.8, 1.1, 1.3, 0.9)
y <- c(0.5, 0.3, 0.7, 1.0, 0.2, 0.7, 0.7, 0.9, 1.2, 1.1, 0.8, 0.7, 0.6, 0.5, 0.8)

odr <- odregress(y, x)

mse_odreg <- function(object) mean(object$resid^2)
rmse_odreg <- function(object) sqrt(mse_odreg(object))

rmse_odreg(odr)
#> [1] 0.5307982

Created on 2023-01-10 with reprex v2.0.2
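For comparison, here is a minimal base-R sketch that computes both the vertical RMSE (residuals in y) and the orthogonal RMSE (perpendicular distances to the line). It does not use pracma: the line is taken from the first principal component returned by prcomp(), and the names rmse_vertical and rmse_orthogonal are illustrative, not part of any package.

```r
x <- c(1.0, 0.6, 1.2, 1.4, 0.2, 0.7, 1.0, 1.1, 0.8, 0.5, 0.6, 0.8, 1.1, 1.3, 0.9)
y <- c(0.5, 0.3, 0.7, 1.0, 0.2, 0.7, 0.7, 0.9, 1.2, 1.1, 0.8, 0.7, 0.6, 0.5, 0.8)

# The first principal component gives the direction of the TLS line.
v <- prcomp(cbind(x, y))$rotation
m <- v[2, 1] / v[1, 1]        # slope of y on x
n <- mean(y) - m * mean(x)    # intercept: the line passes through the centroid

# Vertical residuals, measured along the y axis.
res_vertical <- y - (m * x + n)

# Signed perpendicular distances to the line y = m*x + n.
dist_orthogonal <- res_vertical / sqrt(1 + m^2)

rmse_vertical   <- sqrt(mean(res_vertical^2))
rmse_orthogonal <- sqrt(mean(dist_orthogonal^2))
```

The orthogonal RMSE is always the vertical RMSE divided by sqrt(1 + m^2), so which one to report depends on whether the error is meant to be read in units of y or as a distance to the line.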


Edit

R^2 can be computed with the following function. Note that odr$ssq is not the sum of the squared residuals odr$resid; it is the sum of the squared errors odr$err, which are the perpendicular distances from the points to the fitted line.

r_squared_odreg <- function(object, y) {
  denom <- sum((y - mean(y))^2)
  1 - object$ssq/denom
}
r_squared_odreg(odr, y)
#> [1] 0.1494818

Created on 2023-01-10 with reprex v2.0.2
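The same quantity can be sketched in base R, assuming only prcomp(): the variance along the second principal component, multiplied by N - 1, is the sum of squared perpendicular distances to the TLS line. The names ssq_orth, r2 and explained are illustrative. The last line is an alternative, symmetric measure (the share of total variance captured by the first component), shown only as an option and not as the definition used above.

```r
x <- c(1.0, 0.6, 1.2, 1.4, 0.2, 0.7, 1.0, 1.1, 0.8, 0.5, 0.6, 0.8, 1.1, 1.3, 0.9)
y <- c(0.5, 0.3, 0.7, 1.0, 0.2, 0.7, 0.7, 0.9, 1.2, 1.1, 0.8, 0.7, 0.6, 0.5, 0.8)

pca <- prcomp(cbind(x, y))

# prcomp() reports standard deviations with an N - 1 divisor, so undo it
# to get the sum of squared perpendicular distances to the TLS line.
ssq_orth <- (length(x) - 1) * pca$sdev[2]^2

# R^2 as defined above: orthogonal sum of squares over total sum of squares in y.
r2 <- 1 - ssq_orth / sum((y - mean(y))^2)

# Alternative (assumption, not from the answer above): variance share of PC1.
explained <- pca$sdev[1]^2 / sum(pca$sdev^2)
```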

Here is an alternative way to solve an orthogonal linear regression (total least squares) via PCA, following what is explained in this post. It does the same thing as pracma::odregress.

x <- c(1.0, 0.6, 1.2, 1.4, 0.2, 0.7, 1.0, 1.1, 0.8, 0.5, 0.6, 0.8, 1.1, 1.3, 0.9)
y <- c(0.5, 0.3, 0.7, 1.0, 0.2, 0.7, 0.7, 0.9, 1.2, 1.1, 0.8, 0.7, 0.6, 0.5, 0.8)

In this case we perform a Principal Component Analysis using the prcomp() function.

v <- prcomp(cbind(x,y))$rotation

Then we calculate the slope (m) from the first principal component, and the intercept (n):

# Y = mX + n
m <- v[2,1]/v[1,1]
n <- mean(y) - (m*mean(x))

Our model is defined by: f <- function(x){(m*x) + n}
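As a cross-check, the two-variable TLS slope also has a closed form in terms of the centred sums of squares. The sketch below assumes the cross-product sxy is non-zero; sxx, syy, sxy, m_closed and n_closed are illustrative names.

```r
x <- c(1.0, 0.6, 1.2, 1.4, 0.2, 0.7, 1.0, 1.1, 0.8, 0.5, 0.6, 0.8, 1.1, 1.3, 0.9)
y <- c(0.5, 0.3, 0.7, 1.0, 0.2, 0.7, 0.7, 0.9, 1.2, 1.1, 0.8, 0.7, 0.6, 0.5, 0.8)

# Centred sums of squares and cross-products.
sxx <- sum((x - mean(x))^2)
syy <- sum((y - mean(y))^2)
sxy <- sum((x - mean(x)) * (y - mean(y)))

# Closed-form TLS slope (requires sxy != 0) and the matching intercept.
m_closed <- (syy - sxx + sqrt((syy - sxx)^2 + 4 * sxy^2)) / (2 * sxy)
n_closed <- mean(y) - m_closed * mean(x)

# Slope from the first principal component, for comparison.
v <- prcomp(cbind(x, y))$rotation
m_pca <- v[2, 1] / v[1, 1]
```

Both routes should agree to numerical precision, which is a quick way to confirm that the rows and columns of the rotation matrix were indexed correctly.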

We can plot it using:

plot(x, y)
abline(n, m, col="blue")

Finally we plot the Total Least Squares fit versus the Ordinary Least Squares fit.

plot(x, y)
abline(n, m, col="blue")
abline(lm(y~x), col="red")
legend("topleft", legend=c("TLS", "OLS"), col=c("blue", "red"), lty=1, bty="n")

[Figure: TLS vs. OLS]
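To make the difference visible, one option is to draw the perpendicular segments that TLS minimises. This is a base-R sketch: each point is projected onto the line through the centroid along the first principal direction. The names d, centre, proj and feet are illustrative, and the plot is only drawn in an interactive session.

```r
x <- c(1.0, 0.6, 1.2, 1.4, 0.2, 0.7, 1.0, 1.1, 0.8, 0.5, 0.6, 0.8, 1.1, 1.3, 0.9)
y <- c(0.5, 0.3, 0.7, 1.0, 0.2, 0.7, 0.7, 0.9, 1.2, 1.1, 0.8, 0.7, 0.6, 0.5, 0.8)

pts    <- cbind(x, y)
d      <- prcomp(pts)$rotation[, 1]   # unit direction of the TLS line
centre <- c(mean(x), mean(y))         # the line passes through the centroid

# Scalar projection of each centred point onto the line direction.
proj <- sweep(pts, 2, centre) %*% d

# Foot of the perpendicular from each point onto the line.
feet <- sweep(proj %*% t(d), 2, centre, "+")

if (interactive()) {
  slope <- d[2] / d[1]
  plot(x, y, asp = 1)                 # asp = 1 so right angles look right
  abline(centre[2] - slope * centre[1], slope, col = "blue")
  segments(x, y, feet[, 1], feet[, 2], col = "grey")
}
```

The sum of the squared segment lengths, sum((pts - feet)^2), is the quantity the TLS fit minimises.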

As you can see, we obtain the same results as with pracma::odregress. Note that the predictor goes first, odregress(x, y), to get the slope of y on x:

library(pracma)
odr <- odregress(x, y)
print(odr$coeff)
print(paste(round(m, digits=7), round(n, digits=7)))

[1] 0.5199081 0.2558142
[1] 0.5199081 0.2558142
