简体   繁体   中英

Restrict the derivatives of a quadratic fit in R

I need to fit a quadratic model to some data to predict a value (var.d) based on 3 variables (var.a,var.b,var.c). The amount of data is restricted, making it unfeasable to create a reasonable fit. But there are some reasonable assumptions that can be made which would allow at least a plausible fit:

  • Increasing var.a --> Increasing var.d
  • Increasing var.b --> Decreasing var.d
  • Increasing var.c --> Decreasing var.d

Mathematically, I can state: The partial derivate of var.a (over a defined range of var.a) must be positive. Negative for the two other variables.

The question is: How do I implement this into the lm function of R. Only thing I came up with is to do a "manual" fit-function coefficient-optimization with the calculation of the partial derivatives each time and excluding all solutions, where they do not meet my requirements...

Any smarter solution would be highly appreciated.

Below you find an example code which shows you some data and a non-restricted lm fit:

  # Get example data
df<-structure(list(var.a = c(21.2, 21.2, 21.2, 15.3, 12.4, 9.4, 
15.3, 5, 5, 12.4, 15.3, 5, 9.4, 12.4, 9.4), var.b = c(22.6, 23.8, 24.8,
25.1, 26.3, 27.5, 27.4, 36.8, 29.1, 29.1, 29.6, 33.1, 30.8, 31.7, 
33.8), var.c = c(1.65, 2.03, 2.74, 2.84, 3.4, 3.86, 4, 4.12, 
4.48, 4.88, 5.23, 5.41, 5.8, 6.42, 6.84), var.d = c(1.177, 1.196, 
1.234, 1.196, 1.179, 1.186, 1.209, 0.784, 1.116, 1.197, 1.212, 0.94, 1.149, 1.187, 1.109)),
.Names = c("var.a", "var.b", "var.c","var.d"), row.names = c(NA, -15L), class = "data.frame")

# Plot original data
library(plot3D)
par(mfrow = c(1, 1))
with(quakes, scatter3D(#x = long, y = lat, z = -depth, colvar = mag,
    x = df$var.a, y = df$var.b, z = df$var.c, colvar = df$var.d,
    type="h",
    pch = 16, cex = 1.5, 
    xlab = "df$var.a", ylab = "df$var.b",zlab = "df$var.c", clab = c("var.d"),
    xlim=c(5,21.2),ylim=c(22.6,36.8),zlim=c(0,10),
    ticktype = "detailed",
    theta = 25, d = 10,
    colkey = list(length = 0.5, width = 0.5, cex.clab = 0.75))
)

# Create a quadratic fit
# fit.df <- lm(df$var.d ~ poly(cbind(df$var.a, df$var.b, df$var.c),2, raw=T), data=df)
# This works fine, but "predict" further down the line does not like NA coefficients (different issue), that's why I translated the poly fit results to a standard lm...
fit.df <- lm(var.d ~ I(var.a^2)+I(var.a) + 
                    I(var.b^2) + I(var.b) +
                    I(var.a*var.b)+
                    I(var.a*var.c)+
                    I(var.b*var.c)+
                    I(var.c)+I(var.c^2), data=df)


# Check the model
summary(fit.df)

# Create grid for testing of partial derivatives
var.a.test<-c(5,9.4,12.4,15.3,21.2)
var.b.test<-rev(c(36.8,33.8,31.7,29.6,24.8))
var.c.test<-seq(1,10,1)
df.test<-expand.grid(var.a.test,var.b.test,var.c.test)
names(df.test)<-c("var.a","var.b","var.c")

# Predict the new var.d with the model
df.test$var.d.fit<-predict(fit.df, newdata=df.test)

# Plot the results over the grid
with(quakes, scatter3D(x = df.test$var.a, y = df.test$var.b, z = df.test$var.c, colvar = df.test$var.d,
    type="h",
    pch = 16, cex = 1.5, 
    xlab = "df.test$var.a", ylab = "df.test$var.b",zlab = "df.test$var.c", clab = "var.d",
    xlim=c(5,21.2),ylim=c(22.6,36.8),zlim=c(0,10),
    ticktype = "detailed",
    theta = 25, d = 10,
    colkey = list(length = 0.5, width = 0.5, cex.clab = 0.75))
)

# Result is not consistent with my restrictions
# No surprise, because I did not implement them yet

My idea is to use quadratic programming instead of "lm".

library(quadprog)

M <- matrix( c( rep(1,nrow(df)),
                df$var.a * df$var.a,
                df$var.b * df$var.b,
                df$var.c * df$var.c,
                df$var.a,
                df$var.b,
                df$var.c,
                df$var.a * df$var.b,
                df$var.b * df$var.c,
                df$var.c * df$var.a ),
             nrow(df),
             10        )

A <- rbind(  matrix( c( 2*df$var.a,
                       rep(1,nrow(df)),
                       df$var.b,
                       df$var.c  ),
                     nrow(df), 4      ),
            -matrix( c( 2*df$var.b,
                        rep(1,nrow(df)),
                        df$var.a,
                        df$var.c  ),
                     nrow(df), 4      ),
            -matrix( c( 2*df$var.c,
                        rep(1,nrow(df)),
                        df$var.b,
                        df$var.a  ),
                     nrow(df), 4      )  )

Aind <- rbind( t(matrix(rep(c(4,2,5,8,10),nrow(df)),5,nrow(df))),
               t(matrix(rep(c(4,3,6,8, 9),nrow(df)),5,nrow(df))),
               t(matrix(rep(c(4,4,7,9,10),nrow(df)),5,nrow(df)))  )

bvec <- rep(0,3*nrow(df))

qp <- solve.QP.compact( Dmat = t(M)%*%M,
                        dvec = t(M) %*% df$var.d,
                        Amat = t(A),
                        Aind = t(Aind),
                        bvec = bvec,
                        meq=0,
                        factorized=FALSE)

d_approx <- function(abc,x)
{
  x[1] + x[2]*abc[1]^2 + x[3]*abc[2]^2 + x[4]*abc[3]^2 + x[5]*abc[1] + x[6]*abc[2] + x[7]*abc[3] + x[8]*abc[1]*abc[2] + x[9]*abc[2]*abc[3] + x[10]*abc[3]*abc[1]
}

If "x" is the vector of coefficients we are looking for, the approximation of "d" can be written as

d_approx = x[1] + x[2]*a^2 + x[3]*b^2 + x[4]*c^2 + x[5]*a + x[6]*b + x[7]*c + x[8]*ab + x[9]*bc + x[10]*ca.

Let M be the matrix with columns 1,a^2,b^2,c^2,a,b,c,a*b,b*c,c*a . Then M %*% d is the above approximation of "d", so

t(x) %*% t(M) %*% M %*% x - 2*t(d) %*% M %*% x + t(d)%*%d

is the squared error sum. The function "solve.QP.compact" from the "quadprog" package can calculate the vector "x" that minimizes this squared error sum under additional linear constraints, written in the form A %*% x >= 0 . The partial derivative of d_approx with respect to "a" is linear in "x": 2*x[2]*a + x[5] + x[8]*b + x[10]*c . Hence the claim that this partial derivetive be positive is a feasible contraint, and similar for the other partial derivatives.

"solve.QP.compact" calculates the constraint solution as well as the unconstrained solution. The latter agrees with the result of "lp":

> qp
$solution
 [1]  2.169440e+00  1.364438e-04 -2.515349e-17 -7.111370e-18 -4.630430e-02
 [6] -3.800905e-02  8.417128e-16  1.792880e-03 -2.071742e-17 -1.688799e-17

$value
[1] -9.797515

$unconstrained.solution
 [1]  8.9121756902  0.0009357732  0.0072692251 -0.0037239683 -0.2392086131
 [6] -0.5145628191  0.6099481094  0.0099390059 -0.0123152238 -0.0133647655

$iterations
[1] 8 0

$Lagrangian
 [1] 0.87345970 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000
 [7] 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000
[13] 0.00000000 0.00000000 0.00000000 0.51565123 0.00000000 0.26065656
[19] 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000
[25] 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000
[31] 0.30858510 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000
[37] 0.00000000 0.18092816 0.01762177 0.00000000 0.00000000 0.00000000
[43] 0.00000000 0.00000000 0.34055654

$iact
[1] 39 45 16 38 31  1 18

> 

Let's see if the constraints are violated:

> d_test_approx <- d_approx(df.test[,1:3],qp$solution)

> n1 <- which(diff(df.test[,1])>0 & diff(df.test[,2])==0 & diff(df.test[,3])==0)

> which( d_test_approx[n1+1,] - d_test_approx[n1,] < 0 )
integer(0)

> n2 <- which(diff(df.test[,1])==0 & diff(df.test[,2])>0 & diff(df.test[,3])==0)

> which( d_test_approx[n2+1,] - d_test_approx[n2,] > 0 )
integer(0)

> n3 <- which(diff(df.test[,1])==0 & diff(df.test[,2])==0 & diff(df.test[,3])>0)

> which( d_test_approx[n3+1,] - d_test_approx[n3,] > 0 )
integer(0)
> 

I'm almost sure that my formulas are full of miscalculations, but the idea to use quadratic programming is perhaps worth trying.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM