简体   繁体   English

限制R中二次拟合的导数

[英]Restrict the derivatives of a quadratic fit in R

I need to fit a quadratic model to some data to predict a value (var.d) based on 3 variables (var.a,var.b,var.c). 我需要对一些数据拟合二次模型,以基于3个变量(var.a,var.b,var.c)预测值(var.d)。 The amount of data is restricted, making it unfeasable to create a reasonable fit. 数据量受到限制,因此创建合理的拟合是不可行的。 But there are some reasonable assumptions that can be made which would allow at least a plausible fit: 但是,可以做出一些合理的假设,这些假设至少允许一个合理的拟合:

  • Increasing var.a --> Increasing var.d 增加变种a->增加变种d
  • Increasing var.b --> Decreasing var.d 增加var.b->减少var.d
  • Increasing var.c --> Decreasing var.d 增加var.c->减少var.d

Mathematically, I can state: The partial derivate of var.a (over a defined range of var.a) must be positive. 从数学上讲,我可以说:变量a的偏导数(在变量a的定义范围内)必须为正。 Negative for the two other variables. 其他两个变量均为负数。

The question is: How do I implement this into the lm function of R. Only thing I came up with is to do a "manual" fit-function coefficient-optimization with the calculation of the partial derivatives each time and excluding all solutions, where they do not meet my requirements... 问题是:如何将其实现到R的lm函数中。我想到的唯一事情是每次计算偏导数时都要进行“手动”拟合函数系数优化,并且排除所有解,其中他们不符合我的要求...

Any smarter solution would be highly appreciated. 任何更智能的解决方案将不胜感激。

Below you find an example code which shows you some data and a non-restricted lm fit: 在下面,您将找到一个示例代码,该示例代码向您显示了一些数据和非限制性lm拟合:

  # Get example data
df<-structure(list(var.a = c(21.2, 21.2, 21.2, 15.3, 12.4, 9.4, 
15.3, 5, 5, 12.4, 15.3, 5, 9.4, 12.4, 9.4), var.b = c(22.6, 23.8, 24.8,
25.1, 26.3, 27.5, 27.4, 36.8, 29.1, 29.1, 29.6, 33.1, 30.8, 31.7, 
33.8), var.c = c(1.65, 2.03, 2.74, 2.84, 3.4, 3.86, 4, 4.12, 
4.48, 4.88, 5.23, 5.41, 5.8, 6.42, 6.84), var.d = c(1.177, 1.196, 
1.234, 1.196, 1.179, 1.186, 1.209, 0.784, 1.116, 1.197, 1.212, 0.94, 1.149, 1.187, 1.109)),
.Names = c("var.a", "var.b", "var.c","var.d"), row.names = c(NA, -15L), class = "data.frame")

# Plot original data
library(plot3D)
par(mfrow = c(1, 1))
with(quakes, scatter3D(#x = long, y = lat, z = -depth, colvar = mag,
    x = df$var.a, y = df$var.b, z = df$var.c, colvar = df$var.d,
    type="h",
    pch = 16, cex = 1.5, 
    xlab = "df$var.a", ylab = "df$var.b",zlab = "df$var.c", clab = c("var.d"),
    xlim=c(5,21.2),ylim=c(22.6,36.8),zlim=c(0,10),
    ticktype = "detailed",
    theta = 25, d = 10,
    colkey = list(length = 0.5, width = 0.5, cex.clab = 0.75))
)

# Create a quadratic fit
# fit.df <- lm(df$var.d ~ poly(cbind(df$var.a, df$var.b, df$var.c),2, raw=T), data=df)
# This works fine, but "predict" further down the line does not like NA coefficients (different issue), that's why I translated the poly fit results to a standard lm...
fit.df <- lm(var.d ~ I(var.a^2)+I(var.a) + 
                    I(var.b^2) + I(var.b) +
                    I(var.a*var.b)+
                    I(var.a*var.c)+
                    I(var.b*var.c)+
                    I(var.c)+I(var.c^2), data=df)


# Check the model
summary(fit.df)

# Create grid for testing of partial derivatives
var.a.test<-c(5,9.4,12.4,15.3,21.2)
var.b.test<-rev(c(36.8,33.8,31.7,29.6,24.8))
var.c.test<-seq(1,10,1)
df.test<-expand.grid(var.a.test,var.b.test,var.c.test)
names(df.test)<-c("var.a","var.b","var.c")

# Predict the new var.d with the model
df.test$var.d.fit<-predict(fit.df, newdata=df.test)

# Plot the results over the grid
with(quakes, scatter3D(x = df.test$var.a, y = df.test$var.b, z = df.test$var.c, colvar = df.test$var.d,
    type="h",
    pch = 16, cex = 1.5, 
    xlab = "df.test$var.a", ylab = "df.test$var.b",zlab = "df.test$var.c", clab = "var.d",
    xlim=c(5,21.2),ylim=c(22.6,36.8),zlim=c(0,10),
    ticktype = "detailed",
    theta = 25, d = 10,
    colkey = list(length = 0.5, width = 0.5, cex.clab = 0.75))
)

# Result is not consistent with my restrictions
# No surprise, because I did not implement them yet

My idea is to use quadratic programming instead of "lm". 我的想法是使用二次编程而不是“ lm”。

library(quadprog)

M <- matrix( c( rep(1,nrow(df)),
                df$var.a * df$var.a,
                df$var.b * df$var.b,
                df$var.c * df$var.c,
                df$var.a,
                df$var.b,
                df$var.c,
                df$var.a * df$var.b,
                df$var.b * df$var.c,
                df$var.c * df$var.a ),
             nrow(df),
             10        )

A <- rbind(  matrix( c( 2*df$var.a,
                       rep(1,nrow(df)),
                       df$var.b,
                       df$var.c  ),
                     nrow(df), 4      ),
            -matrix( c( 2*df$var.b,
                        rep(1,nrow(df)),
                        df$var.a,
                        df$var.c  ),
                     nrow(df), 4      ),
            -matrix( c( 2*df$var.c,
                        rep(1,nrow(df)),
                        df$var.b,
                        df$var.a  ),
                     nrow(df), 4      )  )

Aind <- rbind( t(matrix(rep(c(4,2,5,8,10),nrow(df)),5,nrow(df))),
               t(matrix(rep(c(4,3,6,8, 9),nrow(df)),5,nrow(df))),
               t(matrix(rep(c(4,4,7,9,10),nrow(df)),5,nrow(df)))  )

bvec <- rep(0,3*nrow(df))

qp <- solve.QP.compact( Dmat = t(M)%*%M,
                        dvec = t(M) %*% df$var.d,
                        Amat = t(A),
                        Aind = t(Aind),
                        bvec = bvec,
                        meq=0,
                        factorized=FALSE)

d_approx <- function(abc,x)
{
  x[1] + x[2]*abc[1]^2 + x[3]*abc[2]^2 + x[4]*abc[3]^2 + x[5]*abc[1] + x[6]*abc[2] + x[7]*abc[3] + x[8]*abc[1]*abc[2] + x[9]*abc[2]*abc[3] + x[10]*abc[3]*abc[1]
}

If "x" is the vector of coefficients we are looking for, the approximation of "d" can be written as 如果“ x”是我们正在寻找的系数向量,则“ d”的近似值可以写为

d_approx = x[1] + x[2]*a^2 + x[3]*b^2 + x[4]*c^2 + x[5]*a + x[6]*b + x[7]*c + x[8]*ab + x[9]*bc + x[10]*ca. d_approx = x [1] + x [2] * a ^ 2 + x [3] * b ^ 2 + x [4] * c ^ 2 + x [5] * a + x [6] * b + x [ 7] * c + x [8] * ab + x [9] * bc + x [10] * ca。

Let M be the matrix with columns 1,a^2,b^2,c^2,a,b,c,a*b,b*c,c*a . M为具有列1,a^2,b^2,c^2,a,b,c,a*b,b*c,c*a的矩阵。 Then M %*% d is the above approximation of "d", so 那么M %*% d是上面的“ d”的近似值,所以

t(x) %*% t(M) %*% M %*% x - 2*t(d) %*% M %*% x + t(d)%*%d

is the squared error sum. 是误差平方和。 The function "solve.QP.compact" from the "quadprog" package can calculate the vector "x" that minimizes this squared error sum under additional linear constraints, written in the form A %*% x >= 0 . 来自“ quadprog”包的函数“ solve.QP.compact”可以计算向量“ x”,该向量在附加线性约束下以A %*% x >= 0的形式最小化该平方误差和。 The partial derivative of d_approx with respect to "a" is linear in "x": 2*x[2]*a + x[5] + x[8]*b + x[10]*c . d_approx关于“ a”的偏导数在“ x”中是线性的: 2*x[2]*a + x[5] + x[8]*b + x[10]*c Hence the claim that this partial derivetive be positive is a feasible contraint, and similar for the other partial derivatives. 因此,这种偏导数为正的说法是可行的约束,对于其他偏导数也是如此。

"solve.QP.compact" calculates the constraint solution as well as the unconstrained solution. “ solve.QP.compact”计算约束解以及无约束解。 The latter agrees with the result of "lp": 后者同意“ lp”的结果:

> qp
$solution
 [1]  2.169440e+00  1.364438e-04 -2.515349e-17 -7.111370e-18 -4.630430e-02
 [6] -3.800905e-02  8.417128e-16  1.792880e-03 -2.071742e-17 -1.688799e-17

$value
[1] -9.797515

$unconstrained.solution
 [1]  8.9121756902  0.0009357732  0.0072692251 -0.0037239683 -0.2392086131
 [6] -0.5145628191  0.6099481094  0.0099390059 -0.0123152238 -0.0133647655

$iterations
[1] 8 0

$Lagrangian
 [1] 0.87345970 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000
 [7] 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000
[13] 0.00000000 0.00000000 0.00000000 0.51565123 0.00000000 0.26065656
[19] 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000
[25] 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000
[31] 0.30858510 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000
[37] 0.00000000 0.18092816 0.01762177 0.00000000 0.00000000 0.00000000
[43] 0.00000000 0.00000000 0.34055654

$iact
[1] 39 45 16 38 31  1 18

> 

Let's see if the constraints are violated: 让我们看看是否违反了约束:

> d_test_approx <- d_approx(df.test[,1:3],qp$solution)

> n1 <- which(diff(df.test[,1])>0 & diff(df.test[,2])==0 & diff(df.test[,3])==0)

> which( d_test_approx[n1+1,] - d_test_approx[n1,] < 0 )
integer(0)

> n2 <- which(diff(df.test[,1])==0 & diff(df.test[,2])>0 & diff(df.test[,3])==0)

> which( d_test_approx[n2+1,] - d_test_approx[n2,] > 0 )
integer(0)

> n3 <- which(diff(df.test[,1])==0 & diff(df.test[,2])==0 & diff(df.test[,3])>0)

> which( d_test_approx[n3+1,] - d_test_approx[n3,] > 0 )
integer(0)
> 

I'm almost sure that my formulas are full of miscalculations, but the idea to use quadratic programming is perhaps worth trying. 我几乎可以确定我的公式充满了错误的计算,但是使用二次编程的想法也许值得尝试。

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM