
How to fit a quadratic model with a < 0 in R?

I'm fitting a quadratic model to the diversity of bees along an elevational gradient. I expect a maximum somewhere along the gradient, so my model should have a negative "a" coefficient. This works for 3 genera, but with the fourth one (Exaerete) the "a" becomes positive. The graph below shows all 4 fits; the blue line is the only "incorrect" one:

[Image: Euglossina abundance]

Isolating this genus, we can see clearly why it's "incorrect":

[Image: Exaerete abundance]

The plot shows a quadratic and a linear model together. The quadratic one makes sense given the data points, but not much sense biologically. I want to force the command to produce a negative "a" (and thus an "optimum" altitude probably much lower than the 1193 m given by the first graph). How can I do that? The R command used to generate the model was

fitEx2 <- lm(num~I(alt^2)+alt,data=Ex)

And the data is

Ex <- data.frame(alt=c(50,52,100,125,130,200,450,500,525,800,890,1140),
                 num=c(3,1,2,1,1,2,1,2,1,1,1,1))
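
For reference, the "optimum" altitude implied by an unconstrained quadratic fit is the vertex of the parabola, -b/(2a). A minimal sketch of that calculation from the lm fit above (for Exaerete the unconstrained "a" comes out positive, so this vertex is a minimum rather than a maximum):

fitEx2 <- lm(num ~ I(alt^2) + alt, data = Ex)
cf <- coef(fitEx2)                      # coefficients named "(Intercept)", "I(alt^2)" (= a), "alt" (= b)
-cf[["alt"]] / (2 * cf[["I(alt^2)"]])   # vertex -b/(2a), in metres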

We are dealing with restricted estimation, which can be conveniently handled with, e.g., nls. For instance,

x <- rnorm(100)
y <- rnorm(100) - 0.01 * x^2 + 0.1 * x

nls(y ~ -exp(a) * x^2 + b * x + c, start = list(a = log(0.01), b = 0.1, c = 0))
# Nonlinear regression model
#   model: y ~ -exp(a) * x^2 + b * x + c
#    data: parent.frame()
#        a        b        c 
# -4.66893 -0.03615 -0.01949 
#  residual sum-of-squares: 97.09
# 
# Number of iterations to convergence: 2 
# Achieved convergence tolerance: 3.25e-08

where using exp helps to impose the negativity constraint. Your desired quadratic-term coefficient is then

-exp(-4.66893)
[1] -0.009382303
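
The same coefficient can also be recovered from the fitted object directly; a small sketch, this time storing the fit:

fit <- nls(y ~ -exp(a) * x^2 + b * x + c, start = list(a = log(0.01), b = 0.1, c = 0))
-exp(coef(fit)[["a"]])   # quadratic-term coefficient, negative by construction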

However, since lm estimates a positive coefficient, it is likely that in your particular case nls will crash by letting a approach −∞ in order to push the coefficient to zero.

A more stable approach may be to use, e.g., optim:

set.seed(2)
x <- rnorm(100)
y <- rnorm(100) - 0.01 * x^2 + 0.1 * x
lm(y ~ x + I(x^2))

# Call:
# lm(formula = y ~ x + I(x^2))

# Coefficients:
# (Intercept)            x       I(x^2)  
#    -0.04359      0.04929      0.04343  

fun <- function(b) sum((y - b[1] * x^2 - b[2] * x - b[3])^2)
optim(c(-0.01, 0.1, 0), fun, method = "L-BFGS-B",
      lower = c(-Inf, -Inf, -Inf), upper = c(0, Inf, Inf))
# $par
# [1] 0.00000000 0.05222262 0.01441276
# 
# $value
# [1] 95.61239
# 
# $counts
# function gradient 
# 7        7 
# 
# $convergence
# [1] 0
# 
# $message
# [1] "CONVERGENCE: REL_REDUCTION_OF_F <= FACTR*EPSMCH"

which suggests a linear model. In fact, since your model is very simple, that is most likely indeed the theoretical optimum, and you may want to rethink your approach. For instance, maybe you can treat some observations as outliers and adjust the estimation accordingly?
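
Applied to the bee data from the question, the same constrained least-squares idea might look like the sketch below (the parscale control is an assumption added here to compensate for the very different magnitudes of alt^2, alt and the intercept):

fun_ex <- function(b) sum((Ex$num - b[1] * Ex$alt^2 - b[2] * Ex$alt - b[3])^2)
optim(c(-1e-6, 1e-3, 1), fun_ex, method = "L-BFGS-B",
      lower = c(-Inf, -Inf, -Inf), upper = c(0, Inf, Inf),
      control = list(parscale = c(1e-6, 1e-3, 1)))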

The Optimization Task View lists several packages that solve least-squares problems and allow (linear) constraints on the coefficients, for example minpack.lm.

library(minpack.lm)
x <- Ex$alt; y <- Ex$num
nlsLM(y ~ a*x^2 + b*x + c, 
      lower=c(-1, 0, 0), upper=c(0, Inf, Inf), 
      start=list(a=-0.01, b=0.1, c=0))
## Nonlinear regression model
##   model: y ~ a * x^2 + b * x + c
##    data: parent.frame()
##     a     b     c 
## 0.000 0.000 1.522 
##  residual sum-of-squares: 5.051
## 
## Number of iterations to convergence: 27 
## Achieved convergence tolerance: 1.49e-08
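
To inspect the constrained fit against the data, one option is to refit with data = Ex so that predict() can be evaluated on an altitude grid (a sketch only; the object and variable names are made up here):

fitEx_c <- nlsLM(num ~ a * alt^2 + b * alt + c, data = Ex,
                 lower = c(-1, 0, 0), upper = c(0, Inf, Inf),
                 start = list(a = -0.01, b = 0.1, c = 0))
alt_grid <- seq(min(Ex$alt), max(Ex$alt), length.out = 200)
plot(Ex$alt, Ex$num, xlab = "Altitude (m)", ylab = "num")
lines(alt_grid, predict(fitEx_c, newdata = data.frame(alt = alt_grid)))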

By the way, this function is also more reliable than nls and tries to avoid the "zero gradient" problem. It would be helpful if users took advantage of the many CRAN Task Views more often.
