
Nonlinear least squares in R when data are a function of parameters to be estimated

I'm currently migrating from MATLAB to R, and trying to find out whether what I want to do is possible.

I want to estimate a non-linear model in R where the observations are US states. The wrinkle is that one of the independent variables is a state-level index over counties, calculated using a parameter to be estimated, i.e. the model looks like this:

log(Y_s) = log(phi) + log(f(theta, X_cs)) + u_s

where Y_s is a state-level variable, X_cs is a vector containing county-level observations of a variable within the state, and f() returns a scalar value of the index calculated for the state.

So far I've tried using R's nls function while transforming the data as it's passed to the function. Abstracting from the details of the index, a simpler version of the code looks like this:

library(dplyr)

state <- c("AK", "AK", "CA", "CA", "MA", "MA", "NY", "NY")
Y <- c(3, 3, 5, 5, 6, 6, 4, 4)
X <- c(4, 5, 2, 3, 3, 5, 3, 7)
Sample <- data.frame(state, Y, X)

f <- function(data, theta) {
  output <- data %>%
    group_by(state) %>%
    summarise(index = mean(X**theta),
              Y = mean(Y))
}

model <- nls(Y ~ log(phi) + log(index),
             data = f(Sample, theta),
             start = list(phi = exp(3), theta = 1.052))

This returns an error, telling me that the gradient is singular. My guess is that it's because R can't see how the parameter theta should be used in the formula.

Is there a way to do this using nls? I know I could define the criterion function to be minimised manually, i.e. log(Y_s) - log(phi) - log(f(theta, X_cs)), and use a minimisation routine to estimate the parameter values. But I want to use the postestimation features of nls, like having a confidence interval for the parameter estimates. Any help much appreciated.
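For reference, the manual route mentioned above might look like the following minimal sketch. It is not part of the original post: the names Y_s, index and ssr are illustrative, the criterion uses Y rather than log(Y) to match the nls call shown earlier, and estimating log(phi) directly (rather than phi) is an assumption made here to keep the search unconstrained.

```r
state <- c("AK", "AK", "CA", "CA", "MA", "MA", "NY", "NY")
Y <- c(3, 3, 5, 5, 6, 6, 4, 4)
X <- c(4, 5, 2, 3, 3, 5, 3, 7)

# One value of Y per state (Y is constant within a state in this toy data)
Y_s <- tapply(Y, state, mean)

# State-level index for a given theta: mean of X^theta over a state's counties
index <- function(theta) tapply(X, state, function(x) mean(x^theta))

# Sum of squared residuals; par = c(log_phi, theta) (hypothetical parametrisation)
ssr <- function(par) sum((Y_s - par[1] - log(index(par[2])))^2)

# Derivative-free minimisation (optim defaults to Nelder-Mead)
start <- c(3, 1.052)
fit <- optim(start, ssr)
fit$par
```

This gives point estimates only; standard errors and confidence intervals would have to be derived separately, which is the motivation for staying with nls.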

Sorry, I refuse to install that ginormous meta package. Thus, I use base R:

state <- c("AK", "AK", "CA", "CA", "MA", "MA", "NY", "NY")
Y <- c(3, 3, 5, 5, 6, 6, 4, 4)
X <- c(4, 5, 2, 3, 3, 5, 3, 7)
Sample <- data.frame(state, Y, X)

f <- function(X, state, theta) {
  ave(X, state, FUN = function(x) mean(x^theta))
}

model <- nls(Y ~ log(phi) + log(f(X, state, theta)),
             data = Sample, weights = 1/ave(X, state, FUN = length),
             start = list(phi = exp(3), theta = 1.052))
summary(model)
#Formula: Y ~ log(phi) + log(f(X, state, theta))
#
#Parameters:
#      Estimate Std. Error t value Pr(>|t|)
#phi   2336.867   4521.510   0.517    0.624
#theta   -2.647      1.632  -1.622    0.156
#
#Residual standard error: 0.7791 on 6 degrees of freedom
#
#Number of iterations to convergence: 11 
#Achieved convergence tolerance: 3.722e-06
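Two points in the fit above may be worth spelling out; the sketch below is an illustration rather than part of the original answer. First, the data keep one row per county, so the weights 1/n_s give every state a total weight of 1. Second, because the result is an ordinary nls object, the usual post-estimation tools apply. Here confint.default() is used to obtain Wald-type intervals (estimate plus or minus a normal quantile times the standard error); the names w, state_weight and ci are illustrative.

```r
state <- c("AK", "AK", "CA", "CA", "MA", "MA", "NY", "NY")
Y <- c(3, 3, 5, 5, 6, 6, 4, 4)
X <- c(4, 5, 2, 3, 3, 5, 3, 7)
Sample <- data.frame(state, Y, X)

# Index computed per state and repeated on every county row of that state
f <- function(X, state, theta) {
  ave(X, state, FUN = function(x) mean(x^theta))
}

# Row weights 1/n_s: summed within a state they give each state weight 1
w <- 1 / ave(Sample$X, Sample$state, FUN = length)
state_weight <- tapply(w, Sample$state, sum)

model <- nls(Y ~ log(phi) + log(f(X, state, theta)),
             data = Sample, weights = 1/ave(X, state, FUN = length),
             start = list(phi = exp(3), theta = 1.052))

# Wald-type confidence intervals built from coef(model) and vcov(model)
ci <- confint.default(model)
ci
```

With standard errors as large as those in the summary above, the intervals will be wide. The profile-based confint(model) is the more usual choice for nls fits, but profiling may fail on a sample this small, so treat it as something to try rather than rely on.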
