
Assessing the goodness of fit for the multinomial logit in R with the nnet package

I use the multinom() function from the nnet package to run multinomial logistic regression in R. The nnet package does not calculate p-values or t-statistics. I found a way to calculate the p-values using the two-tailed z-test from this page. To give one example of calculating a test statistic for a multinomial logit (not really a t-statistic, but an equivalent), I calculate the Wald statistic:

mm <- multinom(Empst ~ Agegroup + Marst + Education + State,
               data = temp, weights = Weight)
# Wald statistic: squared coefficient over squared standard error
W <- (summary(mm)$coefficients)^2 / (summary(mm)$standard.errors)^2

I take the square of each coefficient and divide it by the square of its standard error. However, the likelihood-ratio test is the preferable measure of goodness of fit for logistic regression. Because my understanding of the likelihood function is incomplete, I do not know how to write code that calculates the likelihood-ratio statistic for each coefficient. How can I calculate the likelihood-ratio statistic for each coefficient using the output of the multinom() function? Thanks for your help.
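For reference, the Wald statistic and the two-tailed z-test described in the question can be sketched on a built-in dataset. This is only an illustration: iris and the formula below are stand-ins (the `temp` data from the question is not available), and the object names `fit`, `z`, `wald`, `p_z` and `p_wald` are made up for the example.

```r
library(nnet)

# Assumption: iris replaces the asker's `temp` data purely for illustration.
fit <- multinom(Species ~ Sepal.Length + Sepal.Width, data = iris, trace = FALSE)
s <- summary(fit)

# z statistic per coefficient: estimate divided by its standard error
z <- s$coefficients / s$standard.errors

# Wald statistic: the square of z
wald <- z^2

# Two-tailed z-test p-values
p_z <- 2 * pnorm(abs(z), lower.tail = FALSE)

# The same p-values from the Wald statistic, using a chi-squared with 1 df
p_wald <- pchisq(wald, df = 1, lower.tail = FALSE)
```

Both routes give the same p-values, because the squared z statistic is referred to a chi-squared distribution with one degree of freedom.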

Let's look at predicting Sepal.Length from the iris dataset using Species (a categorical variable) and Petal.Length (a continuous variable). Let's start by converting our factor variable into multiple binary variables using model.matrix and building our neural network:

library(nnet)
data(iris)
mat <- as.data.frame(model.matrix(~Species+Petal.Length+Sepal.Length, data=iris))
mm <- multinom(Sepal.Length~.+0, data=mat, trace=F)

Now we can run a likelihood ratio test for a variable in our model:

library(lmtest)
lrtest(mm, "Speciesversicolor")
# Likelihood ratio test
# 
# Model 1: Sepal.Length ~ `(Intercept)` + Speciesversicolor + Speciesvirginica + 
#     Petal.Length + 0
# Model 2: Sepal.Length ~ `(Intercept)` + Speciesvirginica + Petal.Length - 
#     1
#   #Df  LogLik  Df  Chisq Pr(>Chisq)
# 1 136 -342.02                      
# 2 102 -346.75 -34 9.4592          1
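The comparison behind a likelihood ratio test can also be written out by hand: fit the model with and without the term, take twice the difference in log-likelihoods, and refer it to a chi-squared distribution whose degrees of freedom equal the difference in the number of estimated parameters. The sketch below assumes a different, simpler model than the one above (Species as the response) and uses illustrative names (`full`, `reduced`, `lr_stat`, `df_diff`, `p_value`).

```r
library(nnet)

# Assumption: a small illustrative model, not the one fitted above.
full    <- multinom(Species ~ Sepal.Length + Sepal.Width, data = iris, trace = FALSE)
reduced <- multinom(Species ~ Sepal.Length, data = iris, trace = FALSE)

ll_full    <- logLik(full)
ll_reduced <- logLik(reduced)

# LR statistic: twice the gain in log-likelihood from adding Sepal.Width
lr_stat <- as.numeric(2 * (ll_full - ll_reduced))

# Degrees of freedom: difference in the number of estimated parameters
df_diff <- attr(ll_full, "df") - attr(ll_reduced, "df")

p_value <- pchisq(lr_stat, df = df_diff, lower.tail = FALSE)
```

With three response levels, dropping one numeric predictor removes two coefficients, so the test here has two degrees of freedom.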

To run the likelihood ratio test for all your variables, I guess you could just use a loop and run it for each variable name. I've extracted just the p-values in this loop.

for (var in mm$coefnames[-1]) {
  print(paste(var, "--", lrtest(mm, var)[[5]][2]))
}
# [1] "Speciesversicolor -- 0.999990077592342"
# [1] "Speciesvirginica -- 0.998742545590864"
# [1] "Petal.Length -- 3.36995663002528e-14"

Use the Anova function in the car package for the likelihood-ratio test of each term in your model.

library(nnet)
data(iris)


mm <- multinom(Species ~ ., data=iris, trace=F)

### car package
library(car)
Anova(mm)
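If the car package is not available, a per-term likelihood ratio table can be approximated with nnet alone by refitting the model without each term in turn. This is a rough sketch for a main-effects-only model, not a reimplementation of Anova(); the smaller formula and the names `full` and `lr_by_term` are assumptions made for the example.

```r
library(nnet)

# Assumption: a main-effects-only model with two predictors, for illustration.
full <- multinom(Species ~ Sepal.Length + Sepal.Width, data = iris, trace = FALSE)
ll_full <- logLik(full)

term_labels <- attr(terms(full), "term.labels")

# Refit without each term and compare log-likelihoods with the full model
lr_by_term <- do.call(rbind, lapply(term_labels, function(term) {
  reduced <- update(full, as.formula(paste(". ~ . -", term)), trace = FALSE)
  ll_reduced <- logLik(reduced)
  stat <- as.numeric(2 * (ll_full - ll_reduced))
  df <- attr(ll_full, "df") - attr(ll_reduced, "df")
  data.frame(term = term, LR = stat, df = df,
             p_value = pchisq(stat, df = df, lower.tail = FALSE))
}))

print(lr_by_term)
```

Each row tests a whole term, so a factor with several levels would be dropped as a unit instead of one dummy coefficient at a time.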

From the response of @jolisber I extracted a function so anyone can do this and store the values in a df. Well, I stored the full character vector in the df.

library(lmtest)  # provides lrtest()

likehoodmultinom2 <- function(model_lmm)
{
  i <- 1
  values <- c("No funciona")  # placeholder ("does not work"), overwritten below

  # Drop the first coefficient name ([-1]) to avoid getting an NA
  for (var in model_lmm$coefnames[-1]) {
    values[i] <- paste(var, "--", lrtest(model_lmm, var)[[5]][2])
    i <- i + 1
  }
  return(values)
}

However, I can't get the p-value for the first element (variable), and I don't know why. I also can't drop the [-1] from model_lmm$coefnames. EDITED: I changed i=0 to i=1; I forgot that R vectors start at 1 :D.

Hope this works for everyone :D

EDIT 2

I also wrote a version that returns the results as a data frame.

likehoodmultinom_p <- function(model_lmm)
{
  i <- 1
  variables <- character(0)
  values <- numeric(0)  # numeric, so the p-values are not coerced to strings

  # Drop the first coefficient name ([-1]), as in the loop above
  for (var in model_lmm$coefnames[-1]) {
    variables[i] <- var
    values[i] <- lrtest(model_lmm, var)[[5]][2]
    i <- i + 1
    ## Contributed to stack at: 
  }
  return(data.frame(variables, values))
}

