Assessing the goodness of fit for the multinomial logit in R with the nnet package
I use the multinom() function from the nnet package to run multinomial logistic regression in R. The nnet package does not calculate p-values or t-statistics. I found a way to calculate the p-values using a two-tailed z-test from this page. To give one example of calculating a test statistic for a multinomial logit (not really a t-statistic, but an equivalent), I calculate the Wald statistic:
mm <- multinom(Empst ~ Agegroup + Marst + Education + State,
               data = temp, weights = Weight)
# Wald statistic: squared coefficient over squared standard error
W <- (summary(mm)$coefficients)^2 / (summary(mm)$standard.errors)^2
I take the square of a coefficient and divide it by the square of the coefficient's standard error. However, the likelihood-ratio test is the preferable measure of goodness of fit for logistic regressions. I do not know how to write code that calculates the likelihood-ratio statistic for each coefficient, because I do not fully understand the likelihood function. How can I calculate the likelihood-ratio statistic for each coefficient using the output of the multinom() function? Thanks for your help.
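For reference, the two-tailed z-test mentioned above can be sketched as follows. This is a minimal illustration on the built-in iris data rather than the data in the question; the model formula and the names fit, z, wald, p_z and p_chisq are assumptions made for the example.

```r
library(nnet)

# Illustrative model on iris (not the asker's data)
fit <- multinom(Species ~ Sepal.Width, data = iris, trace = FALSE)
s <- summary(fit)

# Wald z statistic: coefficient divided by its standard error
z <- s$coefficients / s$standard.errors

# Squared z is the Wald chi-square statistic with 1 degree of freedom
wald <- z^2

# Two-tailed p-values from the standard normal distribution
p_z <- 2 * pnorm(abs(z), lower.tail = FALSE)

# The same p-values obtained from the chi-square form of the statistic
p_chisq <- pchisq(wald, df = 1, lower.tail = FALSE)
```

Both p_z and p_chisq are matrices with one row per non-reference outcome level and one column per coefficient, and they should agree up to numerical precision.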
Let's look at predicting Sepal.Length from the iris dataset using Species (a categorical variable) and Petal.Length (a continuous variable). Note that multinom() treats the response as a factor, so each distinct Sepal.Length value becomes its own class. Let's start by converting our factor variable into multiple binary variables using model.matrix and building our neural network:
library(nnet)
data(iris)
mat <- as.data.frame(model.matrix(~Species+Petal.Length+Sepal.Length, data=iris))
mm <- multinom(Sepal.Length~.+0, data=mat, trace=F)
Now we can run a likelihood-ratio test for a variable in our model:
library(lmtest)
lrtest(mm, "Speciesversicolor")
# Likelihood ratio test
#
# Model 1: Sepal.Length ~ `(Intercept)` + Speciesversicolor + Speciesvirginica +
# Petal.Length + 0
# Model 2: Sepal.Length ~ `(Intercept)` + Speciesvirginica + Petal.Length -
# 1
#   #Df  LogLik  Df  Chisq Pr(>Chisq)
# 1 136 -342.02
# 2 102 -346.75 -34 9.4592          1
To run the likelihood-ratio test for all your variables, you could use a loop and run it for each variable name. This loop extracts just the p-values.
for (var in mm$coefnames[-1]) {
print(paste(var, "--", lrtest(mm, var)[[5]][2]))
}
# [1] "Speciesversicolor -- 0.999990077592342"
# [1] "Speciesvirginica -- 0.998742545590864"
# [1] "Petal.Length -- 3.36995663002528e-14"
Use the Anova function in the car package for the likelihood-ratio test of each term in your model.
library(nnet)
data(iris)
mm <- multinom(Species ~ ., data=iris, trace=F)
### car package
library(car)
Anova(mm)
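To see what a likelihood-ratio test for a single term involves, it can also be sketched by hand: fit the model with and without the term, take twice the difference in log-likelihoods, and compare it to a chi-square distribution. The smaller formula and the names full, reduced, lr_stat, df and p_value below are assumptions chosen for illustration, and this is only a sketch of the general idea, not a reproduction of the Anova output.

```r
library(nnet)

# Full model and a reduced model that drops the term being tested
full    <- multinom(Species ~ Sepal.Length + Sepal.Width, data = iris, trace = FALSE)
reduced <- multinom(Species ~ Sepal.Length, data = iris, trace = FALSE)

# Likelihood-ratio statistic: twice the gain in log-likelihood
lr_stat <- as.numeric(2 * (logLik(full) - logLik(reduced)))

# Degrees of freedom: number of extra parameters in the full model
df <- attr(logLik(full), "df") - attr(logLik(reduced), "df")

# Upper-tail chi-square p-value
p_value <- pchisq(lr_stat, df = df, lower.tail = FALSE)
```

With three outcome levels, dropping one continuous predictor removes one coefficient per non-reference level, so df is 2 here.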
Based on @jolisber's answer, I extracted a function so that anyone can do this and store the values in a data frame. Well, this first version stores the full character vector.
library(lmtest)  # provides lrtest()

likehoodmultinom2 <- function(model_lmm)
{
  i <- 1
  values <- c("No funciona")  # placeholder ("doesn't work"), returned if the loop never runs
  for (var in model_lmm$coefnames[-1]) {  # [-1] drops the first entry of coefnames so we do not get an NA
    values[i] <- paste(var, "--", lrtest(model_lmm, var)[[5]][2])
    i <- i + 1
  }
  return(values)
}
However, I can't get the p-value for the first element (variable), and I don't know why. I also can't drop the [-1] in model_lmm$coefnames.
EDITED. I changed i=0 to i=1; I forgot that R vectors start at 1 :D
Hope this works for everyone :D
EDIT 2
I also made a version that stores the results in a data frame.
library(lmtest)  # provides lrtest()

likehoodmultinom_p <- function(model_lmm)
{
  i <- 1
  variables <- character(0)
  values <- numeric(0)  # numeric, so the p-values are not coerced to strings
  for (var in model_lmm$coefnames[-1]) {
    variables[i] <- var
    values[i] <- lrtest(model_lmm, var)[[5]][2]
    i <- i + 1
    ## Contributed to stack at:
  }
  return(data.frame(variables, values))
}