What is the scale of parameter estimates produced by nnet::multinom?

I'm using the multinom function from the nnet package to do multinomial logistic regression in R. When I fit the model, I expected to get parameter estimates on the logit scale. However, transforming the coefficients with the inverse logit doesn't give probabilities that match the predicted probabilities (see the example below).

The help file states that "A log-linear model is fitted, with coefficients zero for the first class", but how do I transform parameter estimates to get predicted effects on the probability scale?



library(nnet)

# Simulate some simple fake data
groups <- t(rmultinom(500, 1, prob = c(0.05, 0.3, 0.65))) %*% c(1:3)
moddat <- data.frame(group = factor(groups))

# Fit the multinomial model
mod <- multinom(group ~ 1, moddat)
predict(mod, type = "probs")[1,] # predicted probabilities recover generating probs

# But transformed coefficients don't become probabilities
plogis(coef(mod))       # inverse logit
1/(1 + exp(-coef(mod))) # inverse logit

Using predict I can recover the generating probabilities:

   1    2    3 
0.06 0.30 0.64 

But taking the inverse logit of the coefficients does not give probabilities:

2   0.8333333
3   0.9142857

The inverse logit is the correct back-transformation for a binomial model. For a multinomial model, the appropriate back-transformation is the softmax function, as described in this question.

The statement from the documentation that "A log-linear model is fitted, with coefficients zero for the first class" essentially means that the linear predictor for the reference (first) class is fixed at 0 on the link scale, so each remaining coefficient is a log-odds relative to that class.
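
To spell out what this convention implies, here is a short derivation. It is a sketch: the symbols $\eta_k$, $\beta_k$ and $p_k$ are notation introduced here, not taken from the nnet documentation. With $K$ classes and class 1 as the reference:

```latex
\eta_1 = 0, \qquad
\eta_k = x^\top \beta_k \quad (k = 2, \dots, K), \qquad
p_k = \frac{\exp(\eta_k)}{\sum_{j=1}^{K} \exp(\eta_j)}

\log\frac{p_k}{p_1} = \eta_k, \qquad
\frac{1}{1 + \exp(-\eta_k)} = \frac{\exp(\eta_k)}{1 + \exp(\eta_k)} = \frac{p_k}{p_1 + p_k}
```

So the inverse logit of a coefficient gives the probability of class k relative to the reference class only. In the intercept-only example this is 0.30 / (0.30 + 0.06) = 0.8333 and 0.64 / (0.64 + 0.06) = 0.9143, which are exactly the values the inverse logit returned.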

To recover the probabilities manually from the example above:


library(nnet)
groups <- t(rmultinom(500, 1, prob = c(0.05, 0.3, 0.65))) %*% c(1:3)
moddat <- data.frame(group = factor(groups))
mod <- multinom(group ~ 1, moddat)
# weights:  6 (2 variable)
# initial  value 549.306144 
# final  value 407.810115 
# converged
predict(mod, type = "probs")[1,] # predicted probabilities recover generating probs
#   1    2    3 
# 0.06 0.30 0.64 

# Inverse logit is incorrect
1/(1 + exp(-coef(mod))) # inverse logit
#   (Intercept)
# 2   0.8333333
# 3   0.9142857

# Use softmax transformation instead
softmax <- function(x){
  expx <- exp(x)
  return(expx/sum(expx))
}

# Add the reference category (0 on link scale) and use softmax transformation
all_coefs <- rbind("1" = 0, coef(mod))
softmax(all_coefs)
#   (Intercept)
# 1        0.06
# 2        0.30
# 3        0.64
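
The intercept-only case above needs a single softmax. With predictors, the same idea should apply one observation at a time: build the linear predictors, add a zero column for the reference class, and apply softmax to each row. The following is a minimal base-R sketch of that, not a description of nnet's internals. The coefficient matrix `B`, the design matrix `X` and the helper `softmax_rows` are all hypothetical, made up for illustration.

```r
# Row-wise softmax: each row of `eta` holds the linear predictors
# for one observation, one column per class.
softmax_rows <- function(eta) {
  # Subtract each row's maximum before exponentiating (numerical stability);
  # this does not change the result.
  expeta <- exp(eta - apply(eta, 1, max))
  expeta / rowSums(expeta)
}

# Hypothetical coefficients, laid out one row per non-reference class
# (an assumption about layout, with invented values).
B <- rbind("2" = c("(Intercept)" = 1.0, x = 0.5),
           "3" = c("(Intercept)" = 2.0, x = -0.3))

# Hypothetical design matrix: intercept column plus one predictor x.
X <- cbind("(Intercept)" = 1, x = c(-1, 0, 2))

# Linear predictors: zero for reference class "1", X %*% t(B) for the rest.
eta <- cbind("1" = 0, X %*% t(B))

# One row of class probabilities per observation; each row sums to 1.
probs <- softmax_rows(eta)
probs
```

For a fitted model with three or more classes, `B` would presumably be `coef(mod)` and `X` a design matrix built from the same formula and data (for example with `model.matrix()`). Comparing the result against `predict(mod, type = "probs")` is a sensible check before relying on it.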

