简体   繁体   中英

How to get confidence intervals for predicted probability plot using {ggeffects} based on nnet::multinom() model?

I want to plot the predicted probabilities for a multinomial model in R, fitted with the nnet::multinom() function. I have numerical predictors on the log scale.

Even though {ggeffects} should be compatible with multinom() , the plot does not display confidence intervals the same way it does for linear models.

I am new to using to R and to this community, so my apologies if this question is very basic or is missing something essential. Here is a small example:

library(tidyverse)
library(nnet)
library(effects)
library(ggeffects)


df <- data.frame(response = c("1 Better", "1 Better", "1 Better", "2 Medium", "2 Medium", "2 Medium", "3 Worse", "3 Worse", "3 Worse"),
                 count = c(1000, 2000, 4000, 6000, 10000, 3000, 6000, 5000, 11000))

mod1 <- multinom(response ~ log(count), data = df)
summary(mod1)

effects::effect(mod1, term="log(count)", se=TRUE, confidence.level=.95) %>% plot() # Produces CIs.

ggeffects::ggpredict(mod1, terms = "count") %>% plot() + theme_bw() # No confidence intervals.

If others are looking for alternatives to {ggeffects} , I tried these approaches while looking for a solution:

Using effects::effect() : works, confidence intervals are included but the appearance isn't so customisable.

Combining {ggeffects} and {effects} : See this post on R Studio Community in which confidence intervals from the effects package were combined with ggeffects to create a plot. I got the error

Error in FUN(X[[i]], ...) : object 'L' not found

But it worked for that person.

Using {MNLpred} package and its mnl_pred_ova() : didn't work for me, I think because my predictors are in log scale. I got the following error:

Error in eval(parse(text = paste0("data$", xvari))) : attempt to apply non-function

Using mnlAveEffPlot() function from {DAMisc} : Worked but the plots aren't as customisable as I would like.

You can do this using ggeffects::ggemmeans() .

library(tidyverse)
library(ggthemes)
library(nnet)
library(ggeffects) # package version used: v0.16.0

df <- data.frame(response = c("1 Better", "1 Better", "1 Better", "2 Medium", "2 Medium", "2 Medium", "3 Worse", "3 Worse", "3 Worse"),
                 count = c(1000, 2000, 4000, 6000, 10000, 3000, 6000, 5000, 11000))

mod1 <- multinom(response ~ log(count),
                 data = df)

ggemmeans(mod1, terms = "count") %>% plot() + ggthemes::theme_tufte()

For more information on how to use {ggeffects}, you may also want to take a look at thepackage documentation and especially at the differences between the ggemmeans() and ggpredict() etc. (eg here ).

The {ggeffects} package draws on the output created by {effects} but, and I believe this is what you are looking for, makes it much easier to customize the plot using standard ggplot commands.

The MNLpred package cannot handle the log() inside the regression function but works when you compute the log-scale beforehand.

# Packages
library(tidyverse)
library(nnet)
library(MASS)
library(MNLpred)
library(scales)
library(ggeffects)
library(ggthemes)

df <- data.frame(response = c("1 Better", "1 Better", "1 Better", 
                              "2 Medium", "2 Medium", "2 Medium", 
                              "3 Worse", "3 Worse", "3 Worse"),
                 count = c(1000, 2000, 4000, 
                           6000, 10000, 3000, 
                           6000, 5000, 11000))

mod1 <- multinom(response ~ log(count), data = df)
summary(mod1)

# Log-scaled
df$count_log <- log(df$count)

# Regression
mod2 <- multinom(response ~ count_log,
                 data = df,
                 Hess = TRUE)

# The models are identical:
coef(mod1) == coef(mod2)

After this step, you can use the mnl_pred_ova or the mnl_fd2_ova function for predicted probabilities or first differences/predicted marginal effects.

# 10 steps for predictions
steps <- (max(df$count_log) - min(df$count_log))/9

pred1 <- mnl_pred_ova(mod2,
                      data = df,
                      by = steps,
                      x = "count_log")

x_breaks <- seq(from = min(df$count_log),
                to = max(df$count_log), 
                length.out = 5)

x_labels <- seq(from = min(df$count),
                to = max(df$count),
                length.out = 5)

pred1$plotdata %>%
  ggplot(aes(x = count_log,
             y = mean,
             ymin = lower,
             ymax = upper)) +
  facet_wrap(. ~ response) +
  geom_line() +
  geom_ribbon(alpha = 0.2) +
  scale_y_continuous(labels = percent_format()) +
  scale_x_continuous(breaks = x_breaks,
                     labels = x_labels) +
  theme_bw()

绘制三个类别的预测概率图

Or the predicted marginal effects:

pred_fd <- mnl_fd2_ova(model = mod2,
                       x = "count_log",
                       value1 = min(df$count_log),
                       value2 = max(df$count_log),
                       data = df)

pred_fd$plotdata_fd %>%
  ggplot(aes(x = categories,
             y = mean,
             ymin = lower,
             ymax = upper)) +
  geom_pointrange() +
  scale_y_continuous(labels = percent_format()) +
  labs(title = "Predicted effect of Count on responses",
       x = "Categories",
       y = "Predicted marginal effect") +
  theme_bw()

在此处输入图片说明

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM