
How to plot sigmoidal data in R - binary Y continuous X ggplot mixed effects logistic regression

Here's the data I'm working with:

data <- data.frame(id = rep(1:3, each = 30), 
           intervention = rep(c("a","b"),each= 2, times=45),
           area = rep(1:3, times=30), 
           "dv1" = rnorm(90, mean =10, sd=7),
           "dv2" = rnorm(90, mean =5, sd=3),
           outcome = rbinom(90, 1, prob=.5))

data$id <- as.factor(data$id)
data$intervention <- as.factor(data$intervention)
data$area <- as.factor(data$area)
data$outcome <- as.factor(data$outcome)

I'm trying to make sigmoidal plots for this mixed effects logistic regression model:

library(lme4)
glmer(
  outcome ~ dv1 + (1 | id/area), 
  data = data, 
  family = binomial(link = "logit")
)

Here's what I tried and failed with:

library(ggplot2)
ggplot(data, aes(x=dv1, y=outcome, color=factor(area))) + 
  facet_wrap(~id) +
  geom_point() + 
  stat_smooth(method="glm", method.args=list(family="binomial"), color="black", se=F)

Info    
`geom_smooth()` using formula 'y ~ x'
Warning 
Computation failed in `stat_smooth()`: y values must be 0 <= y <= 1
Computation failed in `stat_smooth()`: y values must be 0 <= y <= 1
Computation failed in `stat_smooth()`: y values must be 0 <= y <= 1


Additionally, is this even the right way to plot a logistic regression? Should I be pulling predictions from the model itself, or does plotting the raw data suffice for illustrative purposes?

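It looks OK to me (I'm not an expert). I think the main issue is that your sample data isn't particularly 'logistic': outcome is generated independently of dv1, so there is no sigmoidal relationship to show. The warning itself comes from outcome being a factor: ggplot2 places its two levels at positions 1 and 2 on the y axis, and the binomial fit rejects anything outside 0 to 1, which is why the plot code below uses as.numeric(outcome) - 1. If you modify the sample data, e.g.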

library(tidyverse)
#install.packages("lme4")
library(lme4)
set.seed(123)
data <- data.frame(id = rep(1:3, each = 30), 
                   intervention = rep(c("a","b"), each= 2, times=45),
                   area = rep(1:3, times = 30), 
                   "dv1" = rep(c(rnorm(15, mean = 20, sd = 7),
                                 rnorm(15, mean = 40, sd = 7)), times = 3),
                   "dv2" = rep(c(rnorm(15, mean = 20, sd = 7),
                                 rnorm(15, mean = 40, sd = 7)), times = 3),
                   outcome = rep(c(rbinom(15, 0, prob = .95),
                                   rbinom(15, 1, prob = .95)), times = 3))

data$id <- as.factor(data$id)
data$intervention <- as.factor(data$intervention)
data$area <- as.factor(data$area)
data$outcome <- as.factor(data$outcome)

model_1 <- glmer(
  outcome ~ dv1 + (1 | id/area), 
  data = data, 
  family = binomial(link = "logit")
)

library(ggplot2)
ggplot(data, aes(x = dv1, y = as.numeric(outcome) - 1, color = factor(area))) +
  stat_smooth(method="glm", color="black", se=FALSE,
              method.args = list(family=binomial)) + 
  geom_point() +
  facet_wrap(~id)

The plot looks more like you'd expect:

[image: example_1.png]

(Note: these three panels are the same, as I repeated the sample data 3 times, but you get the idea)
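
The as.numeric(outcome) - 1 step in the plot call is easy to miss, so here is a minimal sketch of what it does. The vector below is made up for illustration; it only shows how R converts a factor with levels "0" and "1" to numbers.

```r
# Made-up factor with levels "0" and "1", like data$outcome after as.factor()
outcome <- factor(c(0, 1, 1, 0))

# as.numeric() on a factor returns the internal level codes, not the labels
codes <- as.numeric(outcome)                      # 1 2 2 1

# Subtracting 1 recovers 0/1, but only because the levels are "0", "1" in that order
zero_one <- as.numeric(outcome) - 1               # 0 1 1 0

# Converting through the labels works regardless of level order
via_labels <- as.numeric(as.character(outcome))   # 0 1 1 0
```

Going through as.character() is the safer habit if the factor levels might ever be something other than "0" and "1".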

If you want to plot the model predictions, this tutorial gives a straightforward overview: https://mgimond.github.io/Stats-in-R/Logistic.html
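
As a rough sketch of that approach (not taken from the tutorial), you could predict from the fitted glmer model over a grid of dv1 values and draw the result as a line. Everything here is an assumption for illustration: the simulated data frame sim, the coefficients -6 and 0.2, and the choice of re.form = NA, which gives population-level predictions with the random effects set to zero. Because the simulated data has no real group-level variation, lme4 may report a singular fit.

```r
library(lme4)
library(ggplot2)

set.seed(123)

# Hypothetical data: the probability of outcome = 1 rises with dv1
# through the inverse logit, plogis()
n <- 90
sim <- data.frame(
  id   = factor(rep(1:3, each = 30)),
  area = factor(rep(1:3, times = 30)),
  dv1  = rnorm(n, mean = 30, sd = 10)
)
sim$outcome <- rbinom(n, size = 1, prob = plogis(-6 + 0.2 * sim$dv1))

# Same model structure as in the question, with a numeric 0/1 response
fit <- glmer(outcome ~ dv1 + (1 | id/area),
             data = sim,
             family = binomial(link = "logit"))

# Predicted probabilities on an evenly spaced grid of dv1;
# re.form = NA ignores the random effects, so the grid needs only dv1
grid <- data.frame(dv1 = seq(min(sim$dv1), max(sim$dv1), length.out = 100))
grid$prob <- predict(fit, newdata = grid, type = "response", re.form = NA)

# Raw 0/1 observations as points, model prediction as a line
p <- ggplot(sim, aes(x = dv1, y = outcome)) +
  geom_point(aes(color = area), alpha = 0.6) +
  geom_line(data = grid, aes(y = prob), color = "black") +
  labs(y = "P(outcome = 1)")
print(p)
```

The difference from stat_smooth(method = "glm") is that the curve comes from the mixed model you actually fitted, rather than from a separate ordinary logistic regression fitted inside each facet.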
