Plotting loess and glm with ggplot2

Question

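I'm trying to plot the model predictions from a binary choice glm against the empirical probability, using data from the Titanic. To show the differences across class and sex I'm using faceting, but there are two things I can't quite figure out. The first is that I'd like to restrict the loess curve to lie between 0 and 1, but if I add ylim(c(0,1)) to the end of the plot, the ribbon around the loess curve gets cut off wherever one side of it falls outside that range. The second is that, within each facet, I'd like to draw a line from the minimum x-value (the predicted probability from the glm) to the maximum x-value in the same facet, with y = 1, to show the glm predicted probability.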

#info on this data http://biostat.mc.vanderbilt.edu/wiki/pub/Main/DataSets/titanic3info.txt
load(url('http://biostat.mc.vanderbilt.edu/wiki/pub/Main/DataSets/titanic3.sav'))
titanic <- titanic3[ ,-c(3,8:14)]; rm(titanic3)
titanic <- na.omit(titanic) #probably missing completely at random
titanic$age <- as.numeric(titanic$age)
titanic$sibsp <- as.integer(titanic$sibsp)
titanic$survived <- as.integer(titanic$survived)

training.df <- titanic[sample(nrow(titanic), nrow(titanic) / 2), ]
validation.df <- titanic[!(row.names(titanic) %in% row.names(training.df)), ]


glm.fit <- glm(survived ~ sex + sibsp + age + I(age^2) + factor(pclass) + sibsp:sex,
               family = binomial(link = "probit"), data = training.df)

glm.predict <- predict(glm.fit, newdata = validation.df, se.fit = TRUE, type = "response")

plot.data <- data.frame(mean = glm.predict$fit, response = validation.df$survived,
                        class = validation.df$pclass, sex = validation.df$sex)

require(ggplot2)
ggplot(data = plot.data, aes(x = as.numeric(mean), y = as.integer(response))) + geom_point() +
       stat_smooth(method = "loess", formula = y ~ x) +
       facet_wrap( ~ class + sex, scale = "free") + ylim(c(0,1)) + 
       xlab("Predicted Probability of Survival") + ylab("Empirical Survival Rate")

Answer 1

The answer to your first question is to use coord_cartesian(ylim=c(0,1)) instead of ylim(0,1); this is a moderately frequently asked question.
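
A minimal sketch of why the two behave differently, assuming a synthetic data frame (`toy`, hypothetical, standing in for plot.data): ylim() sets limits on the y scale, so ribbon values outside [0, 1] are censored to NA, while coord_cartesian() only zooms the view and leaves the computed ribbon intact. The column names mirror the question; the data are made up.

```r
library(ggplot2)

# Synthetic stand-in for plot.data (hypothetical values, not the Titanic data)
set.seed(1)
toy <- data.frame(mean = runif(200))
toy$response <- rbinom(200, size = 1, prob = toy$mean)

base <- ggplot(toy, aes(x = mean, y = response)) +
  geom_point() +
  stat_smooth(method = "loess", formula = y ~ x)

# Scale limits: out-of-range ribbon values are censored to NA
p_scale <- base + ylim(c(0, 1))

# Coordinate limits: the panel is zoomed, the ribbon data are untouched
p_zoom <- base + coord_cartesian(ylim = c(0, 1))

# Inspect the smoother layer (layer 2) of each built plot
ribbon_scale <- layer_data(p_scale, 2)
ribbon_zoom  <- layer_data(p_zoom, 2)

# Number of ribbon points lost under each approach
sum(is.na(ribbon_scale$ymin) | is.na(ribbon_scale$ymax))
sum(is.na(ribbon_zoom$ymin)  | is.na(ribbon_zoom$ymax))
```

Printing p_scale and p_zoom side by side should show the same fitted curve, with the ribbon broken only in the ylim() version wherever the confidence band leaves the unit interval.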

For your second question, there may be a way to do it within ggplot, but it was easier for me to summarize the data externally:

g0 <- ggplot(data = plot.data, aes(x = mean, y = response)) + geom_point() +
             stat_smooth(method = "loess") +
             facet_wrap( ~ class + sex, scale = "free") + 
             coord_cartesian(ylim=c(0,1))+
             labs(x="Predicted Probability of Survival",
                  y="Empirical Survival Rate")

(I shortened the code slightly by dropping some defaults and using labs.)

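library(plyr)  # provides ddply() and summarise()
# per-facet range of the predicted probabilities
ss <- ddply(plot.data,c("class","sex"),summarise,minx=min(mean),maxx=max(mean))
# draw the segment from (minx, minx) to (maxx, maxx) in each facet
g0 + geom_segment(data=ss,aes(x=minx,y=minx,xend=maxx,yend=maxx),
                  colour="red",alpha=0.5)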
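
The ddply() call above builds one row per class/sex combination with the smallest and largest predicted probability, and the segment then runs from (minx, minx) to (maxx, maxx), i.e. along the line y = x within each facet. As a rough sketch for readers without plyr, the same summary can be built in base R. The `toy` data frame below is hypothetical and only mimics the columns of plot.data.

```r
# Toy stand-in for plot.data (hypothetical values, not the Titanic data)
toy <- data.frame(
  mean  = c(0.10, 0.40, 0.25, 0.90, 0.60, 0.75),
  class = c("1st", "1st", "1st", "2nd", "2nd", "2nd"),
  sex   = c("female", "female", "male", "male", "male", "female")
)

# aggregate() returns the min/max pair as a matrix column ...
ss <- aggregate(mean ~ class + sex, data = toy,
                FUN = function(v) c(minx = min(v), maxx = max(v)))

# ... so flatten it into ordinary columns and restore the names used above
ss <- do.call(data.frame, ss)
names(ss) <- c("class", "sex", "minx", "maxx")
ss
```

The resulting `ss` has the same columns as the plyr version, so it can be passed to geom_segment() unchanged.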

Answer 1: accepted, score 2, 2013-01-12 16:46:46
