
Is it possible to plot R glmer model predictions using Python?

I have a glmer model in R for which I want to plot predictions. I found the plot_model function from the sjPlot library, and it works fine.

Here is an MWE:

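library(lme4)    # glmer() and the cbpp data set
library(sjPlot)  # plot_model()

# random 0/1 response, for illustration only
cbpp$response <- sample(c(0,1), replace=TRUE, size=nrow(cbpp))
gm1 <- glmer(response ~ size + incidence + (1 | herd),
              data = cbpp, family = binomial)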

For example, calling plot_model(gm1, type = "pred", show.data = TRUE) yields the following figure:

[Figure: output of plot_model(gm1, type = "pred", show.data = TRUE)]

However, I am not familiar with R, and I am having a hard time controlling the plot aesthetics and plotting multiple models in the same figure (I already asked a question about that issue here). I am familiar with Python and matplotlib, so getting these figures to work in a Python environment would be much simpler for me.

I'm guessing one way to accomplish this would be to take the y values (predicted probabilities of fire) from R and export them, so that I could read them in Python and plot them against each covariate (evi_prev in this example). However, I am not sure how to do this. Furthermore, I tried reading the sjPlot source code to figure out how it plots the predictions, but could not work it out either.

The easiest way to do this is probably with ggeffects::ggpredict().

Something like

library(ggeffects)
pred_frame <- ggpredict(myModel, terms = "evi_prev")

should produce a data frame with predictions and lower and upper confidence limits. I'm not sure whether it will make the predictions for evenly spaced values along the x-axis (which would be nice), or how to trick it into doing so. (If you provide a reproducible example I might give it a shot.)

Playing around with the MWE you posted does suggest that it's hard to get predictions for evenly spaced values (or, more generally, for values that aren't in the original data); I tried things like terms="size [1:35]", but this restricts the range of the predicted values rather than filling them in.

More basically, the built-in predict() method for merMod objects can be used (possibly with newdata to specify, e.g., evenly spaced values) to get predictions (use type="response" to get predictions on the probability rather than the log-odds scale); confidence intervals are harder but can be generated with the recipe shown here.
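If calling predict() in R is not convenient, a rough alternative is to export only the fixed-effect coefficients (for example from fixef(gm1)) and rebuild population-level predictions on an evenly spaced grid in Python. The sketch below assumes a binomial model with the default logit link and ignores the random effects, which should correspond to predict(..., re.form = NA, type = "response"). The coefficient values are placeholders made up for illustration, not results from the model above; the incidence value of 1.77 is the one ggpredict() reports holding constant in the output further down.

```python
import math


def inv_logit(eta):
    """Map a value on the log-odds (link) scale to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def predict_grid(coefs, size_values, incidence):
    """Population-level predicted probabilities for each size on the grid.

    `coefs` is a dict keyed by the coefficient names R would print;
    random effects are ignored (assumption: re.form = NA semantics).
    """
    return [
        inv_logit(
            coefs["(Intercept)"]
            + coefs["size"] * size
            + coefs["incidence"] * incidence
        )
        for size in size_values
    ]


# PLACEHOLDER coefficients for illustration only; real values would be
# copied from fixef(gm1) in R.
coefs = {"(Intercept)": 0.5, "size": -0.02, "incidence": 0.05}

# Evenly spaced sizes 2, 4, ..., 34, with incidence held fixed.
grid = list(range(2, 35, 2))
probs = predict_grid(coefs, grid, incidence=1.77)

for size, prob in zip(grid, probs):
    print(size, round(prob, 3))
```

This only reproduces the point predictions; it does not give confidence intervals, which need the coefficient covariance matrix as in the recipe mentioned above.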

ggpredict() actually returns more values than it prints (and along the x-axis, i.e. for the term in question, size in your example, these are evenly spaced); only the printed output is abbreviated.

library(lme4)
#> Loading required package: Matrix
library(ggeffects)

cbpp$response <- sample(c(0,1), replace=TRUE, size=nrow(cbpp))
gm1 <- glmer(response ~ size + incidence + (1 | herd), data = cbpp, family = binomial)

pr1 <- ggpredict(gm1, term = "size")

pr1
#> 
#> # Predicted probabilities of response
#> # x = size
#> 
#>   x predicted std.error conf.low conf.high
#>   2     0.632     0.717    0.297     0.875
#>   6     0.610     0.550    0.347     0.821
#>  10     0.587     0.407    0.390     0.759
#>  14     0.563     0.321    0.407     0.708
#>  18     0.539     0.339    0.376     0.695
#>  22     0.515     0.448    0.306     0.719
#>  26     0.491     0.601    0.229     0.758
#>  34     0.444     0.951    0.110     0.837
#> 
#> Adjusted for:
#> * incidence = 1.77
#> *      herd = 0 (population-level)
#> Standard errors are on link-scale (untransformed).

as.data.frame(pr1)
#>     x predicted std.error  conf.low conf.high group
#> 1   2 0.6323758 0.7168742 0.2967912 0.8751705     1
#> 2   4 0.6211339 0.6316777 0.3221952 0.8497229     1
#> 3   6 0.6097603 0.5501862 0.3470481 0.8212222     1
#> 4   8 0.5982662 0.4743133 0.3701925 0.7904902     1
#> 5  10 0.5866630 0.4072118 0.3898523 0.7592017     1
#> 6  12 0.5749627 0.3539066 0.4033525 0.7302266     1
#> 7  14 0.5631779 0.3213384 0.4071542 0.7076259     1
#> 8  16 0.5513213 0.3159857 0.3981187 0.6953669     1
#> 9  18 0.5394060 0.3391396 0.3759558 0.6947993     1
#> 10 20 0.5274456 0.3857000 0.3438768 0.7038817     1
#> 11 22 0.5154536 0.4484344 0.3063836 0.7192510     1
#> 12 24 0.5034437 0.5215385 0.2672889 0.7380720     1
#> 13 26 0.4914299 0.6012416 0.2292244 0.7584368     1
#> 14 28 0.4794260 0.6852450 0.1938167 0.7791488     1
#> 15 30 0.4674458 0.7721464 0.1619513 0.7994688     1
#> 16 32 0.4555030 0.8610687 0.1339908 0.8189431     1
#> 17 34 0.4436111 0.9514457 0.1099435 0.8373008     1

Created on 2019-05-06 by the reprex package (v0.2.1)
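Since as.data.frame(pr1) already holds everything needed for the plot (x, predicted, conf.low, conf.high), one possible route into matplotlib is to write it out in R with write.csv(as.data.frame(pr1), "pred_size.csv", row.names = FALSE) and read it back in Python. The following is a minimal sketch under that assumption: the file name and the helper names read_predictions and save_figure are hypothetical, and the inline sample simply repeats the first three rows of the output above so the parsing step can be tried without R.

```python
import csv
import io

# First three rows of the as.data.frame(pr1) output above, in the layout
# that write.csv(..., row.names = FALSE) is assumed to produce.
SAMPLE_CSV = '''"x","predicted","std.error","conf.low","conf.high","group"
2,0.6323758,0.7168742,0.2967912,0.8751705,"1"
4,0.6211339,0.6316777,0.3221952,0.8497229,"1"
6,0.6097603,0.5501862,0.3470481,0.8212222,"1"
'''

NUMERIC_COLUMNS = ("x", "predicted", "conf.low", "conf.high")


def read_predictions(handle):
    """Parse a ggpredict()-style CSV into a dict of float lists."""
    columns = {name: [] for name in NUMERIC_COLUMNS}
    for row in csv.DictReader(handle):
        for name in NUMERIC_COLUMNS:
            columns[name].append(float(row[name]))
    return columns


def save_figure(columns, path, xlabel="size"):
    """Plot the prediction line with its confidence band and save it.

    matplotlib is imported here so the parsing code above can be used
    even where matplotlib is not installed.
    """
    import matplotlib
    matplotlib.use("Agg")  # render to file without a display
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.fill_between(columns["x"], columns["conf.low"], columns["conf.high"],
                    alpha=0.3, label="confidence band")
    ax.plot(columns["x"], columns["predicted"], label="predicted")
    ax.set_xlabel(xlabel)
    ax.set_ylabel("Predicted probability")
    ax.legend()
    fig.savefig(path)
    plt.close(fig)


cols = read_predictions(io.StringIO(SAMPLE_CSV))
print(cols["x"], cols["predicted"])
```

With a real export, opening the file with open("pred_size.csv", newline="") and passing the handle to read_predictions, then calling save_figure(cols, "pred_size.png"), should give a comparable plot. Overlaying several models would then amount to one plot/fill_between pair per model on the same Axes.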

There are some vignettes that show the different features of the package; this one here demonstrates how to compute marginal effects at specific values / levels of focal terms.

The recipe posted by Ben, which shows how to calculate the confidence intervals (conditioned or not conditioned on random effects), is implemented in ggpredict(); a short vignette explaining the differences is here.
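As a quick sanity check on the printed output above, which notes that the standard errors are on the link scale, the interval columns look consistent with a Wald interval built on the log-odds scale and then back-transformed to probabilities. The sketch below is an interpretation of those numbers under that assumption, not a description of how ggeffects is implemented; the function names are illustrative. It recomputes the interval for the first row (predicted = 0.6323758, std.error = 0.7168742).

```python
import math
from statistics import NormalDist


def logit(p):
    """Probability -> log-odds."""
    return math.log(p / (1.0 - p))


def inv_logit(eta):
    """Log-odds -> probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def wald_interval(predicted, std_error, level=0.95):
    """Back-transformed Wald interval.

    `predicted` is on the probability scale, `std_error` on the link
    (log-odds) scale, as in the ggpredict() output.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # about 1.96 for 95%
    eta = logit(predicted)
    return inv_logit(eta - z * std_error), inv_logit(eta + z * std_error)


# First row of as.data.frame(pr1): x = 2
low, high = wald_interval(0.6323758, 0.7168742)
print(low, high)  # compare with conf.low = 0.2967912, conf.high = 0.8751705
```

The same helper can be applied row by row in Python if only the predicted and std.error columns are exported.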

