
Using glm with family = binomial(link = 'logit') to model a continuous variable between 0 and 1 gives an error

I'm trying to use glm to fit a logistic regression to a continuous variable between 0 and 1 with the following code, but I get this error:

> glm(y ~ x, data=test_data, family=binomial(link = 'logit'))
Error in eval(family$initialize) : y values must be 0 <= y <= 1

However, when I run summary on test_data, the y values appear to lie entirely between 0 and 1:

> summary(test_data)
       y                  x         
 Min.   :0.000000   Min.   :0.0000  
 1st Qu.:0.001510   1st Qu.:0.0000  
 Median :0.003664   Median :1.0000  
 Mean   :0.025847   Mean   :0.5386  
 3rd Qu.:0.009054   3rd Qu.:1.0000  
 Max.   :1.000000   Max.   :1.0000

Can anyone help me understand what the issue is? If I check the types of the variables, they are both numeric:

> class(test_data$y)
[1] "numeric"
> class(test_data$x)
[1] "numeric"

I suggest checking for values outside [0, 1]:

which(as.numeric(test_data$x) < 0 | as.numeric(test_data$x) > 1)
which(as.numeric(test_data$y) < 0 | as.numeric(test_data$y) > 1)

I found the issue. After drilling down into the data, I saw that a small number of rows have very small negative values of y (likely due to rounding errors), e.g.:

> test_data[276,]
# A tibble: 1 x 2
          y     x
      <dbl> <dbl>
1 -1.47e-17     0

However, these out-of-range values do not show up in summary, which rounds the values it prints, so a minimum of -1.47e-17 is displayed as 0.000000.
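One possible way to confirm and work around this is sketched below. It is not part of the original thread: the small data frame is a hypothetical stand-in for test_data, clamping with pmin/pmax is only reasonable if the negative values really are rounding noise, and switching to quasibinomial is a suggestion that avoids the non-integer warning binomial gives for continuous proportions.

```r
# Hypothetical stand-in for test_data: proportions with one tiny negative y
test_data <- data.frame(
  y = c(-1.47e-17, 0, 0.0015, 0.0037, 0.0091, 0.2, 1),
  x = c(0, 0, 1, 1, 0, 1, 1)
)

# The original call fails because one y value is (barely) below 0;
# capture the error message instead of stopping the script
res <- tryCatch(
  glm(y ~ x, data = test_data, family = binomial(link = "logit")),
  error = function(e) conditionMessage(e)
)
print(res)

# Unlike the rounded summary() display, a direct comparison exposes the problem
any(test_data$y < 0 | test_data$y > 1)
which(test_data$y < 0 | test_data$y > 1)

# Assumption: the negatives are rounding noise, so clamp y into [0, 1]
test_data$y_clamped <- pmin(pmax(test_data$y, 0), 1)

# Assumption: quasibinomial is used here for a continuous proportion response
fit <- glm(y_clamped ~ x, data = test_data,
           family = quasibinomial(link = "logit"))
coef(fit)
```

If the out-of-range values are larger than floating-point noise, clamping would hide a real data problem, so it is worth inspecting the flagged rows before applying it.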
