
Multivariate logistic regression in R

I want to run a simple multivariate logistic regression. Below is a small example with binary data to talk through.

Multivariate regression = trying to predict 2+ outcome variables.

> y = matrix(c(0,0,0,1,1,1,1,1,1,0,0,0), nrow=6,ncol=2)

> x = matrix(c(1,0,0,0,0,0,1,1,0,0,0,0,1,1,1,0,0,0,1,1,1,1,0,0,1,1,1,1,1,0,1,1,1,1,1,1), nrow=6,ncol=6)
> x
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    1    1    1    1    1    1
[2,]    0    1    1    1    1    1
[3,]    0    0    1    1    1    1
[4,]    0    0    0    1    1    1
[5,]    0    0    0    0    1    1
[6,]    0    0    0    0    0    1
> y
     [,1] [,2]
[1,]    0    1
[2,]    0    1
[3,]    0    1
[4,]    1    0
[5,]    1    0
[6,]    1    0

So, variable "x" has 6 samples and each sample has 6 attributes. Variable "y" has 2 outcomes to predict for each of the 6 samples. I specifically want to work with binary data.

> fit = glm(y~x-1, family = binomial(logit))

I use "-1" to eliminate the intercept coefficient. Everything else is standard logistic regression in a multivariate situation.

> fit

Call:  glm(formula = y ~ x - 1, family = binomial(logit))

Coefficients:
 data1   data2   data3   data4   data5   data6  
  0.00    0.00  -49.13    0.00    0.00   24.57  

Degrees of Freedom: 6 Total (i.e. Null);  0 Residual
Null Deviance:      8.318 
Residual Deviance: 2.572e-10    AIC: 12

At this point things are starting to look off. I am not sure why the coefficients for data3 and data6 are what they are.

val <- predict(fit,data.frame(c(1,1,1,1,1,1)), type = "response")

> val
       1            2            3            4            5            6 
2.143345e-11 2.143345e-11 2.143345e-11 1.000000e+00 1.000000e+00 1.000000e+00 

Logically, I am doing something wrong. I am expecting a 1x2 matrix, not 1x6. I want a matrix that tells me the probability of the data frame vector being a "1" (true) in y1 and y2.

Any help would be appreciated.

Note: I updated the ending of my question based on the reply from Mario.

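Unlike lm, glm does not work with multivariate response variables. With a binomial family, a two-column matrix response is read as counts of successes and failures for a single outcome, not as two separate outcomes. As a workaround, you can fit several GLMs: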

fit1 <- glm(y[,1] ~ x-1, family=binomial(logit))
fit2 <- glm(y[,2] ~ x-1, family=binomial(logit))

Or you can use glmer from the lme4 package, which is meant for fitting mixed models, but you can simply omit the "random effects". AFAIK, glmer supports multivariate responses.
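The first workaround above (one GLM per outcome column) might be sketched as follows. This is only an illustration under some assumptions: the predictors are moved into a data frame (the column names V1 to V6 and the helper names train, fits, new_obs and prob_matrix are made up here), so that predict() can match a new observation by column name. The toy data are fitted perfectly by six coefficients, so glm() is expected to warn about fitted probabilities of 0 or 1; the warnings are silenced only to keep the output short.

```r
# Data from the question: 6 samples, 6 binary attributes, 2 binary outcomes.
x <- matrix(c(1,0,0,0,0,0, 1,1,0,0,0,0, 1,1,1,0,0,0,
              1,1,1,1,0,0, 1,1,1,1,1,0, 1,1,1,1,1,1), nrow = 6, ncol = 6)
y <- matrix(c(0,0,0,1,1,1, 1,1,1,0,0,0), nrow = 6, ncol = 2)

# Named predictor columns (V1 ... V6) let predict() line up newdata by name.
train <- as.data.frame(x)

# Fit one intercept-free logistic regression per outcome column.
fits <- lapply(seq_len(ncol(y)), function(j) {
  d <- data.frame(outcome = y[, j], train)
  # Warnings about separation are suppressed for this toy example only.
  suppressWarnings(
    glm(outcome ~ . - 1, data = d, family = binomial(link = "logit"))
  )
})

# A single new observation: one row, six named columns.
new_obs <- as.data.frame(matrix(1, nrow = 1, ncol = 6))

# Collect one predicted probability per outcome into a 1 x 2 matrix.
probs <- vapply(
  fits,
  function(f) unname(predict(f, newdata = new_obs, type = "response")),
  numeric(1)
)
prob_matrix <- matrix(probs, nrow = 1, dimnames = list(NULL, c("y1", "y2")))
print(prob_matrix)
```

Each column of prob_matrix comes from an independent model, so any dependence between y1 and y2 is ignored; with data this small and perfectly separated, the coefficients themselves are not meaningful.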

The newdata argument needs to be a data.frame. You can do this:

aux <- data.frame(c(1,1,1,1,1,1))
val <- predict(fit, aux, type = "response")
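It may help to check the shape of that data frame. A rough sketch, assuming a single outcome y1 and the question's x matrix (the names fit_mat, fit_df, train and one_row are illustrative only): data.frame(c(1,1,1,1,1,1)) has six rows and one column, none of them called x, so predict() still evaluates the original x and returns one value per training row. Fitting from a data frame with named columns and passing a one-row data frame with the same names gives a single prediction instead.

```r
x <- matrix(c(1,0,0,0,0,0, 1,1,0,0,0,0, 1,1,1,0,0,0,
              1,1,1,1,0,0, 1,1,1,1,1,0, 1,1,1,1,1,1), nrow = 6, ncol = 6)
y1 <- c(0, 0, 0, 1, 1, 1)

# The data frame from the snippet above: 6 rows, 1 column, no column named "x".
aux <- data.frame(c(1, 1, 1, 1, 1, 1))
dim(aux)

# Model fitted on the matrix x directly (warnings silenced: toy data are separable).
fit_mat <- suppressWarnings(glm(y1 ~ x - 1, family = binomial(link = "logit")))
p_mat <- predict(fit_mat, aux, type = "response")
length(p_mat)   # one value per training row, not one per new observation

# Alternative: fit from a data frame, then predict on one row with matching names.
train <- as.data.frame(x)                                # columns V1 ... V6
fit_df <- suppressWarnings(
  glm(y1 ~ . - 1, data = data.frame(y1 = y1, train),
      family = binomial(link = "logit"))
)
one_row <- as.data.frame(matrix(1, nrow = 1, ncol = 6))  # 1 row, columns V1 ... V6
p_one <- predict(fit_df, newdata = one_row, type = "response")
length(p_one)   # a single predicted probability
```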

