multivariate logistic regression in R

I want to run a simple multivariate logistic regression. I made a small example with binary data below to talk through the problem.

multivariate regression = trying to predict 2+ outcome variables

> y = matrix(c(0,0,0,1,1,1,1,1,1,0,0,0), nrow=6,ncol=2)

> x = matrix(c(1,0,0,0,0,0,1,1,0,0,0,0,1,1,1,0,0,0,1,1,1,1,0,0,1,1,1,1,1,0,1,1,1,1,1,1), nrow=6,ncol=6)
> x
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    1    1    1    1    1    1
[2,]    0    1    1    1    1    1
[3,]    0    0    1    1    1    1
[4,]    0    0    0    1    1    1
[5,]    0    0    0    0    1    1
[6,]    0    0    0    0    0    1
> y
     [,1] [,2]
[1,]    0    1
[2,]    0    1
[3,]    0    1
[4,]    1    0
[5,]    1    0
[6,]    1    0

So, variable "x" has 6 samples, each with 6 attributes, and variable "y" has 2 outcomes to predict for each of the 6 samples. I specifically want to work with binary data.

> fit = glm(y~x-1, family = binomial(logit))

I do "-1" to eliminate the intercept coefficient. Everything else is standard logistic regression in a multivariate situation.

> fit

Call:  glm(formula = y ~ x - 1, family = binomial(logit))

Coefficients:
 data1   data2   data3   data4   data5   data6  
  0.00    0.00  -49.13    0.00    0.00   24.57  

Degrees of Freedom: 6 Total (i.e. Null);  0 Residual
Null Deviance:      8.318 
Residual Deviance: 2.572e-10    AIC: 12

At this point things are starting to look off. I am not sure why the coefficients for data3 and data6 are what they are.

> val <- predict(fit, data.frame(c(1,1,1,1,1,1)), type = "response")

> val
       1            2            3            4            5            6 
2.143345e-11 2.143345e-11 2.143345e-11 1.000000e+00 1.000000e+00 1.000000e+00 

Clearly I am doing something wrong. I am expecting a 1x2 matrix, not 1x6: a matrix that tells me the probability of the new data vector being a "1" (true) for y1 and for y2.

Any help would be appreciated.

Note: I updated the ending of my question based on the reply from Mario.

Unlike lm, glm does not work with a multivariate response. When you give binomial() a two-column matrix, it is interpreted as counts of successes and failures for a single response, not as two separate outcomes, which is why your fit above looks odd. As a workaround, you can fit one GLM per column of y:

fit1 <- glm(y[,1] ~ x-1, family=binomial(logit))
fit2 <- glm(y[,2] ~ x-1, family=binomial(logit))
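
Each of these is an ordinary logistic regression, so you can get the 1x2 result you describe by applying the inverse logit (plogis) to each fit's linear predictor for the new sample. A minimal sketch, reusing the x and y from your question (newx is just the all-ones sample from your predict call):

newx <- c(1, 1, 1, 1, 1, 1)           # the new sample with 6 attributes
p1 <- plogis(sum(coef(fit1) * newx))  # P(y[,1] == 1) for newx
p2 <- plogis(sum(coef(fit2) * newx))  # P(y[,2] == 1) for newx
matrix(c(p1, p2), nrow = 1)           # the 1x2 matrix of probabilities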

Or you can use glmer from the lme4 package, which is meant for mixed models; you can simply omit the "random effects", and AFAIK glmer supports multivariate responses.

The newdata argument needs to be a data.frame whose columns match the predictors in the model. Since the model was fitted with a 6-column matrix called x, the new sample has to go in as a matrix column named x (wrap it in I() so that data.frame keeps it as a single matrix column):

aux <- data.frame(x = I(matrix(c(1, 1, 1, 1, 1, 1), nrow = 1)))
val <- predict(fit, aux, type = "response")
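
With fit1 and fit2 from above, the same aux then gives you the two probabilities you were expecting. A quick sketch along those lines (assuming the objects defined earlier in this thread):

# probability of a "1" for each response, for the new sample in aux
cbind(y1 = predict(fit1, aux, type = "response"),
      y2 = predict(fit2, aux, type = "response"))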
