
How to extract or predict latent class membership in gmnl?

Let's say you run the example for a latent class model from ?gmnl:

library(mlogit)
library(gmnl)

## Examples using the Electricity data set from the mlogit package
data("Electricity", package = "mlogit")
Electr <- mlogit.data(Electricity, id.var = "id", choice = "choice",
                      varying = 3:26, shape = "wide", sep = "")

## Estimate a LC model with 2 classes
Elec.lc <- gmnl(choice ~ pf + cl + loc + wk + tod + seas| 0 | 0 | 0 | 1,
                data = Electr,
                subset = 1:3000,
                model = 'lc',
                panel = TRUE,
                Q = 2)
summary(Elec.lc)

You get a fitted model with coefficient estimates for two classes (class 1 and class 2). Is there a way to extract (or predict), for each observation, the class that it most likely belongs to?

After several helpful comments and lots of digging, it seems that there is an undocumented feature that gives you predicted class probabilities, which are stored in Wnq. It has one row per observation and one column per latent class (Q = 2 above), and the entries in each row sum to 1.

## Get class probabilities
head(Elec.lc$Wnq)
          init          
[1,] 0.5547805 0.4452195
[2,] 0.5547805 0.4452195
[3,] 0.5547805 0.4452195
[4,] 0.5547805 0.4452195
[5,] 0.5547805 0.4452195
[6,] 0.5547805 0.4452195
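
Given a matrix laid out like this, the most likely class per row can be read off with base R. The following is only a minimal sketch: class_prob is a hypothetical stand-in with made-up values, not output from the fitted model, and it assumes one row per observation and one column per class.

```r
## Hypothetical class-probability matrix (made-up values for illustration):
## one row per observation, one column per latent class
class_prob <- matrix(c(0.55, 0.45,
                       0.20, 0.80,
                       0.70, 0.30),
                     ncol = 2, byrow = TRUE)

## Sanity check: each row should sum to 1
stopifnot(isTRUE(all.equal(rowSums(class_prob), rep(1, nrow(class_prob)))))

## Column index of the largest probability in each row
most_likely_class <- max.col(class_prob, ties.method = "first")
most_likely_class
```

Applied to the six rows printed above, this would assign every row to the first class, because all six rows are identical.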

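The fitted model contains a matrix called prob.alt, which gives the predicted probability of each alternative for every choice situation, so you can pick the most probable alternative in each row. (Elec.cor in the code below is not defined in the question; it is presumably another model fitted in the ?gmnl examples.)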

predictions <- apply(Elec.cor$prob.alt,1,  which.max)

predictions
#>   [1] 1 1 2 3 1 4 4 3 3 3 2 1 2 2 3 1 1 1 2 3 4 4 4 1 1 4 1 1 4 4 4 2 4 3 1 2 4
#>  [38] 4 4 1 1 4 1 1 4 4 4 2 1 1 2 3 4 4 4 2 4 3 4 2 1 4 2 2 2 2 4 2 1 3 4 3 4 4
#>  [75] 4 1 4 2 3 2 2 1 3 3 4 3 4 1 1 4 2 1 4 4 2 2 2 2 2 2 1 4 2 2 2 2 1 2 2 4 3
#> [112] 1 1 1 2 3 4 4 4 2 4 3 4 1 1 4 2 1 4 4 2 2 1 4 2 2 2 2 1 2 1 2 4 3 2 2 2 2
#> [149] 1 4 2 2 2 1 2 1 4 3 2 2 2 1 2 1 1 4 2 1 4 2 2 2 2 1 2 1 1 4 3 2 2 2 2 1 4
#> [186] 2 2 2 2 4 2 1 4 3 2 2 2 2 2 1 1 4 2 1 4 4 3 2 2 4 4 1 3 4 1 2 4 3 1 1 1 2
#> [223] 3 4 4 4 1 2 4 2 3 4 4 1 3 4 2 3 3 2 4 1 1 4 4 4 2 1 3 1 2 1 1 2 3 1 4 4 2
#> [260] 4 3 2 1 2 4 2 3 3 4 1 3 4 2 3 3 4 4 4 4 4 1 3 2 3 1 3 3 1 4 2 1 4 4 2 2 1
#> [297] 3 1 1 4 2 4 1 2 4 1 1 4 4 4 2 1 1 2 3 4 4 4 2 4 3 4 1 1 1 2 3 1 4 4 3 4 3
#> [334] 2 1 1 4 1 1 4 4 2 2 1 3 1 3 1 4 2 2 2 2 1 2 1 3 4 3 2 2 2 2 1 4 3 2 2 2 1
#> [371] 2 4 4 1 3 4 2 3 3 2 1 3 3 3 3 4 1 1 4 1 1 4 4 2 2 2 4 2 3 4 4 4 1 4 2 3 2
#> [408] 1 4 3 2 2 2 1 2 1 1 4 3 1 1 2 3 4 4 4 3 3 3 2 1 2 4 3 4 4 4 3 4 3 4 3 4 1
#> [445] 1 4 1 1 4 4 4 2 1 4 2 2 2 2 1 2 1 3 4 3 1 4 2 2 2 2 1 2 4 2 4 3 3 3 4 1 1
#> [482] 4 2 1 4 4 2 2 2 2 3 1 1 1 2 3 4 4 4 2 2 4 2 3 4 4 4 3 4 2 3 2 2 4 2 3 4 4
#> [519] 1 1 4 2 3 2 2 4 1 1 4 4 4 2 2 3 1 3 2 1 2 2 1 4 4 2 2 2 4 2 1 4 3 2 2 2 4
#> [556] 2 1 1 4 2 1 4 2 2 2 2 1 2 1 2 4 3 1 1 2 3 4 4 4 2 4 3 4 2 4 4 4 3 4 2 3 3
#> [593] 3 1 3 3 1 1 2 3 1 4 4 3 4 3 2 1 2 2 2 2 1 4 3 2 2 2 2 2 2 4 2 3 3 4 1 3 4
#> [630] 2 3 3 2 3 1 1 4 4 4 2 2 3 1 3 1 1 2 3 1 4 4 3 3 3 4 1 4 4 4 3 4 1 4 3 1 1
#> [667] 3 3 2 2 3 1 1 1 2 3 1 4 4 2 1 4 2 2 2 2 1 2 1 1 4 2 1 1 2 3 4 4 4 2 4 3 4
#> [704] 1 2 2 2 2 1 4 2 2 2 2 4 2 2 2 2 2 1 4 3 2 2 2 4 2 1 4 2 2 2 2 4 2 1 3 4 3
#> [741] 1 4 3 2 2 2 2 2 1 1

If we compare these predictions to the actual choices, we see that the prediction is correct about 50% of the time (the counts on the diagonal are the correct predictions):

table(predictions, Electricity$choice[1:750])
#>            
#> predictions   1   2   3   4
#>           1  78  35  28  32
#>           2  40 129  40  33
#>           3  16  27  57  24
#>           4  27  36  38 110
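
As a quick check of the "about 50%" figure, the share of correct predictions can be computed from the diagonal of that table. This sketch simply re-enters the counts printed above by hand; conf_mat and accuracy are illustrative names, not objects from the original answer.

```r
## Confusion matrix copied from the table() output above
## (rows = predicted alternative, columns = actual choice)
conf_mat <- matrix(c(78,  35, 28,  32,
                     40, 129, 40,  33,
                     16,  27, 57,  24,
                     27,  36, 38, 110),
                   nrow = 4, byrow = TRUE,
                   dimnames = list(predictions = 1:4, actual = 1:4))

## Share of choice situations that fall on the diagonal
accuracy <- sum(diag(conf_mat)) / sum(conf_mat)
accuracy
#> [1] 0.4986667
```

That is 374 correct predictions out of 750 choice situations.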

Created on 2022-08-06 by the reprex package (v2.0.1)

I have a feeling that this Wnq object does not hold individual class membership probabilities, though. Even in your example above, Elec.lc$Wnq returns what looks like a list of class membership probabilities for your individuals, but critically they are all equal across individuals. I ran into the same problem when I was looking for this. I think Elec.lc$Wnq is just the mean of the class membership probabilities. I have not looked thoroughly through the gmnl code, but I think the object Qir is what you should look for?
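
The difference between class shares that are the same for everyone and membership probabilities that vary by individual can be illustrated with Bayes' rule. This is a sketch under stated assumptions, not the gmnl implementation: prior reuses the row printed for Elec.lc$Wnq above, while lik is a made-up matrix standing in for the likelihood of each individual's observed choices under each class.

```r
## Prior class shares: the row repeated in Elec.lc$Wnq above
prior <- c(0.5547805, 0.4452195)

## Hypothetical likelihood of each individual's observed choices under
## each class (rows = individuals, columns = classes; made-up values)
lik <- matrix(c(0.020, 0.001,
                0.002, 0.030,
                0.010, 0.010),
              ncol = 2, byrow = TRUE)

## Bayes' rule: posterior is proportional to prior * likelihood,
## normalised so that each row sums to 1
joint <- sweep(lik, 2, prior, `*`)
posterior <- joint / rowSums(joint)

round(posterior, 3)

## Most likely class per individual
max.col(posterior, ties.method = "first")
```

In this toy example the posterior differs across individuals even though the prior is identical for all of them, and the third individual, whose choices are equally likely under both classes, keeps the prior shares.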
