I ran LDA using the R package topicmodels and have been trying to get the value of delta, which, as I understand it, is the parameter of the Dirichlet prior for words over topics. However, I have not been able to access the fitted value. I only managed to get the initial value using
LDA@control@delta
or
slot(LDA@control,"delta")
I know how to get alpha (the parameter of the Dirichlet prior for topics over documents) for the posterior distribution, which is simply slot(LDA,"alpha")
but how do I get delta?
Thanks a lot!
topicmodels uses a list of control parameters for the fitting method, here Gibbs sampling. By default, alpha = 50/k and delta = 0.1 are assumed in the LDA_Gibbscontrol class; you may, of course, specify other values. Maybe you have not specified your controls correctly. In any case, here is a short code example that should show the delta prior in the output. I hope that helps and solves your issue.
library(text2vec)
library(topicmodels)
library(slam) #to convert dtm to simple triplet matrix for topicmodels
library(magrittr) #provides %>% in case it is not already available via text2vec
ntopics <- 10
alphaprior <- 0.1
deltaprior <- 0.001
niter <- 1000
seedpar <- 0
docssubset <- 1:500
docs <- movie_review$review[docssubset]
#Generate document term matrix with text2vec
tokens = docs %>%
  tolower %>%
  word_tokenizer
it = itoken(tokens, ids = movie_review$id[docssubset], progressbar = FALSE)
vocab = create_vocabulary(it) %>%
  prune_vocabulary(term_count_min = 10, doc_proportion_max = 0.2)
vectorizer = vocab_vectorizer(vocab)
dtm = create_dtm(it, vectorizer, type = "dgTMatrix")
#Control parameters for the Gibbs sampler, including both priors
control_Gibbs_topicmodels <- list(
  alpha = alphaprior
  ,delta = deltaprior
  ,iter = niter
  ,burnin = 100
  ,keep = 50
  ,nstart = 1
  ,best = TRUE
  ,seed = seedpar
)
ldatopicmodels <- LDA(as.simple_triplet_matrix(dtm)
  ,k = ntopics
  ,method = "Gibbs"
  ,control = control_Gibbs_topicmodels
)
#Inspect the structure of the fitted model object
str(ldatopicmodels)
#The delta prior is stored in the control slot
ldatopicmodels@control@delta
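As far as I can tell, with method = "Gibbs" topicmodels treats delta as a fixed hyperparameter and does not re-estimate it, so the value in the control slot is the value the sampler actually used; what gets estimated is the topic-term distribution itself. The following is a minimal sketch under that assumption. It uses the AssociatedPress data shipped with topicmodels instead of the movie reviews, and the object names (dtm_small, fit_custom, fit_default, phi) are made up for illustration.

```r
library(topicmodels)

# Small built-in corpus, so no extra packages or preprocessing are needed
data("AssociatedPress", package = "topicmodels")
dtm_small <- AssociatedPress[1:20, ]

# Fit with explicitly chosen priors (values are arbitrary, for illustration only)
fit_custom <- LDA(dtm_small, k = 2, method = "Gibbs",
                  control = list(alpha = 0.5, delta = 0.05, iter = 200, seed = 1))

# Fit without specifying alpha or delta, so the package defaults apply
fit_default <- LDA(dtm_small, k = 2, method = "Gibbs",
                   control = list(iter = 200, seed = 1))

# Prior on the topic-term distributions, read back from the control slot
fit_custom@control@delta
fit_default@control@delta

# Prior on the document-topic distributions
fit_custom@alpha
fit_default@alpha

# Estimated topic-term distributions: one row per topic, each row sums to 1
phi <- posterior(fit_custom)$terms
dim(phi)
rowSums(phi)
```

If what you are after is the fitted word distribution per topic rather than the prior, posterior(...)$terms (or the beta slot, which holds the same values on the log scale) is probably the thing to look at.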