LDA with topicmodels package for R, how do I get the topic probability for each term?

I'm using the topicmodels package for LDA. I would like to create a visualization that shows how related or unrelated the topics are to each other. I envision a cluster of words that are unique to topic 1, with a few shared keywords connecting it to another topic. Any advice here would be great.

To do this, I need to know the probability of each term in each topic. How do I get this with the topicmodels package? I can view the terms with:

terms(LDAmodel, 15)

But I don't know how to get the values. Any ideas?

You can use posterior(LDAmodel)$terms to get the posterior probability of each term in each topic. posterior(LDAmodel)$topics gives the topic probabilities for each document.

Example adapted from help(LDA):

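library(topicmodels)

data("AssociatedPress", package = "topicmodels")
## fit a 2-topic model on the first 20 documents
lda <- LDA(AssociatedPress[1:20,], k = 2)
## topics x terms matrix of posterior term probabilities
terms <- posterior(lda)$terms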

## posterior probability for the first 5 terms (alphabetically)
terms[,1:5]
         aaron      abandon    abandoned   abandoning       abbott
1 3.720076e-44 3.720076e-44 3.720076e-44 3.720076e-44 3.720076e-44
2 3.720076e-44 3.720076e-44 3.720076e-44 3.720076e-44 3.720076e-44
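
For the visualization described in the question, one possible next step is to pull out the most probable terms per topic and see which of them overlap. The following is only a rough sketch: the helper names top_terms_per_topic and shared_top_terms and the toy matrix are made up for illustration and are not part of topicmodels. It assumes a topics x terms probability matrix with term names as column names, which is the shape posterior(lda)$terms has in the example above.

```r
# Hypothetical helpers (not part of topicmodels).
# term_probs: topics x terms matrix of probabilities, with term names
# as column names, e.g. posterior(lda)$terms.

# For each topic (row), return the n most probable terms as a named vector.
top_terms_per_topic <- function(term_probs, n = 15) {
  lapply(seq_len(nrow(term_probs)), function(k) {
    head(sort(term_probs[k, ], decreasing = TRUE), n)
  })
}

# Terms that appear among the top n terms of every topic.
shared_top_terms <- function(term_probs, n = 15) {
  top <- top_terms_per_topic(term_probs, n)
  Reduce(intersect, lapply(top, names))
}

# Toy matrix standing in for posterior(lda)$terms (values are invented).
toy <- matrix(c(0.5, 0.3, 0.1, 0.1,
                0.1, 0.3, 0.5, 0.1),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("1", "2"),
                              c("bank", "money", "river", "water")))

top_terms_per_topic(toy, n = 2)  # top 2 terms and probabilities per topic
shared_top_terms(toy, n = 2)     # terms common to both topics' top 2
```

With a fitted model you would pass posterior(lda)$terms in place of toy. Terms returned by shared_top_terms are candidates for the keywords that connect topics, while the remaining top terms are the ones closer to being unique to a single topic.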
