
PyLDAvis visualisation does not align with generated topics

I am using pyLDAvis to visualise the results of an LDA model trained with Mallet.

Before I can do that, I need to convert the model with the gensim wrapper:

from gensim.models.wrappers.ldamallet import malletmodel2ldamodel
model = malletmodel2ldamodel(model_list[8])

When I print the topics found, they are numbered from 0 to 10.

However, when I use pyLDAvis to visualise the topics, the topic numbering there does not line up with the printed topics.

Example:

(5,
  '0.042*"euro" + 0.030*"smartpho" + 0.022*"camera" + 0.020*"display" + '
  '0.018*"model" + 0.016*"picture" + 0.012*"price" + 0.010*"android"')

As you can see, this topic is about smartphones.

However, when I visualise the model with pyLDAvis, Topic 5 is not about smartphones but about something else (cars, for example). The smartphone topic is no longer Topic 5 but Topic 1.

Example 1: [image]

Example 2: [image]

Is this a known bug, or is this normal behaviour? Can somebody help?

By default, pyLDAvis sorts the topics by topic proportion. To keep the original order, pass sort_topics=False to pyLDAvis.prepare(). Note that the pyLDAvis topics will still be off by one (i.e., Topic 1 in pyLDAvis will be topic 0 from gensim).
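To make the relabelling concrete, here is a minimal sketch in plain Python that mimics the ordering rule described above: largest topic proportion first, labels starting at 1. The function name pyldavis_labels and the proportion values are hypothetical, made up for illustration; this is not pyLDAvis code. With a gensim model, the fix itself would presumably be a call along the lines of pyLDAvis.gensim.prepare(model, corpus, dictionary, sort_topics=False), assuming corpus and dictionary are the objects the model was trained on (newer pyLDAvis releases expose the module as pyLDAvis.gensim_models).

```python
def pyldavis_labels(topic_proportions, sort_topics=True):
    """Map each 0-based gensim topic id to the 1-based label shown in the plot.

    Hypothetical helper that only imitates the ordering rule; it does not
    call pyLDAvis.
    """
    ids = list(range(len(topic_proportions)))
    if sort_topics:
        # Largest topic first; list.sort is stable, so ties keep their
        # original relative order in this sketch.
        ids.sort(key=lambda i: -topic_proportions[i])
    # Labels start at 1, which is where the remaining off-by-one comes from.
    return {gensim_id: rank + 1 for rank, gensim_id in enumerate(ids)}


# Made-up proportions for 7 topics; gensim topic 5 is the largest.
proportions = [0.15, 0.10, 0.20, 0.05, 0.12, 0.30, 0.08]

sorted_labels = pyldavis_labels(proportions)                      # default behaviour
plain_labels = pyldavis_labels(proportions, sort_topics=False)    # original order

print("gensim topic 5 with sorting:   ", sorted_labels[5])
print("gensim topic 5 without sorting:", plain_labels[5])
```

With these illustrative numbers, gensim topic 5 is shown as Topic 1 when sorting is on (the situation in the question) and as Topic 6 when it is off, so the only difference left is the shift by one.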

There is a similar question here: Is there any way to match Gensim LDA output with topics in pyLDAvis graph?

And an associated issue on the pyLDAvis repo: https://github.com/bmabey/pyLDAvis/issues/127
