
How to initialize LDA with a set of seed words using the gensim package

I have read several papers that initialize the parameters of LDA using a seed set of words. Does anyone know how to do this in the gensim package?

For the sake of completeness, here is the reply from the gensim mailing list, copied and pasted:

Seeding with existing documents was a part of gensim some versions back, in the code that was directly ported from LDA-C. It was meant to improve convergence (although the final, converged result was the same).

In recent versions, that code was replaced by a more efficient algorithm which doesn't use seeding anymore. It is an online (mini-batch) algorithm, so you could say it does "seeding" automatically, in a more principled manner.
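The mailing list reply does not describe a replacement for seeding. One workaround that may be worth trying, and this is my own assumption rather than anything from the reply, is to bias the topic-word prior instead of the initialization: gensim's `LdaModel` accepts an `eta` argument that can be a matrix of shape (num_topics, num_terms). The sketch below builds such a matrix from seed words. The helper names `build_seeded_eta` and `train_seeded_lda` and the `base`/`boost` values are hypothetical and would need tuning.

```python
def build_seeded_eta(token2id, seed_topics, num_topics, base=0.01, boost=1.0):
    """Build a num_topics x num_terms prior matrix as a list of lists.

    token2id:    mapping from word to integer id (e.g. Dictionary.token2id)
    seed_topics: mapping from topic index to a list of seed words
    base, boost: illustrative prior weights, not recommended values
    """
    num_terms = len(token2id)
    # Start from a small symmetric prior for every (topic, word) pair.
    eta = [[base] * num_terms for _ in range(num_topics)]
    for topic_id, words in seed_topics.items():
        for word in words:
            word_id = token2id.get(word)
            if word_id is not None:  # skip seed words missing from the vocabulary
                eta[topic_id][word_id] = boost
    return eta


def train_seeded_lda(texts, seed_topics, num_topics):
    """Train an LdaModel whose topic-word prior favors the seed words."""
    # Imported lazily so build_seeded_eta stays usable without gensim installed.
    import numpy as np
    from gensim.corpora import Dictionary
    from gensim.models import LdaModel

    dictionary = Dictionary(texts)  # texts: list of token lists
    corpus = [dictionary.doc2bow(text) for text in texts]
    eta = np.asarray(build_seeded_eta(dictionary.token2id, seed_topics, num_topics))
    return LdaModel(corpus=corpus, id2word=dictionary, num_topics=num_topics,
                    eta=eta, passes=10, random_state=0)
```

Note that this only nudges each topic toward its seed words through the prior. It is not the same as the old LDA-C style seeding, and with enough data the model can still drift away from the seeds.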
