How to initialize LDA with a set of seed words using the gensim package
I have read several papers that initialize the LDA parameters using a seed set of words. Does anyone know how to do this with the gensim package?
For the sake of completeness, here is the reply from the gensim mailing list, copied and pasted:
Seeding with existing documents was a part of gensim some versions back, in the code that was directly ported from LDA-C.
It was meant to improve convergence (although the final, converged result was the same).
In recent versions, that code was replaced by a more efficient algorithm which doesn't use seeding anymore.
It is an online (mini-batch) algorithm, so you could say it does "seeding" automatically, in a more principled manner.
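If you still want seed words to influence the topics, one commonly used workaround is to bias the topic-word prior rather than the initialization. Note that this is a different technique from the seeding code that was removed, and it is only a rough sketch: the helper `build_seeded_eta`, the toy vocabulary, and the `base`/`boost` values are all made up for illustration.

```python
def build_seeded_eta(vocab, seed_words, num_topics, base=0.01, boost=1.0):
    """Build a num_topics x len(vocab) topic-word prior as nested lists.

    vocab      -- list of words; a word's position is its id (hypothetical;
                  with gensim the ids would come from a Dictionary)
    seed_words -- dict mapping topic id -> iterable of seed words
    base       -- prior weight for ordinary (topic, word) pairs (assumed value)
    boost      -- prior weight for seeded (topic, word) pairs (assumed value)
    """
    word_id = {word: i for i, word in enumerate(vocab)}
    # Start from a flat prior for every topic.
    eta = [[base] * len(vocab) for _ in range(num_topics)]
    # Raise the prior for each seed word in its assigned topic.
    for topic, words in seed_words.items():
        for word in words:
            if word in word_id:  # silently skip seeds missing from the vocabulary
                eta[topic][word_id[word]] = boost
    return eta


# Toy example: two topics, each nudged toward two seed words.
vocab = ["goal", "match", "vote", "party", "team"]
seeds = {0: ["goal", "team"], 1: ["vote", "party"]}
eta = build_seeded_eta(vocab, seeds, num_topics=2)
```

gensim's `LdaModel` accepts an `eta` argument that may be a matrix of shape (num_topics, num_terms), so a matrix like this (converted with `numpy.array(eta)`) could be passed in as the prior. How strongly the seeds end up shaping the topics depends on the corpus and on the chosen weights, so treat the values above as placeholders to tune.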