How can I set the random seed of a topic model using MALLET in gensim?

I have been trying to keep the output of topic modeling stable while using MALLET through gensim. I found out that MALLET can set a random seed, but I do not see any parameter in gensim to set it.

This has just been added to the ldamallet.py wrapper as the `random_seed` argument:

    def __init__(self, mallet_path, corpus=None, num_topics=100, alpha=50, id2word=None, workers=4, prefix=None,
                 optimize_interval=0, iterations=1000, topic_threshold=0.0, random_seed=0):
        """

        Parameters
        ----------
        mallet_path : str
            Path to the mallet binary, e.g. `/home/username/mallet-2.0.7/bin/mallet`.
        corpus : iterable of iterable of (int, int), optional
            Collection of texts in BoW format.
        num_topics : int, optional
            Number of topics.
        alpha : int, optional
            Alpha parameter of LDA.
        id2word : :class:`~gensim.corpora.dictionary.Dictionary`, optional
            Mapping between token ids and words from corpus, if not specified - will be inferred from `corpus`.
        workers : int, optional
            Number of threads that will be used for training.
        prefix : str, optional
            Prefix for produced temporary files.
        optimize_interval : int, optional
            Optimize hyperparameters every `optimize_interval` iterations
            (sometimes leads to a Java exception; set to 0 to switch off hyperparameter optimization).
        iterations : int, optional
            Number of training iterations.
        topic_threshold : float, optional
            Threshold of the probability above which we consider a topic.
        random_seed : int, optional
            Random seed to ensure consistent results, if 0 - use system clock.

        """
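
As a rough illustration of what the new argument is for, the sketch below builds a `mallet train-topics` command line that carries the seed in MALLET's `--random-seed` option. The helper name `mallet_train_command` and the file name `corpus.mallet` are made up for this example, and the real wrapper passes more options than shown here, so treat it as a simplified picture rather than the wrapper's actual code.

```python
import shlex


def mallet_train_command(mallet_path, corpus_file, num_topics=100,
                         iterations=1000, random_seed=0):
    """Build an illustrative `mallet train-topics` argument list.

    Hypothetical helper: it only shows where the seed ends up on the
    command line. A seed of 0 tells MALLET to use the system clock.
    """
    return [
        mallet_path, "train-topics",
        "--input", corpus_file,
        "--num-topics", str(num_topics),
        "--num-iterations", str(iterations),
        "--random-seed", str(random_seed),
    ]


# Example values only; the path follows the docstring's example location.
cmd = mallet_train_command(
    "/home/username/mallet-2.0.7/bin/mallet",
    "corpus.mallet",
    num_topics=20,
    random_seed=42,
)
print(shlex.join(cmd))
```

With the wrapper itself, the equivalent would be passing a non-zero seed to the constructor, along the lines of `LdaMallet(mallet_path, corpus=corpus, id2word=id2word, num_topics=20, random_seed=42)`, where `corpus` and `id2word` are your own bag-of-words corpus and dictionary.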

I had the same issue, and getting the latest version of gensim to work was a little bit tricky. As Chris said, the new version has this implemented, but running it was troublesome for me. Make sure to do the following, as you might still be using the old wrapper:

  1. conda install -c conda-forge gensim
  2. pip install --upgrade gensim

The second step is the one that does the job. In my case, just installing gensim with the first command did not update it to a version that includes the new wrapper.
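
To confirm that the upgrade actually took effect, one option is to inspect the installed wrapper's constructor for the `random_seed` parameter. The sketch below assumes a gensim release that still ships the wrapper at `gensim.models.wrappers` (the 3.x series does; later releases may not include it), and the helper names are made up for this example.

```python
import inspect


def supports_random_seed(wrapper_cls):
    """Return True if wrapper_cls.__init__ accepts a `random_seed` argument."""
    return "random_seed" in inspect.signature(wrapper_cls.__init__).parameters


def check_installed_wrapper():
    """Report (gensim version, seed support), or None if the wrapper is missing."""
    try:
        import gensim
        # Assumed import path for the MALLET wrapper (gensim 3.x).
        from gensim.models.wrappers import LdaMallet
    except ImportError:
        return None
    return gensim.__version__, supports_random_seed(LdaMallet)


print(check_installed_wrapper())
```

If this prints `None`, or a tuple ending in `False`, the environment is still resolving to a gensim build without the seeded wrapper.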

The following links have more information on this question:

Gensim Installation

Mallet Wrapper
