
I used word2vec in deeplearning4j to train word vectors, but the vectors are unstable

I built a Maven project in IntelliJ IDEA. The code is as follows:

    System.out.println("Load data....");
    SentenceIterator iter = new LineSentenceIterator(new File("/home/zs/programs/deeplearning4j-master/dl4j-test-resources/src/main/resources/raw_sentences.txt"));
    iter.setPreProcessor(new SentencePreProcessor() {
        @Override
        public String preProcess(String sentence) {
            // Lowercase every sentence before tokenization
            return sentence.toLowerCase();
        }
    });
    System.out.println("Build model....");
    int batchSize = 1000;
    int iterations = 30;
    int layerSize = 300;
    com.sari.Word2Vec vec = new com.sari.Word2Vec.Builder()
            .batchSize(batchSize) // number of words per minibatch
            .sampling(1e-5) // subsampling threshold: randomly drops frequent words
            .minWordFrequency(5) // ignore words that occur fewer than 5 times
            .useAdaGrad(false) // do not use AdaGrad
            .layerSize(layerSize) // word feature vector size
            .iterations(iterations) // number of training iterations
            .learningRate(0.025) // initial learning rate
            .minLearningRate(1e-2) // floor for the learning rate, which decays with the number of words processed
            .negativeSample(10) // number of negative samples
            .iterate(iter) // sentence iterator that supplies the corpus
            .tokenizerFactory(tokenizer) // tokenizer is defined elsewhere (not shown here)
            .build();
    vec.fit();
    System.out.println("Evaluate model....");
    double cosSim = vec.similarity("day" , "night");
    System.out.println("Similarity between day and night: "+cosSim);

This code is based on the word2vec example in deeplearning4j, but the results are unstable: they differ widely from run to run. For example, the cosine similarity between 'day' and 'night' is sometimes as high as 0.98 and sometimes as low as 0.4.

Here are the results of two experiments:

Evaluate model....
Similarity between day and night: 0.706292986869812

Evaluate model....
Similarity between day and night: 0.5550910234451294
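
For reference, the number printed above is a cosine similarity between two word vectors. The sketch below shows that calculation on made-up three-dimensional vectors; the class name, method name, and vector values are hypothetical and are not taken from deeplearning4j or from the trained model.

```java
public class CosineSimilarityDemo {
    // Cosine similarity: dot(a, b) / (|a| * |b|). Assumes equal-length, non-zero vectors.
    static double cosine(double[] a, double[] b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Made-up vectors standing in for the learned "day" and "night" embeddings
        double[] day = {0.9, 0.1, 0.3};
        double[] night = {0.8, 0.2, 0.4};
        System.out.println("Similarity between day and night: " + cosine(day, night));
    }
}
```

Because the value depends only on the angle between the two vectors, any run-to-run change in the learned vectors shows up directly in this number.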

Why do the results vary like this? I have just started learning word2vec and there is a lot I don't understand yet, so I hope someone more experienced can help. Thanks!

You have set the following line:

    .minLearningRate(1e-2) // learning rate decays wrt # words. floor learning

But that is an extremely high value for the learning-rate floor: the initial rate is only 0.025, so the rate can never drop below 40% of where it started. With the learning rate held that high, the model never 'settles' into a stable state; instead, a few updates can significantly change the learned representation. That is not a problem during the first few updates, but it is bad for convergence.

Solution: allow the learning rate to decay. You can leave this line out completely or, if you must set it, use a more appropriate value, such as 1e-15.
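
To see why the floor matters, here is a minimal sketch of a learning rate that decays linearly with the number of words processed and is clamped at a minimum. The linear schedule, the class and method names, and the corpus size are assumptions made for illustration; the exact schedule deeplearning4j uses may differ. Only the values 0.025 and 1e-2 come from the question.

```java
public class LearningRateDecay {
    // Assumed schedule: linear decay with training progress, never below minRate.
    static double decayedRate(double initialRate, double minRate, long wordsSeen, long totalWords) {
        double progress = (double) wordsSeen / totalWords;
        return Math.max(minRate, initialRate * (1.0 - progress));
    }

    public static void main(String[] args) {
        long totalWords = 1_000_000L; // hypothetical corpus size
        double initialRate = 0.025;   // initial learning rate from the question
        // Compare the floor from the question (1e-2) with a much smaller one (1e-4)
        for (double floor : new double[] {1e-2, 1e-4}) {
            System.out.println("floor = " + floor);
            for (int pct = 0; pct <= 100; pct += 25) {
                long seen = totalWords * pct / 100;
                double rate = decayedRate(initialRate, floor, seen, totalWords);
                System.out.printf("  %3d%% of words seen -> rate %.6f%n", pct, rate);
            }
        }
    }
}
```

Under this assumed schedule, a floor of 1e-2 is reached once 60% of the words have been processed, and the rate stays there for the rest of training, whereas a much smaller floor lets the updates keep shrinking until the end.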
