How to use the vocab of one Word2Vec model in another?
I have a Doc2Vec model and I want to create a Word2Vec model with a different dimension. How can I use the Doc2Vec model's vocab for fast training? Or is it even feasible to train like this? Does vocab building have any effect on training?
Vocab building is essentially just one pass over the entire dataset and doesn't impact the training time much (unless you are training over billions of words).
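To illustrate why the vocab pass is cheap, here is a minimal sketch of what such a pass amounts to: a single loop over the tokenized corpus that counts word frequencies and drops rare words. The function name `build_vocab`, the `min_count` parameter, and the toy corpus are assumptions made for illustration; this is not the actual Gensim or word2vec implementation.

```python
from collections import Counter

def build_vocab(sentences, min_count=1):
    """Count word frequencies in one pass over tokenized sentences (illustrative helper)."""
    counts = Counter()
    for tokens in sentences:
        counts.update(tokens)
    # Keep only words seen at least min_count times.
    return {word: count for word, count in counts.items() if count >= min_count}

# Toy corpus, assumed for illustration only.
corpus = [["the", "cat", "sat"], ["the", "dog", "sat"], ["the", "end"]]
vocab = build_vocab(corpus, min_count=2)
print(vocab)
```

The cost is linear in the number of tokens, whereas training typically makes several passes and does far more work per token, which is why skipping the vocab pass saves comparatively little.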
Gensim's Doc2Vec (to the best of my knowledge) doesn't currently allow creating models from a pre-defined vocabulary. If you are using Mikolov's code for sentence2vec ( https://groups.google.com/d/msg/word2vec-toolkit/Q49FIrNOQRo/J6KG8mUj45sJ ), it will allow you to save vocab and read from vocab:
word2vec -save-vocab <file>
word2vec -read-vocab <file>
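As a rough sketch of what saving and re-reading a vocab involves, the Python below writes word counts to a text file and loads them back. It assumes a plain layout of one `word count` pair per line; check the file your tool actually produces before relying on this. The helper names `save_vocab` and `read_vocab` are hypothetical and are not part of word2vec or Gensim.

```python
import os
import tempfile

def save_vocab(vocab, path):
    """Write one 'word count' pair per line, most frequent first (assumed layout)."""
    with open(path, "w", encoding="utf-8") as f:
        for word, count in sorted(vocab.items(), key=lambda kv: (-kv[1], kv[0])):
            f.write(f"{word} {count}\n")

def read_vocab(path):
    """Read 'word count' lines back into a dict, skipping malformed lines."""
    vocab = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rsplit(None, 1)  # split off the trailing count
            if len(parts) != 2:
                continue
            word, count = parts
            vocab[word] = int(count)
    return vocab

# Round trip through a temporary file with made-up counts.
original = {"the": 3, "sat": 2, "cat": 1}
with tempfile.TemporaryDirectory() as tmp:
    vocab_path = os.path.join(tmp, "vocab.txt")
    save_vocab(original, vocab_path)
    restored = read_vocab(vocab_path)
```

Reusing a saved vocab only skips the counting pass; the vectors themselves still have to be trained from scratch when the dimension changes.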