
How to load a large dataset into a gensim word2vec model

I have multiple text files (around 40), and each file contains around 2000 articles (averaging 500 words each). Each article is a single line in its text file.

Because of memory limitations, I want to load these text files dynamically during training (perhaps with an iterator class?).

How do I proceed?

  • Train on each text file -> save the model -> load the model and continue training on the new data?
  • Or is there a way to do this automatically with an iterator class? (See the sketch after this list.)
  • Should I feed the model sentence by sentence, article by article, or text file by text file during training?
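For reference, here is a minimal sketch of the iterator-class approach, assuming gensim 4.x and a hypothetical directory `path/to/text/files` holding the 40 files, with one article per line. Word2Vec only needs an iterable that yields lists of tokens, and it iterates over the corpus several times (once to build the vocabulary, then once per epoch), so a restartable iterable class is used rather than a one-shot generator:

```python
import os
from gensim.models import Word2Vec
from gensim.utils import simple_preprocess


class ArticleCorpus:
    """Streams one tokenized article (one line of a file) at a time,
    so the full corpus is never held in memory."""

    def __init__(self, dirname):
        self.dirname = dirname

    def __iter__(self):
        for fname in sorted(os.listdir(self.dirname)):
            path = os.path.join(self.dirname, fname)
            with open(path, encoding="utf-8") as f:
                for line in f:               # each line is one article
                    yield simple_preprocess(line)


# 'path/to/text/files' is a placeholder; point it at your own directory.
sentences = ArticleCorpus("path/to/text/files")

# vector_size is the gensim 4.x name (it was 'size' in gensim 3.x).
model = Word2Vec(sentences=sentences, vector_size=100, window=5,
                 min_count=5, workers=4)
model.save("word2vec.model")
```

With this pattern there is no need to train file by file and reload the model; gensim streams the whole corpus through the iterator. gensim also ships the `LineSentence` and `PathLineSentences` helpers, which do essentially the same thing for one-sentence-per-line files, so they may fit this layout directly.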
