Load pre-trained Spanish word vectors and then retrain them with custom sentences? (Gensim / FastText)

Question

I am trying to load pre-trained Spanish word vectors and then retrain them with custom sentences:

!pip install fasttext
import fasttext
import fasttext.util
# Download the pre-trained Spanish word vectors
fasttext.util.download_model('es', if_exists='ignore')  # Spanish
ft = fasttext.load_model('cc.es.300.bin')

But once I try to update the vocabulary, I get this AttributeError:

ft.build_vocab(sentences, update=True)
AttributeError: '_FastText' object has no attribute 'build_vocab'

Any advice?

Answer 1

The build_vocab() method is a step in the Gensim library's implementation of the FastText algorithm, not part of the original fasttext package from Facebook that you seem to be loading. (You're mixing code meant for two different libraries.)

If you switch to using Gensim code, rather than Facebook's implementation, you won't get that same error when trying to use build_vocab().
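As a rough sketch of what the Gensim route could look like, assuming Gensim 4.x (where the corpus keyword is corpus_iterable; 3.x releases called it sentences) and the cc.es.300.bin file from the question: the helper names retrain_facebook_model and tokenize, the whitespace tokenization, and the epoch count are illustrative assumptions, not a recommended recipe.

```python
def tokenize(lines):
    """Very naive tokenizer (an assumption): lowercase and split on whitespace."""
    return [line.lower().split() for line in lines if line.strip()]


def retrain_facebook_model(bin_path, new_sentences, epochs=5):
    """Load a Facebook-format .bin model with Gensim and continue training it.

    new_sentences must be a list of token lists, e.g. [["hola", "mundo"], ...].
    """
    # Imported here so this sketch can be read and imported without Gensim installed.
    from gensim.models.fasttext import load_facebook_model

    # Loads the full model (not just the vectors), so training can continue.
    model = load_facebook_model(str(bin_path))

    # Add words from the new sentences to the existing vocabulary.
    model.build_vocab(corpus_iterable=new_sentences, update=True)

    # Continue training on the new sentences only.
    model.train(
        corpus_iterable=new_sentences,
        total_examples=len(new_sentences),
        epochs=epochs,
    )
    return model
```

It might be called as retrain_facebook_model('cc.es.300.bin', tokenize(my_lines)), where my_lines is a hypothetical list of raw Spanish strings. Loading the full cc.es.300.bin model needs several gigabytes of RAM.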

Note, though, that what you're attempting, incremental retraining of an existing model, is an advanced/experimental technique that can easily backfire. So it's usually a bad idea to attempt without expertise & rigorous checks as to whether the extra complications are helping.
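One simple check, among many that would be needed, is to measure how far the vectors of words you care about moved during the update. The sketch below assumes you have copied those vectors into plain dicts (word to list of floats) before and after retraining; the names cosine and vector_drift are hypothetical, and a similarity score alone does not tell you whether the change helped your downstream task.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def vector_drift(before, after, words):
    """Map each word present in both snapshots to the cosine similarity
    between its old and new vector (values near 1.0 mean little movement)."""
    return {
        w: cosine(before[w], after[w])
        for w in words
        if w in before and w in after
    }
```

Words whose similarity drops sharply after the update are worth inspecting by hand, for example by comparing their nearest neighbours in the old and new models.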
