I am trying to fine-tune a pretrained FastText model for my problem using the gensim wrapper, but I am having problems. I load the model embeddings successfully from the .bin file like this:
from gensim.models.fasttext import FastText
model = FastText.load_fasttext_format(r_bin)
Nevertheless, I am struggling when I want to retrain the model using these three lines of code:
sent = [['i', 'am', 'interested', 'on', 'SPGB'], ['SPGB', 'is', 'a', 'good', 'choice']]
model.build_vocab(sent, update=True)
model.train(sentences=sent, total_examples=len(sent), epochs=5)
I get this error over and over, no matter what I change:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-91-6456730b1919> in <module>
1 sent = [['i', 'am', 'interested', 'on', 'SPGB'], ['SPGB' 'is', 'a', 'good', 'choice']]
----> 2 model.build_vocab(sent, update=True)
3 model.train(sentences=sent, total_examples = len(sent), epochs=5)
/opt/.../fasttext.py in build_vocab(self, sentences, update, progress_per, keep_raw_vocab, trim_rule, **kwargs)
380 return super(FastText, self).build_vocab(
381 sentences, update=update, progress_per=progress_per,
--> 382 keep_raw_vocab=keep_raw_vocab, trim_rule=trim_rule, **kwargs)
383
384 def _set_train_params(self, **kwargs):
/opt/.../base_any2vec.py in build_vocab(self, sentences, update, progress_per, keep_raw_vocab, trim_rule, **kwargs)
484 trim_rule=trim_rule, **kwargs)
485 report_values['memory'] = self.estimate_memory(vocab_size=report_values['num_retained_words'])
--> 486 self.trainables.prepare_weights(self.hs, self.negative, self.wv, update=update, vocabulary=self.vocabulary)
487
488 def build_vocab_from_freq(self, word_freq, keep_raw_vocab=False, corpus_count=None, trim_rule=None, update=False):
/opt/.../fasttext.py in prepare_weights(self, hs, negative, wv, update, vocabulary)
752
753 def prepare_weights(self, hs, negative, wv, update=False, vocabulary=None):
--> 754 super(FastTextTrainables, self).prepare_weights(hs, negative, wv, update=update, vocabulary=vocabulary)
755 self.init_ngrams_weights(wv, update=update, vocabulary=vocabulary)
756
/opt/.../word2vec.py in prepare_weights(self, hs, negative, wv, update, vocabulary)
1402 self.reset_weights(hs, negative, wv)
1403 else:
-> 1404 self.update_weights(hs, negative, wv)
1405
1406 def seeded_vector(self, seed_string, vector_size):
/opt/.../word2vec.py in update_weights(self, hs, negative, wv)
1452 self.syn1 = vstack([self.syn1, zeros((gained_vocab, self.layer1_size), dtype=REAL)])
1453 if negative:
-> 1454 self.syn1neg = vstack([self.syn1neg, zeros((gained_vocab, self.layer1_size), dtype=REAL)])
1455 wv.vectors_norm = None
1456
AttributeError: 'FastTextTrainables' object has no attribute 'syn1neg'
Thanks for your help in advance.
Thanks for the detailed code showing what you've tried & what error you hit.
Are you sure you're using the latest Gensim release, gensim-3.8.3? I can't reproduce the error using your code, with that Gensim.
Also: in gensim-3.8.3 you would be seeing a warning to the effect of:
DeprecationWarning: Call to deprecated 'load_fasttext_format' (use load_facebook_vectors (to use pretrained embeddings) or load_facebook_model (to continue training with the loaded full model, more RAM) instead).
(The deprecated method will just call load_facebook_model() for you, so using the older method wouldn't alone cause your issue – but your environment should be using the latest Gensim, and your code should be updated to call the preferred method.)
Note further:
As there are no new words in your tiny test text, the build_vocab(..., update=True) call isn't strictly necessary, nor is it doing anything relevant: the known vocabulary of your model is the same before and after. (Of course, if actual new sentences with new words were used, that would be different – but your tiny example isn't yet truly testing vocabulary expansion.)
And further:
This style of training some new data, or small number of new words, into an existing model is fraught with difficult tradeoffs.
In particular, to the extent your new data includes only your new words and a subset of the original model's words, only those new-data words will receive training updates, based on their new usages. This gradually pulls every word in your new training data to a new position. These new positions may become optimal for the new texts, but could be far – perhaps very far – from their old positions, where they were originally trained in the early model.
Thus, neither your new words nor the old words that have received new training will remain inherently comparable to any of the old words that weren't in your new data. Essentially, only words that train together are necessarily moved to usefully-contrasting positions.
So if your new data is large & varied enough to cover words needed for your application, training an all-new model may be both simpler and better. On the other hand, if your new data is thin, training just that tiny sliver of words/examples into the old model still risks pulling that sliver of words out of useful 'alignment' with older words.