How to get most frequent context words from pretrained fasttext model?
For example: For word 'football' and corpus ["I like playing football with my friends"]
Get list of context words: ['playing', 'with','my','like']
I try to use model_wiki = gensim.models.KeyedVectors.load_word2vec_format("wiki.ru.vec") model.most_similar("блок")
But it's not satisfied for me
The plain model doesn't retain any such co-occurrence statistics from the original corpus. It just has the trained results: vectors per word.
So, the ranked list of most_similar()
vectors – which isn't exactly words that appeared-together, but strongly correlates to that – is the best you'll get from that file.
Only going back to the original training corpus would give you exactly what you've requested.
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.