
How to extract vocabulary vectors from gensim's word2vec?

I want to analyze the vectors for patterns and then train an SVM on them for a supervised classification task between classes A and B. (I know it may sound odd, but it's our homework.) So I really need to know:

1- How do I extract the encoded vectors of a document using a trained model?

2- How should I interpret them, and how does word2vec encode them?

I'm using gensim's word2vec.

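  1. If you have a trained word2vec model, you can get a word's vector through the __getitem__ method, i.e. by indexing with the word. In gensim 4.0 and later the lookup goes through model.wv; indexing the model directly only works in older versions.

    import gensim

    model = gensim.models.Word2Vec(sentences)  # sentences: an iterable of token lists
    print(model.wv["some_word_from_dictionary"])  # older gensim: model["some_word_from_dictionary"]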

  2. Unfortunately, embeddings from word2vec/doc2vec are not directly interpretable by a person (in contrast to topic vectors from LdaModel).
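Although the individual dimensions carry no readable meaning, the relative positions of the vectors do: words used in similar contexts tend to end up close together. The sketch below illustrates this with cosine similarity on a toy dictionary. The three-dimensional vectors and the helper names (cosine_similarity, nearest) are made up for illustration and are not part of gensim; with a real model, model.wv.most_similar gives a comparable ranking.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(word, vectors):
    """Return (other_word, similarity) for the closest other word."""
    scored = [
        (other, cosine_similarity(vectors[word], vec))
        for other, vec in vectors.items()
        if other != word
    ]
    return max(scored, key=lambda pair: pair[1])


# Hypothetical toy embeddings standing in for a trained model's vectors.
toy_vectors = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.8, 0.9, 0.1],
    "apple": [0.1, 0.0, 0.9],
}

closest_word, similarity = nearest("king", toy_vectors)
print(closest_word, round(similarity, 3))
```

In this toy data "queen" comes out as the nearest neighbour of "king", which is the kind of pattern you can look for in real embeddings.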

P.S. If the objects in your task are whole texts, you should use the Doc2Vec model.
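If you want to stay with word2vec, one common baseline (not the Doc2Vec approach recommended above) is to average the word vectors of a document and feed that single vector to the SVM. A minimal sketch, assuming a lookup that supports "word in lookup" and "lookup[word]"; a plain dict is used here so the example runs without a trained model, and model.wv should behave the same way. The name document_vector and the toy vectors are hypothetical.

```python
def document_vector(tokens, lookup):
    """Average the vectors of the tokens that `lookup` knows about."""
    known = [lookup[token] for token in tokens if token in lookup]
    if not known:
        return None  # no in-vocabulary words, so no vector can be built
    size = len(known[0])
    return [sum(vec[i] for vec in known) / len(known) for i in range(size)]


# Hypothetical 3-dimensional vectors standing in for model.wv.
toy_vectors = {
    "cat": [1.0, 0.0, 2.0],
    "dog": [3.0, 2.0, 0.0],
}

doc = ["cat", "dog", "unicorn"]  # "unicorn" is out of vocabulary and is skipped
features = document_vector(doc, toy_vectors)
print(features)  # [2.0, 1.0, 1.0]
```

Each document then becomes one fixed-length feature row, which is the input shape an SVM classifier expects.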

