
Averaging Vectors from Documents

How could I use the medium-sized spaCy model en_core_web_md to parse through a folder of documents, get an individual vector for each document, and then average them together?

import spacy
nlp = spacy.load("en_core_web_md")

First, load all the documents into a list using Python file I/O.

# Documents loaded into a Python list.
documents_list = ['Hello, world', 'Here are two sentences.']
# Iterate over each document and run it through the nlp pipeline.
for doc in documents_list:
    doc_nlp = nlp(doc)
    # doc_nlp.vector is the average of the token vectors in the document.
    print(doc_nlp.vector)
    for token in doc_nlp:
        # The text and vector of each individual token.
        print(token.text, token.vector)

Let me know if you need any clarification.
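To cover the folder-reading and final averaging steps of the question, here is a minimal sketch. It creates a temporary folder with sample files as a stand-in for your real document folder, and uses small placeholder 3-dimensional vectors in place of the 300-dimensional arrays that `nlp(text).vector` would return with en_core_web_md loaded, so the averaging logic itself is clear and self-contained:

```python
import pathlib
import tempfile

import numpy as np

# Stand-in for a real folder of .txt documents.
tmp = pathlib.Path(tempfile.mkdtemp())
(tmp / "a.txt").write_text("Hello, world")
(tmp / "b.txt").write_text("Here are two sentences.")

# Read every .txt file in the folder into a list of strings.
documents_list = [p.read_text() for p in sorted(tmp.glob("*.txt"))]

# With spaCy loaded as nlp = spacy.load("en_core_web_md"), each
# document's vector would be nlp(text).vector. Placeholder vectors
# are used here so the example runs without the model installed.
doc_vectors = [np.array([1.0, 2.0, 3.0]), np.array([3.0, 4.0, 5.0])]

# Stack into an (n_docs, dim) matrix and average over documents.
corpus_vector = np.vstack(doc_vectors).mean(axis=0)
print(corpus_vector)  # [2. 3. 4.]
```

With the real model, you would replace the placeholder list with `doc_vectors = [nlp(text).vector for text in documents_list]`; the averaging line stays the same.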

