
Averaging Vectors from Documents

How could I use the medium-sized spaCy model en_core_web_md to parse through a folder of documents, get an individual vector for each document, and then average them together?

import spacy
nlp = spacy.load("en_core_web_md")

First, load all the documents into a list using Python file I/O.

# Documents loaded into a Python list.
documents_list = ['Hello, world', 'Here are two sentences.']
# Iterate over each document and run it through the nlp pipeline.
for doc in documents_list:
    doc_nlp = nlp(doc)
    # doc_nlp.vector is the average of the token vectors in the document.
    print(doc_nlp.vector)
    for token in doc_nlp:
        # The text and vector of each individual token.
        print(token.text, token.vector)

Let me know if you need any clarification.
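To cover the folder-reading and final averaging steps of the question, here is a minimal sketch. It creates a temporary folder with sample files as a stand-in for your real document folder, and uses small placeholder 3-dimensional vectors in place of the 300-dimensional arrays that `nlp(text).vector` would return with en_core_web_md loaded, so the averaging logic itself is clear and self-contained:

```python
import pathlib
import tempfile

import numpy as np

# Stand-in for a real folder of .txt documents.
tmp = pathlib.Path(tempfile.mkdtemp())
(tmp / "a.txt").write_text("Hello, world")
(tmp / "b.txt").write_text("Here are two sentences.")

# Read every .txt file in the folder into a list of strings.
documents_list = [p.read_text() for p in sorted(tmp.glob("*.txt"))]

# With spaCy loaded as nlp = spacy.load("en_core_web_md"), each
# document's vector would be nlp(text).vector. Placeholder vectors
# are used here so the example runs without the model installed.
doc_vectors = [np.array([1.0, 2.0, 3.0]), np.array([3.0, 4.0, 5.0])]

# Stack into an (n_docs, dim) matrix and average over documents.
corpus_vector = np.vstack(doc_vectors).mean(axis=0)
print(corpus_vector)  # [2. 3. 4.]
```

With the real model, you would replace the placeholder list with `doc_vectors = [nlp(text).vector for text in documents_list]`; the averaging line stays the same.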

