SpaCy TextCategorizer Pipeline explained in detail

I'm currently working on an NLP project. While researching how to approach NLP, I found some articles about SpaCy. But because I'm still a newbie with Python, I don't understand how the SpaCy TextCategorizer pipeline works.

Is there a detailed explanation of how this pipeline works? Does the TextCategorizer pipeline use text feature extraction such as Bag of Words, TF-IDF, Word2Vec, or anything else? And what model architecture does the SpaCy TextCategorizer use? Could someone explain this to me?

There's a lot of info in the docs:

The model supports classification with multiple, non-mutually exclusive labels. You can change the model architecture rather easily, but by default, the TextCategorizer class uses a convolutional neural network to assign position-sensitive vectors to each word in the document. The TextCategorizer uses its own CNN model, to avoid sharing weights with the other pipeline components. The document tensor is then summarized by concatenating max and mean pooling, and a multilayer perceptron is used to predict an output vector of length nr_class, before a logistic activation is applied elementwise. The value of each output neuron is the probability that some class is present.
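To make that description concrete, here is a minimal sketch of adding the default TextCategorizer to a blank pipeline and reading its per-label probabilities from doc.cats. It assumes spaCy v2.x (the version the quoted documentation describes); the labels and the example sentence are made up for illustration, and the scores from an untrained model are not meaningful.

```python
import spacy

# Minimal sketch, assuming spaCy v2.x.
# The "textcat" pipe wraps the CNN + mean/max pooling + MLP model
# described above; the labels here are placeholders for illustration.
nlp = spacy.blank("en")
textcat = nlp.create_pipe("textcat")
textcat.add_label("POSITIVE")
textcat.add_label("NEGATIVE")
nlp.add_pipe(textcat)

# Initialize the model weights (normally you would train on labelled data).
nlp.begin_training()

# Each label gets an independent probability, since the classes
# are not mutually exclusive by default.
doc = nlp("This library is easy to pick up.")
print(doc.cats)  # e.g. {'POSITIVE': 0.51, 'NEGATIVE': 0.48} (untrained, so arbitrary)
```

After training on labelled examples, doc.cats holds the per-class probabilities produced by the logistic output layer described above.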
