How does sklearn.pipeline work when done manually?
I am currently working with sklearn.pipeline, which is wonderful. Here is an example:
model = make_pipeline(TfidfVectorizer(), MultinomialNB())
model.fit(train.data, train.target)
labels = model.predict(test.data)
(*The data comes from train = fetch_20newsgroups(subset='train', categories=categories), with categories = ['talk.religion.misc', 'soc.religion.christian', 'sci.space', 'comp.graphics'].)
However, my understanding is still very vague. I would like to ask how this would look if we did it step by step, without the pipeline. Here is what I tried, but it failed.
from sklearn.datasets import fetch_20newsgroups
categories = ['talk.religion.misc', 'soc.religion.christian', 'sci.space', 'comp.graphics']
train = fetch_20newsgroups(subset='train', categories=categories)
from sklearn.feature_extraction.text import TfidfVectorizer
model1=TfidfVectorizer()
X=model1.fit_transform(train.data)
from sklearn.naive_bayes import MultinomialNB
model2=MultinomialNB
model2.fit(....)
At this point, I don't know what to do next, because the shape of X does not seem suitable for model2.
For further information, see the book at this link, page 406 of 548.
*** Please pardon my silly question. *** I know I can do this with a pipeline, but I just want to try doing it manually.
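You are almost there! You need to use MultinomialNB() instead of MultinomialNB. Without the parentheses, model2 is the class itself rather than an estimator instance, so it cannot be fitted. The shape of X is not the problem: the sparse matrix returned by fit_transform can be passed to MultinomialNB directly.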
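As a rough illustration of the difference between the class and an instance, here is a minimal sketch. The four short documents and their labels are made up for this example (they are not the 20 newsgroups data), and the names not_an_estimator, estimator, and class_call_failed are hypothetical. The exact exception raised by the failing call may differ between scikit-learn versions, so the sketch catches any exception rather than a specific one.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB

# Tiny made-up corpus, only for illustration (not the 20 newsgroups data).
docs = [
    "the rocket reached orbit",
    "the church held a service",
    "orbit and launch window",
    "prayer and service",
]
labels = [0, 1, 0, 1]

# X is a 2-D sparse matrix with one row per document and one column
# per vocabulary term, which MultinomialNB accepts directly.
X = TfidfVectorizer().fit_transform(docs)
print(X.shape)

not_an_estimator = MultinomialNB   # the class itself (the mistake in the question)
estimator = MultinomialNB()        # an instance that can be fitted

try:
    # X ends up in the position of `self`, so the call cannot succeed.
    not_an_estimator.fit(X, labels)
    class_call_failed = False
except Exception as exc:           # exact exception type varies by version
    class_call_failed = True
    print("calling fit on the class failed:", type(exc).__name__)

# The instance fits and predicts without complaint.
estimator.fit(X, labels)
print(estimator.predict(X))
```

So the fix is only the pair of parentheses; nothing about X needs to change.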
Try the following procedure.
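from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB

categories = ['talk.religion.misc', 'soc.religion.christian', 'sci.space', 'comp.graphics']
train = fetch_20newsgroups(subset='train', categories=categories)
# Assumption: the test split is loaded the same way as the training split
test = fetch_20newsgroups(subset='test', categories=categories)

# Step 1: learn the vocabulary and IDF weights, then vectorize the training text
model1 = TfidfVectorizer()
X = model1.fit_transform(train.data)

# Step 2: fit the classifier on the TF-IDF matrix
model2 = MultinomialNB()
model2.fit(X, train.target)

# Step 3: vectorize the test text with the fitted vectorizer (transform, not fit_transform)
model2.predict(model1.transform(test.data))
# array([2, 1, 1, ..., 2, 1, 1])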
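To check that the manual steps really do the same thing as the pipeline, one option is to run both on the same data and compare the predictions. The sketch below uses a small made-up corpus so it runs without downloading anything; the documents, the labels, and the variable names (pipe, vec, clf, and so on) are assumptions for illustration, not part of the original question.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Made-up toy data, standing in for train.data / train.target / test.data.
train_docs = [
    "the rocket reached orbit after launch",
    "the church held a sunday service",
    "graphics card renders the image",
    "satellite launch window and orbit",
    "prayer and faith in the church",
    "image rendering and graphics software",
]
train_labels = [0, 1, 2, 0, 1, 2]
test_docs = ["orbit of the satellite", "faith and prayer", "render an image"]

# Pipeline version: fit() and predict() chain the two steps internally.
pipe = make_pipeline(TfidfVectorizer(), MultinomialNB())
pipe.fit(train_docs, train_labels)
pipe_pred = pipe.predict(test_docs)

# Manual version: the same two steps written out explicitly.
vec = TfidfVectorizer()
X_train = vec.fit_transform(train_docs)     # fit + transform on training text
clf = MultinomialNB().fit(X_train, train_labels)
X_test = vec.transform(test_docs)           # transform only, reusing the fitted vocabulary
manual_pred = clf.predict(X_test)

# Both approaches should agree, since they perform the same computation.
print(np.array_equal(pipe_pred, manual_pred))
```

The point to notice is that the test text goes through transform, not fit_transform, so that it is mapped onto the vocabulary learned from the training text. That is also why X_test has the same number of columns as X_train.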