Scikit-learn DictVectorizer to Classifier

I am trying to load a dictionary, and then perform classification. However, I get the error:

  File "train_classifier.py", line 49, in <module>
    clf.fit(page_vecs.data[:-1],page_vecs.target[:-1])
  File "/usr/local/lib/python3.4/site-packages/scipy/sparse/base.py", line 505, in __getattr__
    raise AttributeError(attr + " not found")
AttributeError: target not found

How can I load the targets? Here is my code:

vec = DictVectorizer()
page_vecs = vec.fit_transform(feature_dict_list)
clf = svm.SVC(gamma=0.001, C=100)
clf.fit(page_vecs.data[:-1],page_vecs.target[:-1])
print(clf.predict(page_vecs[-1]))

Look at the DictVectorizer class, specifically its fit_transform method:

Returns:
Xa : {array, sparse matrix}

Feature vectors; always 2-d.

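So it returns a 2-d feature matrix: a NumPy array if you pass sparse=False, otherwise (the default) a SciPy sparse matrix. Your traceback goes through scipy/sparse/base.py, so you are getting the sparse matrix.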
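A quick way to check what fit_transform hands back is to inspect the result. The sketch below is only an illustration: the question does not show feature_dict_list, so the three dicts are made-up placeholders.

```python
from scipy import sparse
from sklearn.feature_extraction import DictVectorizer

# Hypothetical toy input; the real feature_dict_list is not shown in the question.
feature_dict_list = [
    {"title_len": 12, "lang": "en"},
    {"title_len": 30, "lang": "de"},
    {"title_len": 7, "lang": "en"},
]

vec = DictVectorizer()  # sparse=True is the default
page_vecs = vec.fit_transform(feature_dict_list)

print(sparse.issparse(page_vecs))    # True: a SciPy sparse matrix
print(page_vecs.shape)               # (3, 3): columns lang=de, lang=en, title_len
print(hasattr(page_vecs, "target"))  # False: there is no target attribute
print(page_vecs.data.ndim)           # 1: .data is the flat array of stored values
```

The string-valued "lang" entries are one-hot encoded by DictVectorizer, which is why two dict keys turn into three columns.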

In your code, this line:

page_vecs = vec.fit_transform(feature_dict_list)

makes page_vecs that 2-d feature matrix (a SciPy sparse matrix here, judging by the traceback). Neither a NumPy array nor a SciPy sparse matrix has a target attribute, which you try to use here:

clf.fit(page_vecs.data[:-1],page_vecs.target[:-1])

That is why you get the error. In fact, you shouldn't use .data either: on a sparse matrix it is the flat 1-d array of stored nonzero values, not the feature matrix. Work with the matrix directly. If you want to leave out the last row, do:

page_vecs[:-1, :]

Your labels (or targets) have nothing to do with the DictVectorizer class, which only vectorizes your samples, not your labels. You should keep a separate vector with one label per sample and pass it as the second argument to clf.fit.
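Putting the points above together, a minimal sketch of the corrected script might look like this. The feature dicts and the labels list are hypothetical stand-ins, since the question shows neither; only the SVC parameters are taken from the original code.

```python
from sklearn import svm
from sklearn.feature_extraction import DictVectorizer

# Hypothetical data: one feature dict and one label per page.
feature_dict_list = [
    {"word_count": 120, "has_form": 1},
    {"word_count": 900, "has_form": 0},
    {"word_count": 150, "has_form": 1},
    {"word_count": 1100, "has_form": 0},
    {"word_count": 130, "has_form": 1},
]
labels = ["login", "article", "login", "article", "login"]

vec = DictVectorizer()
page_vecs = vec.fit_transform(feature_dict_list)  # sparse matrix, one row per page

clf = svm.SVC(gamma=0.001, C=100)
# Train on every row except the last, using the matching labels.
clf.fit(page_vecs[:-1], labels[:-1])

# Predict the held-out last row; slicing keeps it 2-d (shape 1 x n_features).
prediction = clf.predict(page_vecs[-1])
print(prediction)
```

The labels stay in a plain Python list next to the feature matrix, and both are sliced the same way so that rows and labels stay aligned.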
