Classifier.predict in Python

I have this code in Python:

import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap


def plot_decision_regions(X, y, classifier, resolution=0.02):
    # set up the markers and the color map, one entry per class
    markers = ('s', 'x', 'o', '^', 'v')
    colors = ('red', 'blue', 'lightgreen', 'gray', 'cyan')
    cmap = ListedColormap(colors[:len(np.unique(y))])

    # build a grid covering the range of both features
    x1_min, x1_max = X[:, 0].min() - 1, X[:, 0].max() + 1
    x2_min, x2_max = X[:, 1].min() - 1, X[:, 1].max() + 1
    xx1, xx2 = np.meshgrid(np.arange(x1_min, x1_max, resolution),
                           np.arange(x2_min, x2_max, resolution))

    # predict a label for every grid point and shade the regions
    Z = classifier.predict(np.array([xx1.ravel(), xx2.ravel()]).T)
    Z = Z.reshape(xx1.shape)
    plt.contourf(xx1, xx2, Z, alpha=0.3, cmap=cmap)
    plt.xlim(xx1.min(), xx1.max())
    plt.ylim(xx2.min(), xx2.max())

    # plot the samples of each class with their own color and marker
    for idx, cl in enumerate(np.unique(y)):
        plt.scatter(x=X[y == cl, 0], y=X[y == cl, 1], alpha=0.8,
                    c=colors[idx], marker=markers[idx], label=cl,
                    edgecolor='black')

Here X is a 100x2 array of numeric data (sepal and petal length for two kinds of flowers), y is a 100x1 vector containing only the values -1 and 1 (the class labels), and classifier is a Perceptron. I don't know why I need to calculate the transpose in:

Z = classifier.predict(np.array([xx1.ravel(), xx2.ravel()]).T)

What does

classifier.predict 

and

x=X[y == cl, 0], y=X[y == cl, 1]

in

plt.scatter(x=X[y == cl, 0], y=X[y == cl, 1], alpha=0.8, c=colors[idx], marker=markers[idx], label=cl, edgecolor='black')

do?

Before that, I load a DataFrame, define my predict method, and define X and y:

def predict(self,X):
    '''Return class label after unit step'''
    return np.where(self.net_input(X) >= 0.0, 1, -1)

My Perceptron class contains w_ (the weights), which are adjusted on each iteration. Sorry if my English is not perfect.

y = df.iloc[0:100, 4].values
y = np.where(y == 'Iris-setosa', -1, 1)

X = df.iloc[0:100, [0, 2]].values

Let's break this down. First:

np.array([xx1.ravel(), xx2.ravel()])

.ravel() flattens the xx1 and xx2 arrays. xx1 and xx2 are just coordinates (for feature 1 and feature 2 respectively) arranged in a grid pattern. The idea is that xx1 and xx2 hold a coordinate at every resolution step across the range of the feature set. With enough of these coordinates, you can see which regions your classifier assigns to which label.
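To make the grid idea concrete, here is a minimal sketch on a deliberately tiny grid. The names f1, f2, flat1 and flat2 and the values are made up for illustration and are not part of the original code; only np.meshgrid and .ravel() come from it.

```python
import numpy as np

# Hypothetical tiny ranges: feature 1 takes 3 values, feature 2 takes 2 values.
f1 = np.arange(0.0, 3.0, 1.0)    # 0, 1, 2
f2 = np.arange(10.0, 12.0, 1.0)  # 10, 11

# meshgrid repeats the two ranges so that every (f1, f2) pair appears once.
xx1, xx2 = np.meshgrid(f1, f2)

# ravel() flattens each 2-D grid into a 1-D array, row by row.
flat1 = xx1.ravel()
flat2 = xx2.ravel()

print(xx1.shape)  # one row per f2 value, one column per f1 value
print(flat1)      # feature-1 coordinate of every grid point
print(flat2)      # feature-2 coordinate of every grid point
```

Pairing flat1[i] with flat2[i] gives the i-th grid point, which is why the two flattened arrays are combined in the next step.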

np.array([xx1.ravel(), xx2.ravel()]).T

You need the transpose because the .predict() method expects an input array of shape [n_samples, n_features]. Stacking the two ravelled arrays gives an array of shape [n_features, n_samples], so it has to be transposed.
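The shapes are easy to check directly. The sketch below uses a small made-up grid; the names stacked and grid_points are introduced here for illustration. np.column_stack is shown only as an equivalent way to build the same array, not as something the original code uses.

```python
import numpy as np

# Small hypothetical grid: 3 values for feature 1, 2 values for feature 2.
xx1, xx2 = np.meshgrid(np.arange(0.0, 3.0), np.arange(10.0, 12.0))

# Stacking the two flattened arrays puts the features on the first axis.
stacked = np.array([xx1.ravel(), xx2.ravel()])
print(stacked.shape)      # (n_features, n_samples)

# Transposing gives one row per grid point and one column per feature.
grid_points = stacked.T
print(grid_points.shape)  # (n_samples, n_features)

# np.column_stack builds the same (n_samples, n_features) array directly.
same = np.array_equal(grid_points,
                      np.column_stack([xx1.ravel(), xx2.ravel()]))
print(same)
```

Each row of grid_points is one (feature 1, feature 2) pair, which is the layout predict works with.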

classifier.predict(np.array([xx1.ravel(), xx2.ravel()]).T)

This makes a prediction for each of the meshgrid points. The predictions are then used to shade the plot, showing which regions the classifier assigns to which label.
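As a rough illustration of the predict-then-reshape step, the sketch below uses ToyClassifier, a hypothetical stand-in with a hard-coded rule rather than the Perceptron from the question. Any object with a predict method that takes an [n_samples, n_features] array would fit the same pattern.

```python
import numpy as np


class ToyClassifier:
    """Hypothetical stand-in: label 1 where feature 1 >= feature 2, else -1."""

    def predict(self, X):
        X = np.asarray(X)
        return np.where(X[:, 0] >= X[:, 1], 1, -1)


# Small made-up grid: 3 values for feature 1, 2 values for feature 2.
xx1, xx2 = np.meshgrid(np.arange(0.0, 3.0), np.arange(0.0, 2.0))
grid_points = np.array([xx1.ravel(), xx2.ravel()]).T

# One predicted label per grid point, as a flat array.
Z = ToyClassifier().predict(grid_points)

# Reshape back to the grid layout so each label lines up with xx1 and xx2.
Z_grid = Z.reshape(xx1.shape)
print(Z_grid)
```

Z_grid has the same shape as xx1 and xx2, which is the layout plt.contourf(xx1, xx2, Z, ...) works with when it shades the regions.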

plt.scatter(x=X[y == cl, 0], y=X[y == cl, 1], alpha=0.8, c=colors[idx], marker=markers[idx], label=cl, edgecolor='black')

Here, we plot our samples. We want to plot each class of samples separately (so that each gets a different color), so x=X[y == cl, 0] and y=X[y == cl, 1] select only the points whose label equals the one we are currently inspecting (i.e. cl). cl simply iterates over all the unique labels.
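The boolean indexing can be tried on its own. In this small sketch the four samples and their labels are made-up values, and the selected dictionary is introduced only to hold the results.

```python
import numpy as np

# Four made-up samples with two features each, and their class labels.
X = np.array([[5.1, 1.4],
              [7.0, 4.7],
              [4.9, 1.5],
              [6.4, 4.5]])
y = np.array([-1, 1, -1, 1])

selected = {}
for cl in np.unique(y):
    mask = (y == cl)  # boolean array: True where the label equals cl
    # X[mask, 0] keeps feature 1 of the matching rows, X[mask, 1] feature 2.
    selected[int(cl)] = (X[mask, 0], X[mask, 1])
    print(cl, X[mask, 0], X[mask, 1])
```

Each pass through the loop hands plt.scatter only the rows of one class, so every class can get its own color and marker.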

It's easier to understand once you see what the result looks like. Here's an example using a make_blobs dataset and an MLPClassifier:

import numpy as np
import matplotlib.pyplot as plt

from matplotlib.colors import ListedColormap
from sklearn.datasets import make_blobs
from sklearn.neural_network import MLPClassifier

def plot_decision_regions(X, y, classifier, resolution=0.02):
    markers = ('s', 'x', 'o', '^', 'v')
    colors = ('red', 'blue', 'lightgreen', 'gray', 'cyan')
    cmap = ListedColormap(colors[:len(np.unique(y))])

    x1_min, x1_max = X[:, 0].min() - 1, X[:, 0].max() + 1
    x2_min, x2_max = X[:, 1].min() - 1, X[:, 1].max() + 1
    xx1, xx2 = np.meshgrid(np.arange(x1_min, x1_max, resolution),
                           np.arange(x2_min, x2_max, resolution))
    Z = classifier.predict(np.array([xx1.ravel(), xx2.ravel()]).T)
    Z = Z.reshape(xx1.shape)
    plt.contourf(xx1, xx2, Z, alpha=0.3, cmap=cmap)
    plt.xlim(xx1.min(), xx1.max())
    plt.ylim(xx2.min(), xx2.max())

colors = ['red', 'blue', 'green']
X, y = make_blobs(n_features=2, centers=3)

# plot the samples of each class in its own color
for idx, cl in enumerate(np.unique(y)):
    plt.scatter(x=X[y == cl, 0], y=X[y == cl, 1], alpha=0.8, c=colors[idx], label=cl, edgecolor='black')

classifier = MLPClassifier()
classifier.fit(X, y)

plot_decision_regions(X, y, classifier, resolution=0.02)
plt.show()  # needed to display the figure when run as a script

You get: [image: the resulting decision-region plot]
