SVM plot for a non-linear decision boundary

Question

I am trying to plot an SVM decision boundary that separates two classes, cancerous and non-cancerous. However, the plot I get is far from what I wanted. I wanted it to look like this:

or anything that shows the points are scattered. Here's my code:

import numpy as np
import pandas as pd
from sklearn import svm
from mlxtend.plotting import plot_decision_regions
import matplotlib.pyplot as plt

autism = pd.read_csv('predictions.csv')


# Fit Support Vector Machine Classifier
X = autism[['TARGET','Predictions']]
y = autism['Predictions']

clf = svm.SVC(C=1.0, kernel='rbf', gamma=0.8)
clf.fit(X.values, y.values) 

# Plot Decision Region using mlxtend's awesome plotting function
plot_decision_regions(X=X.values, 
                      y=y.values,
                      clf=clf, 
                      legend=2)

# Update plot object with X/Y axis labels and Figure Title
plt.xlabel(X.columns[0], size=14)
plt.ylabel(X.columns[1], size=14)
plt.title('SVM Decision Region Boundary', size=16)
plt.show()

But I got a weird-looking plot:

You can find the CSV file here: predictions.csv

Answer 1

You sound a little confused...

Your predictions.csv looks like:

TARGET  Predictions
     1  0
     0  0
     0  0
     0  0

and, as I guess the column names imply, it contains the ground truth (TARGET) and the Predictions of some (?) model that has already been run.

Given that, what you are doing in your posted code makes absolutely no sense at all: you are using both of these columns as features in your X in order to predict your y, which is... exactly one of these same columns (Predictions), already contained in your X...

Your plot looks "strange" simply because what you have plotted are not your data points, and the X and y data you show here are not the data that should be used for fitting your classifier.

I am further puzzled because, in your linked repo, you do indeed have the correct procedure in your script:
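To make the point above concrete, here is a minimal sketch using a made-up stand-in for predictions.csv (the values in `toy` are invented for illustration, not taken from your file). It shows that two binary columns can only ever give a handful of distinct points to plot, and that the label is literally one of the features:

```python
import pandas as pd

# Toy stand-in for predictions.csv; the values are invented for illustration.
toy = pd.DataFrame({
    "TARGET":      [1, 0, 0, 0, 1, 1, 0, 1],
    "Predictions": [0, 0, 0, 0, 1, 1, 1, 0],
})

# Same selection as in the question
X = toy[["TARGET", "Predictions"]]
y = toy["Predictions"]

# Two binary features can produce at most four distinct (x, y) locations,
# so a scatter plot can never look like a cloud of points.
distinct_points = X.drop_duplicates()
print(distinct_points)

# The target is identical to the second feature column.
label_is_a_feature = X["Predictions"].equals(y)
print("label is one of the features:", label_is_a_feature)
```

However many rows the file has, every one of them lands on one of those few grid corners, which is presumably why the plot looks nothing like a scatter of data points.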

autism = pd.read_csv('10-features-uns.csv')

x = autism.drop(['TARGET'], axis = 1)  
y = autism['TARGET']
x_train, X_test, y_train, y_test = train_test_split(x, y, test_size = 0.30, random_state=1)

i.e. reading your features and labels from 10-features-uns.csv, and certainly not from predictions.csv, as you are inexplicably trying to do here...
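For reference, a rough sketch of what the plotting step could look like once real features are used. I do not know the column names in 10-features-uns.csv, so this uses synthetic two-feature data from scikit-learn's `make_moons` as a stand-in; with your data you would pick two real-valued feature columns for X and use TARGET as y. It also draws the regions with a plain matplotlib `contourf` over a grid rather than mlxtend, just to keep the dependencies small:

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in (assumption): two continuous features, binary labels.
X, y = make_moons(n_samples=300, noise=0.25, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=1
)

# Same SVC settings as in the question, but fitted on features vs. labels
clf = SVC(C=1.0, kernel="rbf", gamma=0.8)
clf.fit(X_train, y_train)
test_accuracy = clf.score(X_test, y_test)
print("test accuracy:", test_accuracy)

# Evaluate the classifier on a dense grid covering the data range
pad = 0.5
xx, yy = np.meshgrid(
    np.linspace(X[:, 0].min() - pad, X[:, 0].max() + pad, 200),
    np.linspace(X[:, 1].min() - pad, X[:, 1].max() + pad, 200),
)
zz = clf.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)

# Shade the predicted regions and overlay the training points
fig, ax = plt.subplots()
ax.contourf(xx, yy, zz, alpha=0.3)
ax.scatter(X_train[:, 0], X_train[:, 1], c=y_train, edgecolors="k")
ax.set_xlabel("feature 1", size=14)
ax.set_ylabel("feature 2", size=14)
ax.set_title("SVM Decision Region Boundary", size=16)
plt.show()
```

Since a 2-D region plot only has two axes, a classifier trained on all ten features cannot be drawn this way directly; the model being plotted has to be fitted on the same two columns that appear on the axes.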

Answer 1 (score 2, accepted, 2019-04-20 13:32:37)