
Predicting numbers using sklearn digits dataset - wrong predictions

I want to build a simple digit prediction model.

My steps are:

  1. Load the sklearn digits dataset.
  2. Upscale the images from 8*8 to 32*32.
  3. Train an SVM on the upscaled sklearn digits.
  4. Predict the digit in a new image.

The model returns 8 or 1 for most of the test images. Do I have a mistake in the code?

The image is the following:

[image: the query digit]

The code I use is:

def predictimage(file):

import matplotlib.pyplot as plt
from skimage import transform
from PIL import Image
import pandas as pd
import numpy as np
from sklearn import svm
from sklearn import datasets
import PIL.ImageOps


#Load in the query instance


img= Image.open(file)
img=img.convert("L")
img=PIL.ImageOps.invert(img)
img=img.resize((32,32),Image.ANTIALIAS)
imgplot=plt.imshow(img)


query=np.array(img).flatten()
query=(query/16).round()



#Plot query digit
plt.imshow(query.reshape((32,32)))


#Load in the training dataset

digits=datasets.load_digits()
features=digits.data
targets=digits.target





#Expand 8*8 image to a 32*32 image (64 to 1024)
newfeatures=[transform.resize(features[i].reshape(8,8),(32,32))for i in range(len(features))]
newfeatures=np.array(newfeatures).reshape((1797,1024)).round()

#Plot expanded image with 32*32 pixels
for l in range(9):
    ax[1+l].imshow(newfeatures[100+l].reshape((32,32)).round())



#Instantiate, Train and predict    
clf=svm.SVC(gamma=0.001,C=100)
clf.fit(newfeatures,targets)

prediction=clf.predict(query)

plt.show()
return prediction



predictimage(r"C:\...\digit.jpg")

array([8])
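
The rescaling step `query/16` in the code above can be checked in isolation. This is a minimal sketch, assuming the query is an 8-bit grayscale image (values 0..255) and that the training pixels lie in 0..16, as they do in `load_digits()`. The helper name `to_digits_scale` and the sample array are hypothetical and not part of the original post.

```python
import numpy as np

def to_digits_scale(gray):
    """Map 8-bit grayscale values (0..255) onto the 0..16 range of load_digits()."""
    gray = np.asarray(gray, dtype=float)
    return (gray / 16).round()

# Hypothetical 2x2 "image" with black, dark, mid-gray and white pixels.
sample = np.array([[0, 64], [128, 255]], dtype=np.uint8)
scaled = to_digits_scale(sample)
print(scaled)
```

If the query's scaled values fall outside the range of the training features, or the digit is dark on a light background while the training digits are light on dark, the classifier sees inputs unlike anything it was trained on.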

You need to indent your code:

from matplotlib import pyplot as plt 
from skimage import transform
from PIL import Image
import pandas as pd
import numpy as np
from sklearn import svm
from sklearn import datasets
import PIL.ImageOps

def predictimage(file):

    #Load in the query instance
    img = Image.open(file)
    img =img.convert("L")
    img =PIL.ImageOps.invert(img)
    img =img.resize((32,32),Image.LANCZOS)  # Image.ANTIALIAS was removed in Pillow 10; LANCZOS is the same filter
    imgplot =plt.imshow(img)


    query=np.array(img).flatten()
    query=(query/16).round()



    #Plot query digit
    plt.imshow(query.reshape((32,32)))

    #Load in the training dataset

    digits=datasets.load_digits()
    features=digits.data
    targets=digits.target





    #Expand 8*8 image to a 32*32 image (64 to 1024)
    newfeatures=[transform.resize(features[i].reshape(8,8),(32,32))for i in range(len(features))]
    newfeatures=np.array(newfeatures).reshape((1797,1024)).round()

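    #Plot expanded image with 32*32 pixels
    #(ax was never defined; ten axes are created so that indices 1..9 exist)
    fig, ax = plt.subplots(1, 10)
    for l in range(9):
        ax[1+l].imshow(newfeatures[100+l].reshape((32,32)).round())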



    #Instantiate, Train and predict    
    clf=svm.SVC(gamma=0.001,C=100)
    clf.fit(newfeatures,targets)

    # predict expects a 2D array of shape (n_samples, n_features)
    prediction=clf.predict(query.reshape(1, -1))

    plt.show()
    return prediction
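
Beyond indentation, it may help to test the training pipeline separately from the query image. The sketch below holds out a quarter of the digits and measures accuracy on the upscaled data. It is only an illustration under stated assumptions: it upscales with nearest-neighbour pixel repetition via `np.kron` rather than the interpolating `transform.resize` used in the post, and the helper name `upscale`, the 25% split and `random_state=0` are arbitrary choices.

```python
import numpy as np
from sklearn import datasets, svm
from sklearn.model_selection import train_test_split

def upscale(images, factor=4):
    """Nearest-neighbour upscaling: repeat every pixel `factor` times on both axes."""
    block = np.ones((factor, factor))
    return np.array([np.kron(img, block) for img in images])

digits = datasets.load_digits()
big = upscale(digits.images)       # shape (1797, 32, 32)
X = big.reshape(len(big), -1)      # shape (1797, 1024)

X_train, X_test, y_train, y_test = train_test_split(
    X, digits.target, test_size=0.25, random_state=0
)

clf = svm.SVC(gamma=0.001, C=100)  # same settings as in the post
clf.fit(X_train, y_train)

accuracy = clf.score(X_test, y_test)
print(f"held-out accuracy: {accuracy:.3f}")

# A single sample must be passed as a 2D array of shape (1, n_features).
single = clf.predict(X_test[0].reshape(1, -1))
print(single)
```

If the held-out accuracy is low, the upscaling step or the `gamma` value may be the culprit, since `gamma=0.001` is usually quoted for the original 64 features. If it is high, the mismatch more likely lies in how the query image is preprocessed.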
