
Simple (working) handwritten digit recognition: how to improve it?

I just wrote this very simple handwritten digit recognition program. Here is an 8 KB archive with the following code and ten .PNG image files. It works: [image] is correctly recognized as [image].

In short, each digit in the database (50x50 pixels = 2500 coefficients) is summarized as a 10-coefficient vector by keeping its 10 largest singular values (see Low-rank approximation with SVD).

Then, to recognize a digit, we pick the database digit whose 10-coefficient vector is closest to it, i.e. we minimize the distance to the digits in the database.
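In symbols, the two steps above can be sketched as follows. The notation is introduced here for illustration only: $\sigma_i$ are the singular values of an image $M$ in decreasing order, $u_i$ and $v_i$ the corresponding singular vectors, $k$ indexes the database digits, and $q$ denotes the digit to be recognized.

```latex
M = U \Sigma V^{\top}, \qquad
M_{10} = \sum_{i=1}^{10} \sigma_i \, u_i v_i^{\top}, \qquad
\sigma_1 \ge \sigma_2 \ge \dots \ge 0

\hat{k} = \arg\min_{k \in \{0, \dots, 9\}}
          \sum_{i=1}^{10} \bigl( \sigma_i^{(k)} - \sigma_i^{(q)} \bigr)^2
```

Here $M_{10}$ is the rank-10 reconstruction that the code stores as 'reduced', and $\hat{k}$ is the index of the database digit that gets returned.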

import numpy as np
import matplotlib.pyplot as plt
from PIL import Image   # scipy.misc.imread was removed in SciPy 1.2; Pillow replaces it here

digits = []
for i in range(11):
    # convert('F') yields a grayscale float image, like misc.imread(..., flatten=True)
    M = np.asarray(Image.open(str(i) + '.png').convert('F'))
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    s[10:] = 0        # keep the 10 biggest singular values only, discard others
    S = np.diag(s)
    M_reduced = np.dot(U, np.dot(S, Vt))      # reconstruction of the image from the 10 biggest singular values
    digits.append({'original': M, 'singular': s[:10], 'reduced': M_reduced})

# each 50x50 pixels digit is summarized into a vector of 10 coefficients : the 10 biggest singular values s[:10]    

# 0.png to 9.png = all the digits (for machine training)
# 10.png = the digit to be recognized
toberecognizeddigit = digits[10]    
digits = digits[:10]

# we find the nearest neighbour by minimizing the squared Euclidean distance between the singular values of toberecognizeddigit and those of each digit in the database
recognizeddigit = min(digits, key=lambda d: np.sum((d['singular'] - toberecognizeddigit['singular'])**2))

plt.imshow(toberecognizeddigit['reduced'], interpolation='nearest', cmap=plt.cm.Greys_r)
plt.show()
plt.imshow(recognizeddigit['reduced'], interpolation='nearest', cmap=plt.cm.Greys_r)
plt.show()

Question:

The code works (you can run the code in the ZIP archive), but how can we improve it to get better results? (Mostly with math techniques, I imagine.)

For example, in my tests, 9 and 3 are sometimes confused with each other.

Digit recognition can be quite a difficult area, especially when the digits are written in very different or unclear ways. Many approaches have been tried to solve this problem, and entire competitions are dedicated to it. For an example, see Kaggle's digit recognizer competition, which is based on the well-known MNIST data set. In the forums there, you will find a lot of ideas and approaches to this problem, but I will give some quick suggestions.

A lot of people approach this as a classification problem. Possible algorithms for such problems include, for example, kNN, neural networks, or gradient boosting.
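As a rough illustration of the kNN idea, here is a minimal NumPy sketch. It assumes you have several labelled feature vectors per digit rather than a single image per digit as in the question. The function name knn_predict and the toy data are made up for this example; the labels 3 and 9 are only placeholders.

```python
import numpy as np

def knn_predict(train_X, train_y, query, k=3):
    """Return the majority label among the k training vectors closest to query."""
    # squared Euclidean distance from the query to every training vector
    dists = np.sum((train_X - query) ** 2, axis=1)
    # indices of the k smallest distances
    nearest = np.argsort(dists)[:k]
    # majority vote among the labels of those neighbours
    labels, counts = np.unique(train_y[nearest], return_counts=True)
    return labels[np.argmax(counts)]

# toy data (made up for illustration): two clusters of 2-D feature vectors
train_X = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                    [1.0, 1.0], [0.9, 1.1], [1.1, 0.9]])
train_y = np.array([3, 3, 3, 9, 9, 9])

prediction = knn_predict(train_X, train_y, np.array([0.95, 1.0]), k=3)
print(prediction)
```

With k=1 and one vector per digit, this reduces to the nearest-neighbour search already used in the question; a larger k only helps once there are several examples of each digit.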

However, the algorithm alone is generally not enough to get optimal classification rates. Another important way to improve your scores is feature extraction. The idea is to calculate features that make it possible to distinguish between different digits. Some example features for this dataset might include the number of colored pixels, or the width and height of the digits.
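The example features mentioned above could be computed along these lines. This is only a sketch: it assumes a grayscale image with dark ink on a light background and values in 0..255, and both the helper name simple_features and the threshold of 128 are arbitrary choices for illustration.

```python
import numpy as np

def simple_features(img, threshold=128):
    """Return [ink pixel count, bounding-box width, bounding-box height]."""
    ink = img < threshold                     # boolean mask of "colored" pixels
    if not ink.any():
        return np.array([0, 0, 0])            # blank image: no ink, no bounding box
    rows = np.flatnonzero(ink.any(axis=1))    # indices of rows that contain ink
    cols = np.flatnonzero(ink.any(axis=0))    # indices of columns that contain ink
    height = rows[-1] - rows[0] + 1
    width = cols[-1] - cols[0] + 1
    return np.array([ink.sum(), width, height])

# synthetic 50x50 image (made up): a vertical stroke, roughly like a "1"
img = np.full((50, 50), 255.0)
img[10:40, 24:27] = 0.0

features = simple_features(img)
print(features)
```

Features like these are cheap to compute and capture shape information that the singular values alone do not, although on their own they are far too coarse to separate all ten digits.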

Although the other algorithms might not be what you are looking for, adding more features may also improve the performance of the algorithm you are currently using.
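One possible way to add features to the existing nearest-neighbour search is to append them to the singular-value vector and rescale every column, so that the large leading singular values do not drown out the new features. The sketch below uses random placeholder arrays in place of the real s[:10] vectors and extra features; combine_features is a hypothetical helper, and standardization is just one reasonable scaling choice.

```python
import numpy as np

def combine_features(singular, extra):
    """Stack singular-value vectors with extra features and standardize each column."""
    X = np.hstack([singular, extra]).astype(float)
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0          # avoid dividing by zero for constant columns
    return (X - mean) / std, mean, std

rng = np.random.default_rng(0)
singular = rng.uniform(0, 1000, size=(10, 10))   # placeholder for the ten s[:10] vectors
extra = rng.uniform(0, 50, size=(10, 3))         # placeholder for e.g. pixel count, width, height

X, mean, std = combine_features(singular, extra)

# the query digit must be scaled with the same mean and std as the database
query_raw = np.hstack([singular[4], extra[4]])
query = (query_raw - mean) / std

# nearest neighbour on the combined, standardized vectors
best = int(np.argmin(np.sum((X - query) ** 2, axis=1)))
print(best)
```

Here the query is a copy of database entry 4, so the search returns index 4; with real images the query would come from the digit to be recognized.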
