
Understanding handwritten digits by computer

I would like to ask you one question: I want to implement code which classifies a picture of a digit written by hand (with a pen). Consider the following image:

[image of a handwritten digit written in blue pen on lined paper]

It is written with a blue pen and should be converted to a grayscale image using the following code:

import cv2
import matplotlib.pyplot as plt
from PIL import Image

user_test = filename
col = Image.open(user_test)
gray = col.convert('L')
# Threshold at 100 and save as a 1-bit black-and-white image
bw = gray.point(lambda x: 0 if x < 100 else 255, '1')
bw.save("bw_image.jpg")

# Reload, invert (so the digit becomes white on black) and display
img_array = cv2.imread("bw_image.jpg", cv2.IMREAD_GRAYSCALE)
img_array = cv2.bitwise_not(img_array)
print(img_array.size)
plt.imshow(img_array, cmap=plt.cm.binary)
plt.show()

# Resize to the 28x28 input expected by the network
img_size = 28
new_array = cv2.resize(img_array, (img_size, img_size))
plt.imshow(new_array, cmap=plt.cm.binary)
plt.show()

The idea is that I take the image directly from the camera, but it loses the structure of the digit and comes out as an empty, black picture, like this:

[the resulting mostly black image with no visible digit]

Therefore the computer cannot tell which digit it is, and the neural network fails to predict its label correctly. Could you please tell me which transformation I should apply in order to detect this image more precisely?

Edit:

I have applied the following code:

import numpy as np
import matplotlib.pyplot as plt
from PIL import Image

user_test = filename
col = Image.open(user_test)
gray = col.convert('L')

# Histogram of the grayscale pixel intensities
img_array = np.asarray(gray)
plt.hist(img_array.flatten(), bins=np.arange(256))
plt.show()

and got:

[histogram of the grayscale pixel intensities]

You have several issues here, and you can methodically address them. First of all, you have an issue with thresholding properly.

As I suggested in earlier comments, you can easily see why your original thresholding was unsuccessful.

import matplotlib.pyplot as plt
import numpy as np
from PIL import Image
from matplotlib import cm

im = Image.open('whatever_path_you_choose.jpg').convert("L")
im = np.asarray(im)
plt.hist(im.flatten(), bins=np.arange(255));

Looking at the image you gave:

[histogram of the pixel intensities of the provided image]

Clearly the threshold should be somewhere between 100-200, not as in your original code. Also note that this distribution isn't very bimodal, so I'm not sure Otsu's method would work well here.
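If you want to check, skimage can compute Otsu's threshold directly; this is just a quick sanity check, assuming the im array from the snippet above:

from skimage.filters import threshold_otsu

t = threshold_otsu(im)               # the threshold Otsu's method picks from this histogram
print(t)
plt.imshow(im >= t, cmap=cm.gray)    # compare visually against the eyeballed threshold below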

If we eyeball it (this can be tuned), we can see that thresholding at 145-ish gives decent results in terms of segmentation.

im_thresh = (im >= 145)
plt.imshow(im_thresh, cmap=cm.gray)

[the image thresholded at 145]

Now you might have an additional issue: the horizontal lines. You can address this by writing on blank paper, as suggested. This wasn't exactly your question, but I will try to address it anyway (in a naive fashion). You can try a naive solution of using a Sobel filter (think of it as the derivative of the image, to pick out the lines), followed by a median filter to get the approximately most common pixel intensity - the size of the filter might have to vary for different digits, though. This should clear up some of the lines. For a more rigorous approach, try reading up on the Hough line transform for detecting horizontal lines and try to whiten them out; see the sketch after the naive result below.

This is my very naive approach:

from skimage.filters import sobel
from scipy.ndimage import median_filter
#Sobel filter reverses intensities so subtracting the result from 1.0 turns it back to the original
plt.imshow(1.0 - median_filter(sobel(im_thresh), [10, 3]), cmap=cm.gray)

[result after Sobel + median filtering]
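For the more rigorous Hough-line approach mentioned above, here is a rough sketch using skimage's probabilistic Hough transform; all parameter values are guesses and would need tuning for your images:

from skimage.transform import probabilistic_hough_line
from skimage.draw import line as draw_line

ink = ~im_thresh                              # True wherever there is ink or a ruled line
lines = probabilistic_hough_line(ink, threshold=10,
                                 line_length=ink.shape[1] // 2,   # only long, page-wide segments
                                 line_gap=5)

cleaned = im_thresh.copy()
for (x0, y0), (x1, y1) in lines:
    if abs(y1 - y0) <= 3:                     # keep only near-horizontal segments
        rr, cc = draw_line(y0, x0, y1, x1)    # endpoints come back as (x, y); draw wants (row, col)
        cleaned[rr, cc] = True                # whiten the detected line pixels

The detected segments are only one pixel wide here, so in practice you would whiten a small band around each (or dilate the detected lines) before going further.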

You can try cropping automatically afterwards. Honestly, I think most neural networks that can recognize MNIST-like digits could recognize the result I posted at the end as well.
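A minimal sketch of such automatic cropping, assuming a boolean image like im_thresh above (ink pixels False, paper True); the padding and resizing choices are just one reasonable option:

import numpy as np
from skimage.transform import resize

ink = ~im_thresh                               # True where the digit (and any leftover line pixels) is
rows, cols = np.any(ink, axis=1), np.any(ink, axis=0)
r0, r1 = np.where(rows)[0][[0, -1]]            # bounding box of the ink pixels
c0, c1 = np.where(cols)[0][[0, -1]]
digit = ink[r0:r1 + 1, c0:c1 + 1].astype(float)

# Pad to a square so the aspect ratio is preserved, then shrink to 28x28
h, w = digit.shape
side = max(h, w)
square = np.zeros((side, side))
square[(side - h) // 2:(side - h) // 2 + h,
       (side - w) // 2:(side - w) // 2 + w] = digit

mnist_like = resize(square, (28, 28), anti_aliasing=True)   # white digit on black, MNIST-style

Note that MNIST itself size-normalizes digits into a 20x20 box and centers them by center of mass inside the 28x28 frame, so matching that convention more closely may help the network further.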

Try using the skimage package like this. It has built-in functions for image processing:

from skimage import io
from skimage.restoration import denoise_tv_chambolle
from skimage.filters import threshold_otsu

# Load the image as a 2D grayscale array
image = io.imread('path/to/your/image', as_gray=True)

# Denoising (total variation); the image is single-channel, so no multichannel flag
denoised_image = denoise_tv_chambolle(image, weight=0.1)

# Thresholding with Otsu's method
threshold = threshold_otsu(denoised_image)
thresholded_image = denoised_image > threshold
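If you then want to feed the result to an MNIST-trained network, a minimal follow-up sketch (assuming the network expects a 28x28 white-on-black input, as the question's own code does) would be to invert and resize:

from skimage.transform import resize

# Invert so the digit is bright on a dark background (as in MNIST), then shrink to 28x28
mnist_input = resize((~thresholded_image).astype(float), (28, 28), anti_aliasing=True)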

