Numpy PIL Python: crop an image on whitespace, or crop text with histogram thresholds

Question

How would I go about finding the bounding box or window for the region of whitespace surrounding the numbers in the image below?

Original image:

[image]

Height: 762 pixels. Width: 1014 pixels.

Goal:

Something like `{x-bound:[x-upper,x-lower], y-bound:[y-upper,y-lower]}`, so I can crop to the text and feed it into tesseract or some other OCR.
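For the simplest case, where everything that is not near-white counts as text, bounds in the shape described above could be computed as in the sketch below. The function name `ink_bounds`, the threshold of 128, and the half-open `[start, stop]` convention are assumptions made for illustration; they do not come from the question.

```python
import numpy as np

def ink_bounds(gray, threshold=128):
    """Return the bounding box of pixels darker than `threshold`.

    `gray` is a 2-D uint8 array (rows = y, columns = x). The name, the
    threshold and the dict layout are illustrative assumptions.
    """
    ink = gray < threshold                  # True where a pixel counts as text
    rows = np.flatnonzero(ink.any(axis=1))  # rows that contain any ink
    cols = np.flatnonzero(ink.any(axis=0))  # columns that contain any ink
    if rows.size == 0:
        return None                         # blank image: nothing to bound
    return {
        "x-bound": [int(cols[0]), int(cols[-1]) + 1],
        "y-bound": [int(rows[0]), int(rows[-1]) + 1],
    }

# Synthetic 10x12 white page with a dark block at rows 3..5, columns 4..8
page = np.full((10, 12), 255, dtype=np.uint8)
page[3:6, 4:9] = 0
bounds = ink_bounds(page)
```

With PIL the box would then be passed as `im.crop((x0, y0, x1, y1))`, since `Image.crop` expects (left, upper, right, lower). This only handles a page where the text is the sole dark content; it does not isolate a whitespace region inside a busier image.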

Attempts:

I had thought of slicing the image into hard-coded chunk sizes and analysing them at random, but I think it would be too slow.

Example code using pyplot, adapted from "Using python and PIL how can I grab a block of text in an image?":

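from itertools import groupby

from PIL import Image
import numpy as np
import matplotlib.pyplot as plt

im = Image.open('/home/jmunsch/Pictures/Aet62.png')
p = np.array(im)
p = p[:, :, 0:3]        # keep the RGB channels, drop alpha if present
p = 255 - p             # invert so that dark text gives large values
lx, ly, lz = p.shape    # rows (height), columns (width), channels

plt.plot(p.sum(axis=1))   # ink per image row (one curve per channel)
plt.figure()
plt.plot(p.sum(axis=0))   # ink per image column (one curve per channel)

# I was thinking something like this.
# The image is a 3-dimensional ndarray indexed [row, column, channel].
# Set each value below an axis mean to 0
rows = p.sum(axis=(1, 2)).astype(float)
cols = p.sum(axis=(0, 2)).astype(float)
rows[rows < rows.mean()] = 0
cols[cols < cols.mean()] = 0

# and then some type of enumerated groupby for each axis,
# finding the mean index of each run of zeros (each valley)
def valley_centres(profile):
    centres, start = [], 0
    for is_zero, run in groupby(profile == 0):
        length = len(list(run))
        if is_zero:
            centres.append(start + length // 2)
        start += length
    return centres

# Assumes each profile has at least two valleys (e.g. the page margins)
mean_index1, mean_index2 = valley_centres(rows)[0], valley_centres(rows)[-1]
mean_index3, mean_index4 = valley_centres(cols)[0], valley_centres(cols)[-1]

plt.figure()
plt.imshow(p[mean_index1:mean_index2, mean_index3:mean_index4])
plt.show()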

Based on the graphs, each valley would indicate a place to put a bound.

The first graph shows where the lines of text are.
The second graph shows where the characters are.
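The valley idea can be sketched on a synthetic page as follows. The helper `ink_runs` and its `threshold` parameter are hypothetical names introduced here; on a real scan the threshold would probably need to sit above zero (for example at some fraction of the profile mean) to tolerate noise.

```python
import numpy as np

def ink_runs(profile, threshold=0):
    """Return (start, stop) index pairs for runs where profile > threshold."""
    on = np.asarray(profile) > threshold
    # Pad with False so runs touching either end still produce an edge
    padded = np.concatenate(([False], on, [False]))
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    # Edges alternate: run start, run stop, run start, run stop, ...
    return [(int(a), int(b)) for a, b in zip(edges[::2], edges[1::2])]

# Synthetic 12x10 white page with two "lines" of dark text
page = np.full((12, 10), 255, dtype=np.uint8)
page[2:4, 1:4] = 0     # first line, first "character"
page[2:4, 6:9] = 0     # first line, second "character"
page[7:10, 1:9] = 0    # second line

ink = 255 - page.astype(int)            # invert: text becomes large values
line_bands = ink_runs(ink.sum(axis=1))  # row profile: one band per text line
first = line_bands[0]
char_bands = ink_runs(ink[first[0]:first[1]].sum(axis=0))  # columns within line 1
```

Each band is a half-open index range, so `page[a:b]` selects one line, and the gaps between consecutive bands are the valleys seen in the plots.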

Plot example `plt.plot(p.sum(axis=1))`:

[image: plot of `p.sum(axis=1)`]

Plot example output `plt.plot(p.sum(axis=0))`:

[image: plot of `p.sum(axis=0)`]

Update: solution by HYRY

[image]

Answer 1

I think you can use the morphology functions in scipy.ndimage. Here is an example:

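import pylab as pl
import numpy as np
from scipy import ndimage

# PNG data is read as floats in [0, 1]; the uint8 cast keeps only pure white as 1
img = pl.imread("Aet62.png")[:, :, 0].astype(np.uint8)
# Opening (erode, then dilate) removes the small white regions
img2 = ndimage.binary_erosion(img, iterations=40)
img3 = ndimage.binary_dilation(img2, iterations=40)
# Keep the largest remaining white region: the whitespace around the numbers
labels, n = ndimage.label(img3)
counts = np.bincount(labels.ravel())
counts[0] = 0  # ignore the background label
img4 = labels == np.argmax(counts)
# Fill the holes that the digits leave in that region
img5 = ndimage.binary_fill_holes(img4)
# Non-white pixels inside the filled region are the text
result = ~img & img5
# A small opening removes leftover specks
result = ndimage.binary_erosion(result, iterations=3)
result = ndimage.binary_dilation(result, iterations=3)
pl.imshow(result, cmap="gray")
pl.show()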

The output is:

[image]
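The answer stops at a binary mask. One way to turn such a mask into the crop window the question asks for is `scipy.ndimage.find_objects`, sketched below on a synthetic stand-in for `result`. The helper name `mask_bounds` is an assumption for illustration and is not part of the answer.

```python
import numpy as np
from scipy import ndimage

def mask_bounds(mask):
    """Bounding (row_slice, col_slice) of the True pixels in `mask`, or None."""
    # Cast to int so every True pixel carries label 1; find_objects then
    # returns one bounding box covering all of them, connected or not.
    boxes = ndimage.find_objects(np.asarray(mask).astype(int))
    return boxes[0] if boxes else None

# Synthetic stand-in for `result`: two separate blobs of "text"
mask = np.zeros((20, 30), dtype=bool)
mask[5:9, 10:14] = True
mask[6:12, 18:25] = True

rows, cols = mask_bounds(mask)
# PIL's Image.crop expects (left, upper, right, lower)
crop_box = (cols.start, rows.start, cols.stop, rows.stop)
```

Passing `crop_box` to `Image.crop` on the original image would give the region to hand to the OCR step; adding a few pixels of margin around the box may help recognition.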

Answer 1: 5, accepted, 2014-07-11 11:57:31