
Python - Remove black pixels originating from the border of an image

I am very new to image processing, and I am trying to clean pictures like picture 1 of the black pixels that originate from the border of the image.

[Image: character clipped using PyMuPDF]

The images are characters clipped from a PDF, which I try to process with tesseract to retrieve the character. I already searched Stack Overflow for answers, but only found solutions for getting rid of black borders. I need to overwrite all the black pixels from the corners with white pixels, so that tesseract can correctly recognize the character.

I cannot alter the bounding boxes used to clip the characters, since the characters are centered in different areas of the bounding box. If I shrank the bounding box, I would cut off parts of some characters, as seen below.

[Image: clipped image of the character seen above, with the BoundingBox adjusted to fit]

My first guess would have been to recursively track down pixels that exceed a certain threshold of black, but I am worried about the computing time in that case. I also wouldn't really know where and how to start, except for using two two-dimensional arrays: one with the pixels, and one indicating whether I have already processed each pixel.

Help would be greatly appreciated.

Edit: some more pictures of cases where black pixels from the edge need to be cleared:

[Five example images]

Edit: code snippet used to create the border image:

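    import cv2
    import numpy
    from PIL import Image

    @staticmethod
    def __get_border_image(image: Image.Image) -> Image.Image:
        data = numpy.asarray(image)

        # Pad 5 px on every side; with no `value` given, BORDER_CONSTANT pads with 0 (black).
        border = cv2.copyMakeBorder(data, top=5, bottom=5, left=5, right=5, borderType=cv2.BORDER_CONSTANT)

        return Image.fromarray(border)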

Try it like this:

  • artificially add a 1px wide black border all around the edge
  • flood-fill all black pixels with white, starting at the top-left corner
  • remove the 1px border from the first step (if necessary)

The point of adding the border is to allow the white to "flow" around all edges of the image and reach any black items touching the edge.
