简体   繁体   中英

Finding all the bounding rectangles of all non-transparent regions in PIL

I have a transparent-background image with some non-transparent text.

And I want to find all the bounding boxes of each individual word in the text.

Here is the code about creating a transparent image and draw some text ("Hello World", for example) , after that, do affine transform and thumbnail it.

from PIL import Image, ImageFont, ImageDraw, ImageOps
import numpy as np

fontcolor = (255,255,255)
fontsize  = 180
# padding rate for setting the image size of font
fimg_padding = 1.1
# check code bbox padding rate
bbox_gap = fontsize * 0.05
# Rrotation +- N degree

# Choice a font type for output---
font = ImageFont.truetype('Fonts/Bebas.TTF', fontsize)

# the text is "Hello World"
code = "Hello world"
# Get the related info of font---
code_w, code_h = font.getsize(code)

# Setting the image size of font---
img_size = int((code_w) * fimg_padding)

# Create a RGBA image with transparent background
img = Image.new("RGBA", (img_size,img_size),(255,255,255,0))
d = ImageDraw.Draw(img)

# draw white text
code_x = (img_size-code_w)/2
code_y = (img_size-code_h)/2
d.text( ( code_x, code_y ), code, fontcolor, font=font)
# img.save('initial.png')

# Transform the image---
img = img_transform(img)

# crop image to the size equal to the bounding box of whole text
alpha = img.split()[-1]
img = img.crop(alpha.getbbox())

# resize the image
img.thumbnail((512,512), Image.ANTIALIAS)

# img.save('myimage.png')

# what I want is to find all the bounding box of each individual word
boxes=find_all_bbx(img)

Here is the code about affine transform (provided here for those who want to do some experiment)

def find_coeffs(pa, pb):
    matrix = []
    for p1, p2 in zip(pa, pb):
        matrix.append([p1[0], p1[1], 1, 0, 0, 0, -p2[0]*p1[0], -p2[0]*p1[1]])
        matrix.append([0, 0, 0, p1[0], p1[1], 1, -p2[1]*p1[0], -p2[1]*p1[1]])

    A = np.matrix(matrix, dtype=np.float)
    B = np.array(pb).reshape(8)

    res = np.dot(np.linalg.inv(A.T * A) * A.T, B)
    return np.array(res).reshape(8)

def rand_degree(st,en,gap):
    return (np.fix(np.random.random()* (en-st) * gap )+st)

def img_transform(img):
    width, height = img.size
    print  img.size
    m = -0.5
    xshift = abs(m) * width
    new_width = width + int(round(xshift))
    img = img.transform((new_width, height), Image.AFFINE,
            (1, m, -xshift if m > 0 else 0, 0, 1, 0), Image.BICUBIC)

    range_n = width*0.2
    gap_n = 1

    x1 = rand_degree(0,range_n,gap_n)
    y1 = rand_degree(0,range_n,gap_n)

    x2 = rand_degree(width-range_n,width,gap_n)
    y2 = rand_degree(0,range_n,gap_n)

    x3 = rand_degree(width-range_n,width,gap_n)
    y3 = rand_degree(height-range_n,height,gap_n)

    x4 = rand_degree(0,range_n,gap_n)
    y4 = rand_degree(height-range_n,height,gap_n)

    coeffs = find_coeffs(
             [(x1, y1), (x2, y2), (x3, y3), (x4, y4)],
            [(0, 0), (width, 0), (new_width, height), (xshift, height)])

    img = img.transform((width, height), Image.PERSPECTIVE, coeffs, Image.BICUBIC)
    return img

How to implement find_all_bbx to find the bounding box of each individual word?

For example, one of the box can be found in 'H' ( you can download the image to see the partial result).

结果

For what you want to do you need to label the individual words and then compute the bounding box of each object with the same label. The most straigh forward approach here is just taking the min and max positions of the pixels that make up that word. The labeling is a little bit more difficult. For example you could use a morphological operation to combine the letters of the words (morphological opening, see PIL documentation ) and then use ImageDraw.floodfill . Or you could try to anticipate the positions of the words from the position where you first draw the text code_x and code_y and the chosen font and size of the letters and the spacing (this will trickier I think).

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM