
Image perspective transform using Pillow

I am trying to draw the bounding box of some text on an image. The image is perspective-transformed with a given set of coefficients. The coordinates of the text before the transformation are known, and I want to calculate the coordinates of the text after the transformation.

To my understanding, if I apply the perspective transformation with the same coefficients used for the image transform to the text coordinates, I should get the coordinates of the text after the transformation. However, the text does not appear in the place where it is supposed to be.
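For reference, the point mapping I have in mind is the usual 8-coefficient perspective mapping. A minimal sketch (the helper name apply_perspective is only for illustration, it is not a Pillow function):

# Minimal sketch: apply the 8 perspective coefficients (a..h) to a single point.
def apply_perspective(coeffs, x, y):
    a, b, c, d, e, f, g, h = coeffs
    denom = g * x + h * y + 1
    return (a * x + b * y + c) / denom, (d * x + e * y + f) / denom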

See the following images. Before the transformation:

The smaller white box bounds the text well because I know the coordinates of the text.

After the transformation:

The smaller white box does not bound the text because of some error in transforming the coordinates.

I followed the documentation reference for the coefficients of a perspective transformation and computed the coefficients of the image transformation using the following code (the code originates from this answer):

import numpy as np

def find_coeffs(pa, pb):
    '''
    Find the coefficients for a perspective transform.

    parameters:
        pa : vertices in the resulting plane
        pb : vertices in the current plane

    return:
        coeffs : 8-tuple
            coefficients for the PIL perspective transform
    '''
    matrix = []
    for p1, p2 in zip(pa, pb):
        matrix.append([p1[0], p1[1], 1, 0, 0, 0, -p2[0]*p1[0], -p2[0]*p1[1]])
        matrix.append([0, 0, 0, p1[0], p1[1], 1, -p2[1]*p1[0], -p2[1]*p1[1]])

    A = np.matrix(matrix, dtype=float)
    B = np.array(pb).reshape(8)
    # least-squares solution of A * coeffs = B
    res = np.dot(np.linalg.inv(A.T * A) * A.T, B)
    return np.array(res).reshape(8)
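As a quick sanity check of the helper, the call below fits the coefficients between two quadrilaterals; the corner coordinates are made up purely for illustration:

# Illustrative call only; these corner points are invented for the example.
coeffs = find_coeffs(
    [(0, 0), (100, 0), (100, 100), (0, 100)],  # pa: vertices in the resulting plane
    [(10, 5), (90, 10), (95, 95), (5, 90)]     # pb: vertices in the current plane
)
print(coeffs)  # the 8 coefficients a, b, c, d, e, f, g, h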

My code for the text bounding box transformation:

    # perspective transformation
    a, b, c, d, e, f, g, h = coeffs
    # return two vertices defining the bounding box

    new_x0 = float(a * new_x0 - b * new_y0 + c) / float(g * new_x0 + h * new_y0 + 1)
    new_y0 = float(d * new_x0 + e * new_y0 + f) / float(g * new_x0 + h * new_y0 + 1)
    new_x1 = float(a * new_x1 - b * new_y1 + c) / float(g * new_x1 + h * new_y1 + 1)
    new_y1 = float(d * new_x1 + e * new_y1 + f) / float(g * new_x1 + h * new_y1 + 1) 

I also looked at the Pillow GitHub repository, but I could not find the source code where the perspective transformation is defined.

Some more information about the math of perspective transformations: The Geometry of Perspective Drawing on the Computer.

Thanks.

To compute where a point lands after the transformation, you need the coefficients of the direct mapping (source corners -> destination corners), not the inverse mapping that the PIL library expects for image.transform(). For example:

# B1..B4 are corner points in the source image,
# A1..A4 are the corresponding corner points in the output image.

# direct transform (source -> destination), used to map individual points
coefs = find_coefs([B1, B2, B3, B4], [A1, A2, A3, A4])

# inverse transform (destination -> source), passed to image.transform()
coefs_inv = find_coefs([A1, A2, A3, A4], [B1, B2, B3, B4])

Call the image.transform() function with coefs_inv, but compute the new points with coefs; this gives something like the following:

img = image.transform((1500, 800),
                      method=Image.PERSPECTIVE,
                      data=coefs_inv)

a, b, c, d, e, f, g, h = coefs

old_p1 = [50, 100]
x,y = old_p1
new_x = (a * x + b * y + c) / (g * x + h * y + 1)
new_y = (d * x + e * y + f) / (g * x + h * y + 1)
new_p1 = (int(new_x),int(new_y))

old_p2 = [400, 500]
x,y = old_p2
new_x = (a * x + b * y + c) / (g * x + h * y + 1)
new_y = (d * x + e * y + f) / (g * x + h * y + 1)
new_p2 = (int(new_x),int(new_y))

(PIL perspective transform result)
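As a quick check that the directions are right (using the same corner points as in the full code below), applying coefs to the first source corner should land approximately on its destination corner:

# Sanity check: coefs was fitted to map (867, 652) -> (700, 732),
# so this should print values close to 700 and 732.
x, y = 867, 652
check_x = (a * x + b * y + c) / (g * x + h * y + 1)
check_y = (d * x + e * y + f) / (g * x + h * y + 1)
print(round(check_x), round(check_y))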

Full code below:

from PIL import Image
import numpy as np
import matplotlib.pyplot as plt


def find_coefs(original_coords, warped_coords):
        matrix = []
        for p1, p2 in zip(original_coords, warped_coords):
            matrix.append([p1[0], p1[1], 1, 0, 0, 0, -p2[0]*p1[0], -p2[0]*p1[1]])
            matrix.append([0, 0, 0, p1[0], p1[1], 1, -p2[1]*p1[0], -p2[1]*p1[1]])

        A = np.matrix(matrix, dtype=float)
        B = np.array(warped_coords).reshape(8)

        res = np.dot(np.linalg.inv(A.T * A) * A.T, B)
        return np.array(res).reshape(8)


coefs = find_coefs(
                  [(867,652), (1020,580), (1206,666), (1057,757)],
                  [(700,732), (869,754), (906,916), (712,906)]
                  )

coefs_inv = find_coefs(
                  [(700,732), (869,754), (906,916), (712,906)],
                  [(867,652), (1020,580), (1206,666), (1057,757)]
                  )

image = Image.open('sample.png')

img = image.transform((1500, 800),
                      method=Image.PERSPECTIVE,
                      data=coefs_inv)

a, b, c, d, e, f, g, h = coefs

old_p1 = [50, 100]
x,y = old_p1
new_x = (a * x + b * y + c) / (g * x + h * y + 1)
new_y = (d * x + e * y + f) / (g * x + h * y + 1)
new_p1 = (int(new_x),int(new_y))

old_p2 = [400, 500]
x,y = old_p2
new_x = (a * x + b * y + c) / (g * x + h * y + 1)
new_y = (d * x + e * y + f) / (g * x + h * y + 1)
new_p2 = (int(new_x),int(new_y))



plt.figure()
plt.imshow(image)
plt.scatter([old_p1[0], old_p2[0]], [old_p1[1], old_p2[1]], s=150, marker='.', c='b')
plt.show()


plt.figure()
plt.imshow(img)
plt.scatter([new_p1[0], new_p2[0]], [new_p1[1], new_p2[1]], s=150, marker='.', c='r')

plt.show()
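If the goal is to draw the transformed text bounding box itself (as in the question), one possible follow-up, assuming new_p1 ends up as the top-left corner and new_p2 as the bottom-right corner of the box, is:

# Possible follow-up (assumes new_p1 is top-left and new_p2 bottom-right):
# draw the transformed bounding box on the warped image.
from PIL import ImageDraw

draw = ImageDraw.Draw(img)
draw.rectangle([new_p1, new_p2], outline='white', width=3)
img.save('sample_with_box.png')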
