Image perspective transform using Pillow

Question

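I am trying to draw a bounding box around text in an image. The image is perspective-transformed with a given set of coefficients. The coordinates of the text before the transform are known, and I want to compute its coordinates after the transform.

As I understand it, if I apply the perspective transform to the text coordinates using the same coefficients that were used for the image transform, I should get the coordinates of the text in the transformed image. However, the text does not appear where it is supposed to.

See the images below.

Before the transform, the smaller white box bounds the text well, because I know the coordinates of the text.

After the transform, the smaller white box no longer bounds the text, because of some error in how I transform the coordinates.

I followed the documentation reference for the perspective transform coefficients and used the following code to find the coefficients for the image transform. The code comes from this answer.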

import numpy as np

def find_coeffs(pa, pb):
    '''
    Find the coefficients for a perspective transform.

    parameters:
        pa : vertices in the resulting plane
        pb : vertices in the current plane

    return:
        coeffs : 8-tuple
          coefficients for PIL perspective transform
    '''
    # Two equations per point pair: one for x, one for y.
    matrix = []
    for p1, p2 in zip(pa, pb):
        matrix.append([p1[0], p1[1], 1, 0, 0, 0, -p2[0]*p1[0], -p2[0]*p1[1]])
        matrix.append([0, 0, 0, p1[0], p1[1], 1, -p2[1]*p1[0], -p2[1]*p1[1]])

    A = np.matrix(matrix, dtype=float)
    B = np.array(pb).reshape(8)
    # Least-squares solution via the normal equations.
    res = np.dot(np.linalg.inv(A.T * A) * A.T, B)
    return np.array(res).reshape(8)

My code for transforming the text bounding box:

    # perspective transformation
    a, b, c, d, e, f, g, h = coeffs
    # return two vertices defining the bounding box

    new_x0 = float(a * new_x0 - b * new_y0 + c) / float(g * new_x0 + h * new_y0 + 1)
    new_y0 = float(d * new_x0 + e * new_y0 + f) / float(g * new_x0 + h * new_y0 + 1)
    new_x1 = float(a * new_x1 - b * new_y1 + c) / float(g * new_x1 + h * new_y1 + 1)
    new_y1 = float(d * new_x1 + e * new_y1 + f) / float(g * new_x1 + h * new_y1 + 1)

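I also looked through the Pillow GitHub repository, but could not find the source code that defines the perspective transform.

More information on the mathematics of perspective transforms: The Geometry of Perspective Drawing on the Computer

Thanks.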

Answer 1

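To compute the new points after the transformation, you need the coefficients for the A -> B direction rather than B -> A, which is the direction the PIL library uses by convention (Image.transform takes coefficients that map each pixel of the output image back to a position in the source image). For example: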

# A1, B1 ... are points
# direct transform
coefs = find_coefs([B1, B2, B3, B4], [A1, A2, A3, A4])

# inverse transform
coefs_inv = find_coefs([A1, A2, A3, A4], [B1, B2, B3, B4])
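Since coefs and coefs_inv describe the same mapping in opposite directions, one can also be derived from the other instead of solving the system twice. The sketch below treats the 8 coefficients as a 3x3 matrix whose last entry is fixed to 1 and inverts it. The names invert_coefs and apply_coefs and the sample coefficient values are made up for illustration, and the sketch assumes the matrix is invertible and that the bottom-right entry of the inverse is nonzero.

```python
import numpy as np

def invert_coefs(coefs):
    """Return the 8 coefficients of the inverse perspective mapping."""
    a, b, c, d, e, f, g, h = coefs
    # The 8 coefficients form a 3x3 matrix with the last entry fixed to 1.
    H = np.array([[a, b, c], [d, e, f], [g, h, 1.0]], dtype=float)
    H_inv = np.linalg.inv(H)
    # The mapping is unchanged by scaling, so rescale the bottom-right
    # entry back to 1 and keep the first 8 entries.
    H_inv = H_inv / H_inv[2, 2]
    return H_inv.reshape(9)[:8]

def apply_coefs(coefs, x, y):
    """Map one point with the 8-coefficient perspective formula."""
    a, b, c, d, e, f, g, h = coefs
    w = g * x + h * y + 1
    return (a * x + b * y + c) / w, (d * x + e * y + f) / w

# Illustrative coefficients only; they do not come from a real image.
coefs = [1.2, 0.1, 30.0, -0.05, 0.9, 12.0, 1e-4, 2e-4]
coefs_inv = invert_coefs(coefs)

# Mapping a point forward and then back should return the starting point.
x, y = apply_coefs(coefs, 50, 100)
back = apply_coefs(coefs_inv, x, y)
```

With this approach, the coefficients passed to image.transform() and the ones used for points always stay consistent with each other.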

You call image.transform() with coefs_inv, but compute the new points with coefs, so you end up with something like this:

img = image.transform(((1500,800)),
                      method=Image.PERSPECTIVE,
                      data=coefs_inv)

a, b, c, d, e, f, g, h = coefs

old_p1 = [50, 100]
x,y = old_p1
new_x = (a * x + b * y + c) / (g * x + h * y + 1)
new_y = (d * x + e * y + f) / (g * x + h * y + 1)
new_p1 = (int(new_x),int(new_y))

old_p2 = [400, 500]
x,y = old_p2
new_x = (a * x + b * y + c) / (g * x + h * y + 1)
new_y = (d * x + e * y + f) / (g * x + h * y + 1)
new_p2 = (int(new_x),int(new_y))
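The per-point arithmetic above can be wrapped in a small helper so the formula is written only once. The snippet in the question subtracts the b term and reuses the already-updated x when computing y; a helper that reads both inputs before writing any output avoids that kind of slip. This is a minimal sketch: transform_points is a hypothetical name, and the coefficients are a placeholder (a pure shift), not values from a real image.

```python
import numpy as np

def transform_points(coefs, points):
    """Apply the 8-coefficient perspective mapping to a list of (x, y) points."""
    a, b, c, d, e, f, g, h = coefs
    pts = np.asarray(points, dtype=float)
    x, y = pts[:, 0], pts[:, 1]
    w = g * x + h * y + 1.0  # shared denominator
    # Both outputs are computed from the original x and y.
    new_x = (a * x + b * y + c) / w
    new_y = (d * x + e * y + f) / w
    return np.column_stack((new_x, new_y))

# Placeholder coefficients: identity mapping plus a shift of (10, 20).
shift = [1, 0, 10, 0, 1, 20, 0, 0]
new_pts = transform_points(shift, [[50, 100], [400, 500]])
```

The result is a float array; convert to int only at the point of drawing, if the drawing API requires it.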

The full code:

from PIL import Image
import numpy as np
import matplotlib.pyplot as plt


def find_coefs(original_coords, warped_coords):
    # Coefficients of the mapping original_coords -> warped_coords.
    # Each point pair contributes two equations: one for x, one for y.
    matrix = []
    for p1, p2 in zip(original_coords, warped_coords):
        matrix.append([p1[0], p1[1], 1, 0, 0, 0, -p2[0]*p1[0], -p2[0]*p1[1]])
        matrix.append([0, 0, 0, p1[0], p1[1], 1, -p2[1]*p1[0], -p2[1]*p1[1]])

    A = np.matrix(matrix, dtype=float)
    B = np.array(warped_coords).reshape(8)

    # Least-squares solution via the normal equations.
    res = np.dot(np.linalg.inv(A.T * A) * A.T, B)
    return np.array(res).reshape(8)


coefs = find_coefs(
                  [(867,652), (1020,580), (1206,666), (1057,757)],
                  [(700,732), (869,754), (906,916), (712,906)]
                  )

coefs_inv = find_coefs(
                  [(700,732), (869,754), (906,916), (712,906)],
                  [(867,652), (1020,580), (1206,666), (1057,757)]
                  )

image = Image.open('sample.png')

img = image.transform(((1500,800)),
                      method=Image.PERSPECTIVE,
                      data=coefs_inv)

a, b, c, d, e, f, g, h = coefs

old_p1 = [50, 100]
x,y = old_p1
new_x = (a * x + b * y + c) / (g * x + h * y + 1)
new_y = (d * x + e * y + f) / (g * x + h * y + 1)
new_p1 = (int(new_x),int(new_y))

old_p2 = [400, 500]
x,y = old_p2
new_x = (a * x + b * y + c) / (g * x + h * y + 1)
new_y = (d * x + e * y + f) / (g * x + h * y + 1)
new_p2 = (int(new_x),int(new_y))



plt.figure()
plt.imshow(image)
plt.scatter([old_p1[0], old_p2[0]],[old_p1[1], old_p2[1]]  , s=150, marker='.', c='b')
plt.show()


plt.figure()
plt.imshow(img)
plt.scatter([new_p1[0], new_p2[0]],[new_p1[1], new_p2[1]]  , s=150, marker='.', c='r')

plt.show()
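For the bounding box in the question, note that a perspective transform generally turns an axis-aligned rectangle into an arbitrary quadrilateral, so transforming only two opposite corners may not enclose the text. One option, sketched below under that assumption, is to transform all four corners and take the min and max. The name transformed_bbox is hypothetical, and the shear coefficients are a placeholder chosen to make the effect visible.

```python
def transformed_bbox(coefs, x0, y0, x1, y1):
    """Axis-aligned box enclosing the transformed rectangle (x0, y0)-(x1, y1)."""
    a, b, c, d, e, f, g, h = coefs
    corners = [(x0, y0), (x1, y0), (x1, y1), (x0, y1)]
    xs, ys = [], []
    for x, y in corners:
        w = g * x + h * y + 1.0
        xs.append((a * x + b * y + c) / w)
        ys.append((d * x + e * y + f) / w)
    return min(xs), min(ys), max(xs), max(ys)

# Placeholder coefficients: a horizontal shear, x' = x - 0.5 * y.
shear = [1, -0.5, 0, 0, 1, 0, 0, 0]
box = transformed_bbox(shear, 0, 0, 10, 10)
```

In this example, transforming only the corners (0, 0) and (10, 10) would give a box from (0, 0) to (5, 10), which misses part of the sheared rectangle; using all four corners gives a box from (-5, 0) to (10, 10).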
