Image perspective transform using Pillow
I am trying to draw the bounding box of some text on an image. The image is perspective-transformed with a given set of coefficients. The coordinates of the text before the transformation are known, and I want to calculate its coordinates after the transformation.
As I understand it, if I apply the perspective transformation to the text coordinates using the same coefficients that were used for the image transform, I should get the coordinates of the text after the transformation. However, the text does not appear where it is supposed to be.
Before the transformation, the smaller white box bounds the text well, because I know the coordinates of the text.
After the transformation, the smaller white box no longer bounds the text, because of some error in transforming the coordinates.
I followed the documentation reference for the coefficients of the perspective transformation and found the coefficients of the image transformation using the following code (the code originates from this answer):
import numpy as np

def find_coeffs(pa, pb):
    '''
    Find the coefficients for a perspective transform.
    Parameters:
        pa : vertices in the resulting plane
        pb : vertices in the current plane
    Returns:
        coeffs : 8-tuple
            coefficients for the PIL perspective transform
    '''
    matrix = []
    for p1, p2 in zip(pa, pb):
        matrix.append([p1[0], p1[1], 1, 0, 0, 0, -p2[0]*p1[0], -p2[0]*p1[1]])
        matrix.append([0, 0, 0, p1[0], p1[1], 1, -p2[1]*p1[0], -p2[1]*p1[1]])
    # np.float was removed in NumPy 1.24; the built-in float is equivalent
    A = np.matrix(matrix, dtype=float)
    B = np.array(pb).reshape(8)
    # least-squares solution of A * coeffs = B via the normal equations
    res = np.dot(np.linalg.inv(A.T * A) * A.T, B)
    return np.array(res).reshape(8)
My code for the text bounding box transformation:
# perspective transformation
a, b, c, d, e, f, g, h = coeffs
# return two vertices defining the bounding box
new_x0 = float(a * new_x0 - b * new_y0 + c) / float(g * new_x0 + h * new_y0 + 1)
new_y0 = float(d * new_x0 + e * new_y0 + f) / float(g * new_x0 + h * new_y0 + 1)
new_x1 = float(a * new_x1 - b * new_y1 + c) / float(g * new_x1 + h * new_y1 + 1)
new_y1 = float(d * new_x1 + e * new_y1 + f) / float(g * new_x1 + h * new_y1 + 1)
I also looked at the Pillow GitHub repository, but I could not find the source code where the perspective transformation is defined.
Some more information about the math of perspective transformation: The Geometry of Perspective Drawing on the Computer
Thanks.
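To compute where a point ends up after the transformation, you need the coefficients of the forward mapping (original image -> transformed image). image.transform() in PIL expects the opposite: coefficients that map each pixel of the transformed image back to a position in the original image. So you need both sets of coefficients. For example: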
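# A1..A4 are corner points in the transformed image,
# B1..B4 are the matching points in the original image
# direct transform: original -> transformed
coefs = find_coefs([B1, B2, B3, B4], [A1, A2, A3, A4])
# inverse transform: transformed -> original
coefs_inv = find_coefs([A1, A2, A3, A4], [B1, B2, B3, B4])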
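The two coefficient sets describe the same mapping in opposite directions, so one can also be derived from the other. The sketch below is only an illustration under stated assumptions: `invert_coefs` is a hypothetical helper (not part of PIL), `find_coefs` is rewritten here with `np.linalg.solve` for exactly four point pairs, and the point pairs are made up. It treats the eight coefficients as the 3x3 matrix [[a, b, c], [d, e, f], [g, h, 1]], inverts it, and rescales so the last entry is 1 again.

```python
import numpy as np

def find_coefs(src_pts, dst_pts):
    """Solve for the 8 coefficients that map four src_pts onto four dst_pts."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src_pts, dst_pts):
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.extend([u, v])
    return np.linalg.solve(np.array(rows, dtype=float),
                           np.array(rhs, dtype=float))

def invert_coefs(coefs):
    """Hypothetical helper: coefficients of the opposite mapping."""
    m = np.append(coefs, 1.0).reshape(3, 3)   # [[a, b, c], [d, e, f], [g, h, 1]]
    m_inv = np.linalg.inv(m)
    m_inv = m_inv / m_inv[2, 2]               # rescale so the last entry is 1
    return m_inv.ravel()[:8]

# Made-up point pairs, chosen only for this illustration.
original = [(0, 0), (200, 0), (200, 200), (0, 200)]
warped = [(20, 10), (180, 30), (190, 190), (10, 170)]

coefs = find_coefs(original, warped)       # original -> warped (for points)
coefs_inv = find_coefs(warped, original)   # warped -> original (for image.transform)

# Both routes to the inverse coefficients should agree up to rounding error.
print(np.allclose(invert_coefs(coefs), coefs_inv, rtol=1e-4, atol=1e-7))
```

Solving the system twice, as the answer does, is just as valid; inverting the matrix is mainly useful when only one set of coefficients is available.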
Call image.transform() with coefs_inv, but calculate the new point with coefs, like this:
img = image.transform((1500, 800),
                      method=Image.PERSPECTIVE,
                      data=coefs_inv)
a, b, c, d, e, f, g, h = coefs
old_p1 = [50, 100]
x,y = old_p1
new_x = (a * x + b * y + c) / (g * x + h * y + 1)
new_y = (d * x + e * y + f) / (g * x + h * y + 1)
new_p1 = (int(new_x),int(new_y))
old_p2 = [400, 500]
x,y = old_p2
new_x = (a * x + b * y + c) / (g * x + h * y + 1)
new_y = (d * x + e * y + f) / (g * x + h * y + 1)
new_p2 = (int(new_x),int(new_y))
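The repeated formula can be wrapped in a small function. This is a minimal sketch; `apply_perspective` and `transform_bbox` are hypothetical names, and the coefficients in the example are a made-up pure translation used only to make the arithmetic easy to follow. Two details are worth noting against the snippet in the question: the formula uses `+ b * y` (the question has `- b * new_y0`), and both output coordinates are computed from the untouched input `x, y` (the question overwrites `new_x0` before using it to compute `new_y0`). Since a rectangle generally becomes an arbitrary quadrilateral under a perspective transform, the sketch maps all four corners and takes their axis-aligned envelope.

```python
def apply_perspective(coefs, point):
    """Map one (x, y) point with the 8 perspective coefficients a..h."""
    a, b, c, d, e, f, g, h = coefs
    x, y = point
    w = g * x + h * y + 1          # shared denominator
    return ((a * x + b * y + c) / w, (d * x + e * y + f) / w)

def transform_bbox(coefs, x0, y0, x1, y1):
    """Map the four corners of a box; return the enclosing axis-aligned box."""
    corners = [(x0, y0), (x1, y0), (x1, y1), (x0, y1)]
    mapped = [apply_perspective(coefs, p) for p in corners]
    xs = [p[0] for p in mapped]
    ys = [p[1] for p in mapped]
    return min(xs), min(ys), max(xs), max(ys)

# Illustrative coefficients only: a pure translation by (+10, +20).
shift = (1, 0, 10, 0, 1, 20, 0, 0)
print(apply_perspective(shift, (50, 100)))        # (60.0, 120.0)
print(transform_bbox(shift, 50, 100, 400, 500))   # (60.0, 120.0, 410.0, 520.0)
```

With real data, pass the forward coefficients (coefs, not coefs_inv) to these functions.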
Full code below:
from PIL import Image
import numpy as np
import matplotlib.pyplot as plt

def find_coefs(original_coords, warped_coords):
    # returns the 8 coefficients that map original_coords onto warped_coords
    matrix = []
    for p1, p2 in zip(original_coords, warped_coords):
        matrix.append([p1[0], p1[1], 1, 0, 0, 0, -p2[0]*p1[0], -p2[0]*p1[1]])
        matrix.append([0, 0, 0, p1[0], p1[1], 1, -p2[1]*p1[0], -p2[1]*p1[1]])
    # np.float was removed in NumPy 1.24; the built-in float is equivalent
    A = np.matrix(matrix, dtype=float)
    B = np.array(warped_coords).reshape(8)
    res = np.dot(np.linalg.inv(A.T * A) * A.T, B)
    return np.array(res).reshape(8)

coefs = find_coefs(
[(867,652), (1020,580), (1206,666), (1057,757)],
[(700,732), (869,754), (906,916), (712,906)]
)
coefs_inv = find_coefs(
[(700,732), (869,754), (906,916), (712,906)],
[(867,652), (1020,580), (1206,666), (1057,757)]
)
image = Image.open('sample.png')
img = image.transform((1500, 800),
                      method=Image.PERSPECTIVE,
                      data=coefs_inv)
a, b, c, d, e, f, g, h = coefs
old_p1 = [50, 100]
x,y = old_p1
new_x = (a * x + b * y + c) / (g * x + h * y + 1)
new_y = (d * x + e * y + f) / (g * x + h * y + 1)
new_p1 = (int(new_x),int(new_y))
old_p2 = [400, 500]
x,y = old_p2
new_x = (a * x + b * y + c) / (g * x + h * y + 1)
new_y = (d * x + e * y + f) / (g * x + h * y + 1)
new_p2 = (int(new_x),int(new_y))
plt.figure()
plt.imshow(image)
plt.scatter([old_p1[0], old_p2[0]],[old_p1[1], old_p2[1]] , s=150, marker='.', c='b')
plt.show()
plt.figure()
plt.imshow(img)
plt.scatter([new_p1[0], new_p2[0]],[new_p1[1], new_p2[1]] , s=150, marker='.', c='r')
plt.show()
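If sample.png is not at hand, the same idea can be checked on a synthetic image. The following is a rough sketch under assumptions: the point pairs, the image size and the marker position are made up, and `find_coefs` and `apply_perspective` are compact rewrites for this check rather than the functions above. It draws a white marker at a known position, warps the image with the inverse coefficients, and compares the marker's position in the warped image with the position predicted by the forward coefficients.

```python
import numpy as np
from PIL import Image, ImageDraw

def find_coefs(src_pts, dst_pts):
    """Solve for the 8 coefficients that map four src_pts onto four dst_pts."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src_pts, dst_pts):
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.extend([u, v])
    return np.linalg.solve(np.array(rows, dtype=float),
                           np.array(rhs, dtype=float))

def apply_perspective(coefs, point):
    """Map one (x, y) point with the 8 perspective coefficients a..h."""
    a, b, c, d, e, f, g, h = coefs
    x, y = point
    w = g * x + h * y + 1
    return ((a * x + b * y + c) / w, (d * x + e * y + f) / w)

# Made-up point pairs, chosen only for this check.
original = [(0, 0), (200, 0), (200, 200), (0, 200)]
warped = [(20, 10), (180, 30), (190, 190), (10, 170)]
coefs = find_coefs(original, warped)       # forward: used for points
coefs_inv = find_coefs(warped, original)   # inverse: used by image.transform

# Black image with a white 9x9 marker whose centre is at (60.5, 80.5).
image = Image.new("L", (200, 200), 0)
ImageDraw.Draw(image).rectangle([56, 76, 64, 84], fill=255)

img = image.transform((200, 200), Image.PERSPECTIVE, coefs_inv.tolist())

# Centroid of the marker in the warped image (rows are y, columns are x).
ys, xs = np.nonzero(np.asarray(img))
found = (xs.mean() + 0.5, ys.mean() + 0.5)
predicted = apply_perspective(coefs, (60.5, 80.5))
print("found:", found, "predicted:", predicted)
```

The two positions should agree to within a pixel or two; a larger gap usually means the coefficient sets were swapped.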