What is the fastest way to calculate sum of absolute differences between two images in Python?
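I am trying to compare images in a Python 3 application that uses Pillow and, optionally, Numpy. For compatibility reasons I don't intend to use other external packages that are not pure Python. I found this Pillow-based algorithm on Rosetta Code; it serves my purpose, but it takes some time: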
from PIL import Image

def compare_images(img1, img2):
    """Compute percentage of difference between 2 JPEG images of same size
    (using the sum of absolute differences). Alternatively, compare two bitmaps
    as defined in basic bitmap storage. Useful for comparing two JPEG images
    saved with different compression ratios.

    Adapted from:
    http://rosettacode.org/wiki/Percentage_difference_between_images#Python

    :param img1: an Image object
    :param img2: an Image object
    :return: A float with the percentage of difference, or None if images are
    not directly comparable.
    """
    # Don't compare if images are of different modes or different sizes.
    if (img1.mode != img2.mode) \
            or (img1.size != img2.size) \
            or (img1.getbands() != img2.getbands()):
        return None
    pairs = zip(img1.getdata(), img2.getdata())
    if len(img1.getbands()) == 1:
        # for gray-scale jpegs
        dif = sum(abs(p1 - p2) for p1, p2 in pairs)
    else:
        dif = sum(abs(c1 - c2) for p1, p2 in pairs for c1, c2 in zip(p1, p2))
    # Count the samples in every band rather than assuming 3 bands, so that
    # gray-scale and RGBA images are scaled correctly.
    ncomponents = img1.size[0] * img1.size[1] * len(img1.getbands())
    return (dif / 255.0 * 100) / ncomponents  # Difference (percentage)
While looking for alternatives, I found that this function can be rewritten using Numpy:
import numpy as np
from PIL import Image

def compare_images_np(img1, img2):
    if (img1.mode != img2.mode) \
            or (img1.size != img2.size) \
            or (img1.getbands() != img2.getbands()):
        return None
    dif = 0
    # Note: p[band_index] assumes multi-band pixels (tuples); single-band
    # images yield plain ints from getdata() and would fail here.
    for band_index, band in enumerate(img1.getbands()):
        m1 = np.array([p[band_index] for p in img1.getdata()]).reshape(*img1.size)
        m2 = np.array([p[band_index] for p in img2.getdata()]).reshape(*img2.size)
        dif += np.sum(np.abs(m1-m2))
    # Count the samples in every band rather than assuming 3 bands.
    ncomponents = img1.size[0] * img1.size[1] * len(img1.getbands())
    return (dif / 255.0 * 100) / ncomponents  # Difference (percentage)
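I was expecting an improvement in processing speed, but it actually takes longer. I have no experience with Numpy beyond the basics, so I wonder whether there is any way to make it faster, for instance with an algorithm that does not involve that for loop. Any ideas?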
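I think I understand what you are trying to do. I have no idea of the relative performance of our two machines, so maybe you can benchmark it yourself.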
from PIL import Image
import numpy as np

# Load images, convert to RGB, then to numpy arrays and ravel into long, flat things
a = np.array(Image.open('a.png').convert('RGB')).ravel()
b = np.array(Image.open('b.png').convert('RGB')).ravel()

# Calculate the sum of the absolute differences divided by number of elements
# (np.float64 replaces the np.float alias, which was removed in NumPy 1.24)
MAE = np.sum(np.abs(np.subtract(a, b, dtype=np.float64))) / a.shape[0]
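The only "tricky" part is forcing the result type of np.subtract() to a float, which ensures that negative numbers can be stored (the image arrays themselves are unsigned 8-bit). It may be worth trying dtype=np.int16 on your hardware to see whether that is faster.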
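To illustrate why the result type matters: arrays built from 8-bit images are uint8, and subtracting them directly wraps around modulo 256 instead of going negative. A minimal sketch with made-up sample values (the arrays here are illustrative, not taken from real images):

```python
import numpy as np

# Two tiny made-up "images" of 8-bit samples (values are illustrative only).
a = np.array([10, 200, 30], dtype=np.uint8)
b = np.array([20, 100, 40], dtype=np.uint8)

# Plain uint8 subtraction wraps around modulo 256: 10 - 20 becomes 246.
wrapped = a - b

# Forcing a signed (or float) result type keeps the true differences.
signed = np.subtract(a, b, dtype=np.int16)

print(wrapped)         # [246 100 246]
print(np.abs(signed))  # [ 10 100  10]
```

Any signed type wide enough for the range -255..255 avoids the wraparound, which is why both float and int16 give the same answer.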
A quick way to benchmark is as follows. Start ipython and then type in the following:
from PIL import Image
import numpy as np
a=np.array(Image.open('a.png').convert('RGB')).ravel()
b=np.array(Image.open('b.png').convert('RGB')).ravel()
Now you can time my code with:
%timeit np.sum(np.abs(np.subtract(a,b,dtype=np.float64))) / a.shape[0]
6.72 µs ± 21.2 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
Or you can try an int16 version like this:
%timeit np.sum(np.abs(np.subtract(a,b,dtype=np.int16))) / a.shape[0]
6.43 µs ± 30.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
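To get a number on the same 0 to 100 scale as the functions in the question, the mean absolute difference can be divided by 255. A rough sketch, where percent_difference is a hypothetical helper name and the inputs are assumed to be equally shaped arrays of 8-bit samples:

```python
import numpy as np

def percent_difference(a, b):
    """Percentage difference (0-100) between two equally shaped 8-bit arrays."""
    a = np.asarray(a)
    b = np.asarray(b)
    if a.shape != b.shape:
        return None  # not directly comparable, as in the question's functions
    # Widen to int16 so that differences of uint8 samples cannot wrap around.
    diff = np.abs(a.astype(np.int16) - b.astype(np.int16))
    # Mean absolute difference, scaled by the maximum sample value 255.
    return float(diff.mean()) / 255.0 * 100

black = np.zeros((4, 4, 3), dtype=np.uint8)
white = np.full((4, 4, 3), 255, dtype=np.uint8)
print(percent_difference(black, black))  # 0.0
print(percent_difference(black, white))  # 100.0
```

With Pillow images of the same mode, np.asarray(img1) and np.asarray(img2) could be passed in directly.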
If you want to time your own code, paste in your function and then use:
%timeit compare_images_pil(img1, img2)
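Outside IPython, the standard library's timeit module gives comparable measurements. The sketch below assumes synthetic random arrays standing in for two flattened 640x480 RGB images, and mae is a hypothetical helper wrapping the expression timed above; absolute timings will differ from machine to machine:

```python
import timeit
import numpy as np

# Synthetic stand-ins for two flattened 640x480 RGB images (an assumption;
# substitute arrays loaded from your own files).
rng = np.random.default_rng(0)
a = rng.integers(0, 256, size=640 * 480 * 3, dtype=np.uint8)
b = rng.integers(0, 256, size=640 * 480 * 3, dtype=np.uint8)

def mae(dtype):
    """Mean absolute difference, computed with the given result dtype."""
    return np.sum(np.abs(np.subtract(a, b, dtype=dtype))) / a.shape[0]

for dtype in (np.float64, np.int16):
    # Best of 5 repeats of 20 calls each, reported per call.
    best = min(timeit.repeat(lambda: mae(dtype), number=20, repeat=5)) / 20
    print(f"{dtype.__name__}: {best * 1e3:.3f} ms per call")
```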
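Digging a bit more, I found that the diffimg repository (credited in the docstring below) takes a different approach that relies more on Pillow itself and seems to give similar results.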
from PIL import Image
from PIL import ImageChops, ImageStat

def compare_images_pil(img1, img2):
    '''Calculate the difference between two images of the same size
    by comparing channel values at the pixel level.

    Adapted from Nicolas Hahn:
    https://github.com/nicolashahn/diffimg/blob/master/diffimg/__init__.py

    :param img1: an Image object
    :param img2: an Image object
    :return: A float with the percentage of difference, or None if images are
    not directly comparable.
    '''
    # Don't compare if images are of different modes or different sizes.
    if (img1.mode != img2.mode) \
            or (img1.size != img2.size) \
            or (img1.getbands() != img2.getbands()):
        return None
    # Generate diff image in memory.
    diff_img = ImageChops.difference(img1, img2)
    # Calculate difference as a ratio.
    stat = ImageStat.Stat(diff_img)
    # Can be [r,g,b] or [r,g,b,a].
    sum_channel_values = sum(stat.mean)
    max_all_channels = len(stat.mean) * 255
    diff_ratio = sum_channel_values / max_all_channels
    return diff_ratio * 100
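For my sample of test images, the results seem to be the same (apart from some minor float rounding errors), and it runs considerably faster than the first version I had above.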
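The agreement between the Pillow-based result and a plain mean absolute difference can be checked on synthetic data. A small sketch, assuming random RGB images generated with NumPy rather than real files (the variable names are illustrative):

```python
import numpy as np
from PIL import Image, ImageChops, ImageStat

# Two small synthetic RGB images (random data, purely illustrative).
rng = np.random.default_rng(42)
arr1 = rng.integers(0, 256, size=(32, 48, 3), dtype=np.uint8)
arr2 = rng.integers(0, 256, size=(32, 48, 3), dtype=np.uint8)
img1 = Image.fromarray(arr1)
img2 = Image.fromarray(arr2)

# Pillow route: mean of each band of the absolute-difference image.
stat = ImageStat.Stat(ImageChops.difference(img1, img2))
pct_pil = sum(stat.mean) / (len(stat.mean) * 255) * 100

# NumPy route: mean absolute difference over all samples.
diff = np.abs(arr1.astype(np.int16) - arr2.astype(np.int16))
pct_np = diff.mean() / 255 * 100

# The two percentages should agree up to float rounding.
print(pct_pil, pct_np, abs(pct_pil - pct_np) < 1e-9)
```

Because every band has the same number of pixels, averaging the per-band means is equivalent to averaging over all samples at once.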