
High performance (python) library for reading tiff files?

I am using code to read a .tiff file in order to calculate a fractal dimension. My code looks like this:

import matplotlib.pyplot as plt

raster = plt.imread('xyz.tif')

for i in range(x1, x2):
    for j in range(y1, y2):
        pixel = raster[i][j]

This works, but I have to read a lot of pixels, so I would like this to be fast and, ideally, to minimize electricity usage given current events. Is there a better library than matplotlib for this purpose? For example, could using a library specialized for matrix operations such as pandas help? Additionally, would another language such as C have better performance than python?

Edit: @cgohlke in the comments and others have found that cv2 is slower than tifffile for large and/or compressed images. It is best to test the different options on realistic data for your application.

I have found cv2 to be the fastest library for this. Reading 5000 128x128 uint16 tif images gives the following results:

import time
import matplotlib.pyplot as plt
t0 = time.time()
for file in files:
    raster = plt.imread(file)
print(f'{time.time()-t0:.2f} s')

1.52 s

import time
import numpy as np  # needed for np.array below
from PIL import Image
t0 = time.time()
for file in files:
    im = np.array(Image.open(file))
print(f'{time.time()-t0:.2f} s')

1.42 s

import time
import tifffile
t0 = time.time()
for file in files:
    im = tifffile.imread(file)
print(f'{time.time()-t0:.2f} s')

1.25 s

import time
import cv2
t0 = time.time()
for file in files:
    im = cv2.imread(file, cv2.IMREAD_UNCHANGED)
print(f'{time.time()-t0:.2f} s')

0.20 s

cv2 is a computer vision library written in C++, which, as the other commenter mentioned, is much faster than pure Python. Note the cv2.IMREAD_UNCHANGED flag; without it, cv2 converts monochrome images to 8-bit, 3-channel BGR.
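Since the ranking can change with image size and compression, it may help to time the candidates on your own files. Below is a minimal sketch of such a harness; the name benchmark_readers and its parameters are made up for illustration and are not part of any of the libraries above. It takes the readers as plain callables, so it does not depend on which libraries are installed.

```python
import time


def benchmark_readers(readers, files, repeats=3):
    """Return the best wall-clock time in seconds for each reader.

    readers: dict mapping a label to a callable that takes a file path
             (hypothetical interface, chosen for this sketch).
    files:   iterable of file paths to read.
    repeats: number of full passes over the files per reader.
    """
    files = list(files)  # allow generators to be reused across passes
    results = {}
    for name, read in readers.items():
        best = float("inf")
        for _ in range(repeats):
            t0 = time.perf_counter()
            for path in files:
                read(path)
            # Keep the fastest pass to reduce noise from other processes.
            best = min(best, time.perf_counter() - t0)
        results[name] = best
    return results
```

It could be called with something like {"tifffile": tifffile.imread, "cv2": lambda f: cv2.imread(f, cv2.IMREAD_UNCHANGED)} and a list of your own .tif paths. Keep in mind that after the first pass the operating system has probably cached the files, so the best-of-N figure mostly reflects decoding time rather than disk access.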

I am not sure which library is the fastest, but I have had very good experience with Pillow:

from PIL import Image
raster = Image.open('xyz.tif')

Then you can convert it to a numpy array:

import numpy
pixels = numpy.array(raster)
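Once the image is a numpy array, the per-pixel loop from the question can often be replaced by a single slice, which avoids most of the Python-level overhead. The following is only a sketch: region_pixels is a hypothetical helper, and the small array stands in for a real image, since the rest of the question's code is not shown.

```python
import numpy as np


def region_pixels(pixels, x1, x2, y1, y2):
    """Return the block pixels[i][j] for x1 <= i < x2 and y1 <= j < y2.

    For a numpy array, basic slicing returns a view, so no pixel data is copied.
    """
    return pixels[x1:x2, y1:y2]


# Stand-in data in place of a real image (assumption for this example).
pixels = np.arange(36, dtype=np.uint16).reshape(6, 6)
region = region_pixels(pixels, 1, 4, 2, 5)

# Whole-array operations then replace the nested loops, e.g. a sum:
total = int(region.sum())
```

Whether this helps depends on what is done with each pixel; operations that can be written as whole-array numpy calls benefit the most.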

I would need to see the rest of the code to be able to recommend any other libraries. As for the language, C++ or C would generally have better performance, as they are lower-level compiled languages. How much better depends on how complex your operations are and how much data you need to process; C++ programs have been reported to be 10-200x faster, with the gap increasing with the complexity of the calculations. Hope this helps. If you have any further questions, just ask.
