
How can I load a single TIF image in parts into a numpy array without loading the whole image into memory?

There is a 4 GB .TIF image that needs to be processed. Because of memory constraints I can't load the whole image into a numpy array, so I need to load it lazily, in parts, from the hard disk. This has to be done in Python, as that is a project requirement. I also looked at the tifffile library on PyPI, but I found nothing useful there. Please help.

pyvips can do this. For example:

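#!/usr/bin/env python3
import sys
import numpy as np
import pyvips

# Open lazily: pixels are only read from disk as they are requested
image = pyvips.Image.new_from_file(sys.argv[1], access="sequential")

# Walk down the image in strips of up to 100 scanlines
for y in range(0, image.height, 100):
    # The last strip may be shorter than 100 rows
    area_height = min(image.height - y, 100)
    area = image.crop(0, y, image.width, area_height)
    # Wrap the strip's raw bytes as a (rows, columns, bands) uint8 array
    array = np.ndarray(buffer=area.write_to_memory(),
                       dtype=np.uint8,
                       shape=[area.height, area.width, area.bands])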

The access option to new_from_file turns on sequential mode: pyvips will only load pixels from the file on demand, with the restriction that you must read pixels out top to bottom.

The loop runs down the image in blocks of 100 scanlines. You can tune this, of course.
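To get a feel for how the block height trades off against memory, here is a minimal sketch in plain Python. The helper names `strip_ranges` and `strip_bytes` are made up for illustration and are not part of pyvips. The byte estimate assumes uncompressed 8-bit samples and counts only the numpy buffer for one strip, not whatever pyvips caches internally.

```python
def strip_ranges(height, block=100):
    """Yield (y, rows) pairs that cover an image of `height` rows, top to bottom."""
    for y in range(0, height, block):
        # The final strip is clipped to the rows that remain
        yield y, min(block, height - y)


def strip_bytes(width, rows, bands, bytes_per_sample=1):
    """Size in bytes of one uncompressed strip held as a numpy array."""
    return width * rows * bands * bytes_per_sample


if __name__ == "__main__":
    # A 250-row image in blocks of 100: two full strips and one short one
    print(list(strip_ranges(250, 100)))  # [(0, 100), (100, 100), (200, 50)]

    # Width and band count taken from the vipsheader output in this answer
    print(strip_bytes(108199, 100, 3) / 1e6, "MB per 100-row strip")
```

Each `(y, rows)` pair could be passed straight to `image.crop(0, y, image.width, rows)` in the loop above. Taller strips mean fewer iterations but a larger array per step.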

I can run it like this:

$ vipsheader eso1242a-pyr.tif 
eso1242a-pyr.tif: 108199x81503 uchar, 3 bands, srgb, tiffload_stream
$ /usr/bin/time -f %M:%e ./sections.py ~/pics/eso1242a-pyr.tif
273388:479.50

So on this sad old laptop it took 8 minutes to scan a 108,000 x 82,000 pixel image and needed a peak of 270 MB of memory.
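For context, a quick back-of-the-envelope check of those numbers. It assumes the image would be held uncompressed as 8-bit samples if loaded whole, and relies on GNU time's `%M` reporting peak resident set size in kilobytes.

```python
# Dimensions and band count from the vipsheader output above
WIDTH, HEIGHT, BANDS = 108199, 81503, 3

# Bytes needed to hold the whole image uncompressed (1 byte per uchar sample)
full_bytes = WIDTH * HEIGHT * BANDS

# Peak memory reported by /usr/bin/time -f %M, which is in kilobytes
peak_rss_bytes = 273388 * 1024

ratio = full_bytes / peak_rss_bytes

print(f"whole image: {full_bytes / 1e9:.1f} GB uncompressed")
print(f"peak RSS:    {peak_rss_bytes / 1e6:.0f} MB")
print(f"ratio:       {ratio:.1f}x")
```

So the strip-by-strip scan stayed at a small fraction of what a single full-size numpy array would have needed.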

What processing are you doing? You might be able to do the whole thing in pyvips. It's quite a bit quicker than numpy.

import pyvips
img = pyvips.Image.new_from_file("space.tif", access='sequential')
out = img.resize(0.01, kernel="linear")
out.write_to_file("resized_image.jpg")

If you want to convert the file to another format at a smaller size, this code will be enough. It does the job without any memory spike and in very little time.
