
How can I load a single TIFF image into a NumPy array in parts, without loading the whole image into memory?

I have a 4 GB .tif image that needs to be processed. Because of memory constraints I can't load the whole image into a NumPy array, so I need to load it lazily, in parts, from disk. This has to be done in Python as a project requirement. I also looked at the tifffile library on PyPI but found nothing useful. Please help.

pyvips can do this. For example:

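import sys
import numpy as np
import pyvips

# Sequential access streams pixels from the file on demand.
image = pyvips.Image.new_from_file(sys.argv[1], access="sequential")

# Walk down the image in strips of up to 100 scanlines.
for y in range(0, image.height, 100):
    area_height = min(image.height - y, 100)
    area = image.crop(0, y, image.width, area_height)
    # Wrap the strip's pixels in a numpy array (assumes 8-bit samples).
    array = np.ndarray(buffer=area.write_to_memory(),
                       dtype=np.uint8,
                       shape=[area.height, area.width, area.bands])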

The access option to new_from_file turns on sequential mode: pyvips will only load pixels from the file on demand, with the restriction that you must read pixels out top to bottom.

The loop runs down the image in blocks of 100 scanlines. You can tune this, of course.
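To make the strip arithmetic concrete, here is a minimal sketch of the same top-to-bottom walk with the pyvips calls removed. The helper name strip_bounds is made up for illustration and is not part of pyvips; it only computes which rows each strip would cover.

```python
def strip_bounds(image_height, strip_height=100):
    """Yield (y, height) for consecutive horizontal strips, top to bottom.

    Hypothetical helper for illustration; not a pyvips function.
    """
    if strip_height <= 0:
        raise ValueError("strip_height must be positive")
    for y in range(0, image_height, strip_height):
        # The last strip is shorter when the height is not a multiple
        # of strip_height.
        yield y, min(strip_height, image_height - y)


# Height of the example image from the vipsheader output.
strips = list(strip_bounds(81503, 100))
print(len(strips))   # 816
print(strips[-1])    # (81500, 3)
```

Each (y, height) pair corresponds to one image.crop(0, y, image.width, height) call in the loop above. A larger strip height means fewer iterations but more memory per strip.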

I can run it like this:

$ vipsheader eso1242a-pyr.tif 
eso1242a-pyr.tif: 108199x81503 uchar, 3 bands, srgb, tiffload_stream
$ /usr/bin/time -f %M:%e ./sections.py ~/pics/eso1242a-pyr.tif
273388:479.50

So on this sad old laptop it took 8 minutes to scan a 108,000 x 82,000 pixel image and needed a peak of about 270 MB of memory.
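For scale, a rough back-of-the-envelope calculation, assuming one byte per sample (the header reports uchar) and the 100-scanline strips used above. The variable names are only for illustration.

```python
# Dimensions reported by vipsheader: 108199x81503 uchar, 3 bands.
width, height, bands = 108199, 81503, 3
bytes_per_sample = 1  # uchar is one byte per sample

# Size of the whole image if it were decoded into one array.
full_bytes = width * height * bands * bytes_per_sample

# Size of one 100-scanline strip, as produced by the loop.
strip_bytes = width * 100 * bands * bytes_per_sample

print(f"{full_bytes / 1e9:.1f} GB")   # 26.5 GB
print(f"{strip_bytes / 1e6:.1f} MB")  # 32.5 MB
```

So the fully decoded image would need roughly 26.5 GB, while each strip array is about 32.5 MB. The rest of the measured peak is presumably the interpreter plus pyvips' own buffers, though that split was not measured here.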

What processing are you doing? You might be able to do the whole thing in pyvips. It's quite a bit quicker than numpy.

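import pyvips

# Stream the source image rather than loading it all at once.
img = pyvips.Image.new_from_file("space.tif", access="sequential")
# Shrink to 1% of the original size in each dimension.
out = img.resize(0.01, kernel="linear")
out.write_to_file("resized_image.jpg")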

If you only want to convert the file to another format at a smaller size, this code is enough. It runs without any memory spike and takes very little time.

