
How can I scale an image with PIL without ruining its appearance?

I noticed that simply multiplying and dividing an array, even by coefficients equivalent to 1, will deform the image.

I need to rescale the image pixels because I need to feed them to an ML model, but I noticed there seems to be a huge loss of information in the process.

This is the original image (an example):

Image.fromarray((np.array(out_img.resize((224, 224)))),'L')

[original image]

If I divide it by 255, it somehow ends up like this:

Image.fromarray((np.array(out_img.resize((224, 224)))/255),'L')

[resulting image]

A lot of information seems lost, and apparently I can't revert back to the original:

(np.array(out_img.resize((224, 224)))/255*255==np.array(out_img.resize((224, 224)))).all()    
Image.fromarray((np.array(out_img.resize((224, 224)))/255*255),'L')

[resulting image]

As you can see, I checked that dividing and multiplying by 255 gives us back the same array, but the images look different.

The same happens even if I naively divide and multiply by 1:

Image.fromarray((np.array(out_img.resize((224, 224)))*(1/1)),'L')

[resulting image]

Is there an explanation for this behaviour or a way to prevent the information loss?

You can't create an 'L' mode PIL image from a float array. PIL.Image.fromarray just plugs the data from the passed array into an image. In 'L' mode, that data is supposed to be bytes, but what you gave it are floats.

See the following example:

from PIL import Image
import numpy as np
img = np.array([[3.141592653589793, 12, 13, 14], [15, 16, 17, 18]])
limg=np.array(Image.fromarray(img, 'L'))

limg is now an array of the same shape as img, since the PIL image built from img has that resolution. But its data is only 8 bytes (there are 8 pixels, and we said the format is L), and those are the first 8 bytes taken from img.

See

img.tobytes()
# b'\x18-DT\xfb!\t@\x00\x00\x00\x00\x00\x00(@\x00\x00\x00\x00\x00\x00*@\x00\x00\x00\x00\x00\x00,@\x00\x00\x00\x00\x00\x00.@\x00\x00\x00\x00\x00\x000@\x00\x00\x00\x00\x00\x001@\x00\x00\x00\x00\x00\x002@'
limg.tobytes()
# b'\x18-DT\xfb!\t@'

We can even try to decode that

import struct
limg
# array([[ 24,  45,  68,  84],
#       [251,  33,   9,  64]], dtype=uint8)
struct.pack('BBBBBBBB', 24, 45, 68, 84, 251, 33, 9, 64)
# b'\x18-DT\xfb!\t@'
# See, it is the same thing: just the bytes of limg, that is, the first 8 bytes of img, shown as uint8

struct.unpack('d', struct.pack('BBBBBBBB', 24, 45, 68, 84, 251, 33, 9, 64))
# (3.141592653589793,)
# See, it is just the 8 bytes of the float representation of the first float in img (the 7 others are lost)

So the images you have here are images of the bytes of the float data (of the first eighth of the float data, since there is no room for more). Each group of 8 pixels is the 8 bytes of one float.
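You can reproduce this byte-for-byte with NumPy, by reinterpreting the same memory as uint8 (a quick sketch, reusing the img from above):

import numpy as np
img = np.array([[3.141592653589793, 12, 13, 14], [15, 16, 17, 18]])
# View the same memory as uint8: each float64 becomes 8 bytes,
# so the 2×4 float array is seen as 2×32 bytes
as_bytes = img.view(np.uint8)
as_bytes.shape
# (2, 32)
as_bytes[0, :8]
# array([ 24,  45,  68,  84, 251,  33,   9,  64], dtype=uint8)
# exactly the pixel values PIL showed for the first float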

The same occurs for any operation that turns the uint8 ndarray into a float ndarray, including multiplying by (1/1).
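You can check the dtype promotion directly; any arithmetic with a Python float quietly gives you a float64 array (a minimal sketch):

import numpy as np
arr = np.zeros((4, 4), dtype=np.uint8)  # what np.array of an 'L' mode image gives you
arr.dtype
# dtype('uint8')
(arr * (1/1)).dtype
# dtype('float64') -- multiplying by a Python float promotes the whole array
(arr / 255 * 255).dtype
# dtype('float64') -- still floats, even though the values round-trip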

Solution

I don't know which ML model you use, but I doubt it requires PIL images. So you could pass it an ndarray directly, including a float one if needed.
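For example, something like this (a sketch; the file name and the model call are placeholders for whatever you actually use):

import numpy as np
from PIL import Image
out_img = Image.open("example.png").convert("L")  # placeholder input
# Resize with PIL, then stay in NumPy: no need to go back to a PIL image
arr = np.array(out_img.resize((224, 224)), dtype=np.float32) / 255.0  # shape (224, 224), values in [0, 1]
# model.predict(arr[None, :, :, None])  # placeholder model call; add batch/channel axes as your framework expects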

If you really need to use a PIL image, then you could use 'F' mode (32-bit float) instead of 'L' (which means 8-bit grayscale).
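For instance (a sketch; cast to float32 first, because with an explicit 'F' mode fromarray expects 32-bit float data, and a float64 buffer would be reinterpreted just like before):

import numpy as np
from PIL import Image
out_img = Image.open("example.png").convert("L")  # placeholder input
arr = np.array(out_img.resize((224, 224))) / 255  # float64 array in [0, 1]
fimg = Image.fromarray(arr.astype(np.float32), 'F')  # 32-bit float image, values preserved
fimg.mode
# 'F'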

Note that if you just hadn't passed the 'L' argument to fromarray, it would have guessed the mode by itself: grayscale because of the H×W shape (not H×W×3, which would be RGB, or H×W×2, which would be LA, ...), and 'F' because of the float dtype.
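A quick way to see that guessing at work (sketch):

import numpy as np
from PIL import Image
arr = np.random.rand(224, 224)  # float64, H×W
Image.fromarray(arr).mode
# 'F'  -- inferred from the 2-D shape and the float dtype
Image.fromarray((arr * 255).astype(np.uint8)).mode
# 'L'  -- uint8 with a 2-D shape is read as 8-bit grayscale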

Also note that your question has nothing to do with scaling. You would have had the exact same problem without any resize: Image.fromarray(np.array(img)*1.0, 'L') would have the same problem. This is not a scaling quality problem. It is an image format, even a data format, problem: you are using memory that contains floats and asking PIL to interpret it as if it contained uint8.
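And if what you actually want is to look at a float array as an ordinary grayscale image again, convert the values back to uint8 before calling fromarray, instead of reinterpreting the bytes (a sketch, assuming the floats are in [0, 1]):

import numpy as np
from PIL import Image
out_img = Image.open("example.png").convert("L")  # placeholder input
arr = np.array(out_img.resize((224, 224))) / 255  # float64 in [0, 1]
restored = Image.fromarray((arr * 255).round().astype(np.uint8), 'L')  # convert the values, don't reinterpret the memory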
