
How can a numpy.ndarray be normalized?

I am working with a numpy.ndarray that holds 286 images and has the shape (286, 16, 16, 3). Each image contains 3 bands whose pixel values are stored as float32. The maximum pixel value in each band can exceed 255. Is it possible to normalize this numpy.ndarray to the range [0, 1]?

Code for reading the images:

import os
import cv2

inputPath='E:/Notebooks/data'

images = []

# Load in the images
for filepath in os.listdir(inputPath):
    images.append(cv2.imread(inputPath+'/{0}'.format(filepath),flags=(cv2.IMREAD_ANYCOLOR | cv2.IMREAD_ANYDEPTH)))

If you want the values of every image to lie between 0 and 255, you can loop over the images, compute the minimum and maximum of each original image, and rescale it so that the minimum maps to 0 and the maximum to 255.

import numpy as np
#images = np.random.rand(286,16,16,3)
images = np.random.rand(286,16,16,3).astype(np.float32)

for nr,img in enumerate(images):
    # Use names that do not shadow the built-in min() and max()
    img_min = np.min(img)
    img_max = np.max(img)
#   images[nr] = (img - img_min) * (255/(img_max-img_min))
    images[nr] = (img - img_min) / (img_max - img_min) * 255
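
One thing the loop above does not handle is an image whose pixels are all equal, where max - min is 0 and the division fails. Here is a minimal sketch of a guarded version. The function name minmax_scale_per_image, the max_scale parameter and the test data are my own placeholders, not part of the original answer:

```python
import numpy as np

def minmax_scale_per_image(images, max_scale=255.0):
    """Rescale each image of an (N, H, W, C) array to [0, max_scale]."""
    images = np.asarray(images, dtype=np.float32)
    scaled = np.zeros_like(images)
    for nr, img in enumerate(images):
        lo, hi = img.min(), img.max()
        if hi > lo:
            scaled[nr] = (img - lo) / (hi - lo) * max_scale
        # A constant image (hi == lo) is left at 0 instead of dividing by zero
    return scaled

# Made-up data standing in for the real images
rng = np.random.default_rng(0)
images = (rng.random((286, 16, 16, 3)) * 4000).astype(np.float32)
images[0] = 7.0  # hypothetical constant image

scaled = minmax_scale_per_image(images)
print(scaled.shape, scaled.min(), scaled.max())
```

Returning a new array rather than writing into images keeps the original data available for comparison.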

Vectorized is much faster than iterative

If you want to scale the pixel values of all your images using numpy arrays only, you may want to keep the vectorized nature of the operation (by avoiding loops).

Here is a way to scale your images:

# Getting min and max per image
maxis = images.max(axis=(1,2,3))
minis = images.min(axis=(1,2,3))
# Scaling without any loop
scaled_images = ((images.T - minis) / (maxis - minis) * 255).T
# timeit > 178 µs ± 1.24 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

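The transposes (.T) are needed for the broadcasting to work: minis and maxis have shape (286,), and transposing moves the image axis to the last position so that each image is paired with its own minimum and maximum.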

We can check if this is correct:

print((scaled_images.min(axis=(1,2,3)) == 0).all())
# > True
print((scaled_images.max(axis=(1,2,3)) == 255).all())
# > True
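
As an alternative to the transposes, the reductions can be asked to keep the reduced axes with keepdims=True, which gives arrays of shape (286, 1, 1, 1) that broadcast directly against the images. A small sketch on made-up data (the variable names are mine), checking that both approaches agree:

```python
import numpy as np

# Made-up data standing in for the real images
rng = np.random.default_rng(0)
images = (rng.random((286, 16, 16, 3)) * 4000).astype(np.float32)

# keepdims=True keeps the reduced axes as length-1 axes: shape (286, 1, 1, 1)
minis_kd = images.min(axis=(1, 2, 3), keepdims=True)
maxis_kd = images.max(axis=(1, 2, 3), keepdims=True)
scaled_keepdims = (images - minis_kd) / (maxis_kd - minis_kd) * 255

# Transpose-based version for comparison
minis = images.min(axis=(1, 2, 3))
maxis = images.max(axis=(1, 2, 3))
scaled_transposed = ((images.T - minis) / (maxis - minis) * 255).T

print(np.allclose(scaled_keepdims, scaled_transposed))
```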

Scaling into the [0, 1] range

If you want pixel values between 0 and 1, simply drop the multiplication by 255:

scaled_images = ((images.T - minis) / (maxis - minis)).T
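
Since the question mentions that each band has its own value range, it may also be worth scaling every band of every image separately. This is a variation on the answer above, not something it proposes, and the data and names below are made up for illustration:

```python
import numpy as np

# Hypothetical data: three bands with very different value ranges
rng = np.random.default_rng(0)
band_ranges = np.array([300.0, 4000.0, 65000.0])
images = (rng.random((286, 16, 16, 3)) * band_ranges).astype(np.float32)

# Min and max per image and per band: shape (286, 1, 1, 3)
band_min = images.min(axis=(1, 2), keepdims=True)
band_max = images.max(axis=(1, 2), keepdims=True)

# Each band of each image now spans [0, 1] on its own
scaled_per_band = (images - band_min) / (band_max - band_min)
print(scaled_per_band.min(axis=(0, 1, 2)), scaled_per_band.max(axis=(0, 1, 2)))
```

Whether per-image or per-band scaling is appropriate depends on whether the relative intensities between bands carry information you want to keep.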

Only with numpy arrays and such

You must also make sure you are working with a numpy array in the first place, not a list:

import numpy as np
images = np.array(images)
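
A quick illustration of why the conversion matters, using a made-up list of small arrays in place of the images read from disk: the list itself has no .max(axis=...) method, while the converted array has the expected 4-D shape.

```python
import numpy as np

# Stand-in for the list produced by the loading loop
image_list = [np.full((16, 16, 3), i, dtype=np.float32) for i in range(4)]

# A plain Python list does not offer the array reductions used above
print(hasattr(image_list, "max"))

# Works as long as every image has the same shape
images = np.array(image_list)
print(type(images), images.shape, images.dtype)
```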

OpenCV

On-the-go scaling

Since you are using opencv to read your images one by one, you can normalize your images on the go with it:

inputPath='E:/Notebooks/data'

max_scale = 1   # or 255 if needed
# Load in the images 
images = [cv2.normalize(
    cv2.imread(inputPath+'/{0}'.format(filepath),flags=(cv2.IMREAD_ANYCOLOR | cv2.IMREAD_ANYDEPTH)),
    None, 0, max_scale, cv2.NORM_MINMAX)
    for filepath in os.listdir(inputPath)]

Make sure you have images in the folder

inputPath='E:/Notebooks/data'
images = []

max_scale = 1   # or 255 if needed

# Load in the images 
for filepath in os.listdir(inputPath):
    image = cv2.imread(inputPath+'/{0}'.format(filepath),flags=(cv2.IMREAD_ANYCOLOR | cv2.IMREAD_ANYDEPTH))
    # Scale and append the list if it is an image
    if image is not None:
        images.append(cv2.normalize(image, None, 0, max_scale, cv2.NORM_MINMAX))

Bug in OpenCV versions prior to 3.4

As reported here, opencv's normalize method has a bug that can produce values below the alpha parameter. It was fixed in version 3.4.

Here is a way to scale images on the go with older versions of OpenCV:

def custom_scale(img, max_scale=1):
    mini = img.min()
    return (img - mini) / (img.max() - mini) * max_scale

max_scale = 1   # or 255 if needed

images = [custom_scale(
    cv2.imread(inputPath+'/{0}'.format(filepath),flags=(cv2.IMREAD_ANYCOLOR | cv2.IMREAD_ANYDEPTH)), max_scale)
    for filepath in os.listdir(inputPath)]

I've figured out this piece of code:

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Tue Jun  8 13:19:17 2021

@author: Pietro


https://stackoverflow.com/questions/67885596/how-numpy-ndarray-can-be-normalized

"""


import numpy as np

arrayz = np.array(np.random.randn(286,16,16,3), dtype=np.float32)

print(arrayz.shape)

print((arrayz.size))

print(arrayz[0,0,0,:],'            ',type(arrayz[0,0,0,:]))
print(arrayz[0,0,0,0],'            ',type(arrayz[0,0,0,0]))

print(np.min(arrayz),'     ',np.max(arrayz))



arrayz_split = np.split(arrayz,286,0)

print(type(arrayz_split))

for i in arrayz_split:
    print(i.size,'  ', i.shape,'  ',  np.min(i),'   ', np.max(i))

arrayz_split_flat = []

for i in arrayz_split:
    ii = i[0]
    arrayz_split_flat.append(ii)
    
for i in arrayz_split_flat:
    print(type(i),'  ',i.size,'  ', i.shape,'  ',  np.min(i),'   ', np.max(i))
    
arrayz_split_flat_norm = []



for i in arrayz_split_flat:
      minz = np.min(i)
      manz = np.max(i)
      ii = ((i-minz)/(manz-minz)*255).astype(np.uint8)
      
      arrayz_split_flat_norm.append(ii)

for i in arrayz_split_flat_norm:
    
    print(type(i),'  ',i.size,'  ', i.shape,'  ',  np.min(i),'   ', np.max(i))

out_arr1 = np.stack((arrayz_split_flat_norm), axis = 0) 

print(type(out_arr1), out_arr1.size, '  ', out_arr1.shape, ' ',np.min(out_arr1),np.max(out_arr1), out_arr1[0,0,0,:],out_arr1[0,0,0,0])

I don't understand why:

arrayz = np.array(np.random.randn(286,16,16,3), dtype=np.float32)

seems to work while using:

arrayz1 = np.ndarray((286,16,16,3), dtype="float32")
arrayz = np.nan_to_num(arrayz1)

also runs, but emits these warnings:

 RuntimeWarning: overflow encountered in float_scalars
  ii = ((i-minz)/(manz-minz)*255).astype(np.uint8)
RuntimeWarning: invalid value encountered in true_divide
  ii = ((i-minz)/(manz-minz)*255).astype(np.uint8)

and I end up with a series of 16x16x3 arrays full of zeros.
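
A likely explanation, offered as a sketch rather than a confirmed diagnosis: np.ndarray(shape) allocates memory without initializing it (much like np.empty), so the array contains whatever bytes happened to be there. np.nan_to_num then replaces NaN with 0 and infinities with the largest finite float32 values. An image can therefore end up with a range so large that max - min overflows, or with all values equal so that the division becomes 0/0. Both cases can be reproduced deterministically:

```python
import numpy as np

big = np.finfo(np.float32).max

# Case 1: values at the float32 limits, as nan_to_num produces from +/-inf
extreme = np.array([-big, 0.0, big], dtype=np.float32)
# Case 2: a constant image, e.g. fresh memory that happens to be all zeros
constant = np.zeros(3, dtype=np.float32)

# Silence the RuntimeWarnings so the results can be inspected
with np.errstate(over="ignore", invalid="ignore", divide="ignore"):
    span = extreme.max() - extreme.min()   # overflows to inf in float32
    scaled_extreme = (extreme - extreme.min()) / span
    scaled_constant = (constant - constant.min()) / (constant.max() - constant.min())  # 0/0

print(span, scaled_extreme, scaled_constant)
```

Converting NaN to uint8 is not well defined and often comes out as 0, which would match the all-zero images. Creating test data with np.zeros, np.random.randn or similar avoids depending on uninitialized memory.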
