
Python normalise floats in a list of lists to range from 0.0 (smallest) to 1.0 (largest) in each sublist

I am trying to normalize the list of lists below:

[[7.460143566, 9.373718262, 9.540244102, 9.843519211, 9.034710884, 10.71182728], [0.490880072, 0.637698293, 0.806753874, 0.906699121, 0.697924912, 0.949957848], [52.33952713, 69.05165863, 65.69918823, 67.53870392, 65.12568665, 72.78334045]]

into below:

[[0.0, 0.3435355, 0.565656, 0.6576767, 1.0], [0.0, 0.232424, 0.465664, 0.76768, 1.0], [0.0, 0.24534535, 0.4564545, 0.576576, 1.0]]

I tried:

normalized = (col_list_filter-min(col_list_filter))/(max(col_list_filter)-min(col_list_filter))
print(normalized)

But I keep getting TypeError: unsupported operand type(s) for -: 'list' and 'list'

Assuming col_list_filter is the list of lists, min(col_list_filter) and max(col_list_filter) each return one of the sublists, so both col_list_filter-min(col_list_filter) and max(col_list_filter)-min(col_list_filter) are list - list, just as stated in the error message.
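To see why the subtraction fails, here is a minimal sketch. The data is shortened and made up for illustration (it is not the question's values), and the names smallest, largest and error_message are hypothetical.

```python
# Shortened, made-up data purely for illustration.
col_list_filter = [[1.0, 3.0, 2.0], [10.0, 30.0, 20.0]]

# min() and max() compare the sublists with each other (lexicographically),
# so each returns a whole sublist rather than a number.
smallest = min(col_list_filter)
largest = max(col_list_filter)
print(smallest)  # [1.0, 3.0, 2.0]
print(largest)   # [10.0, 30.0, 20.0]

# Python lists do not define the - operator, hence the TypeError.
try:
    largest - smallest
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print(error_message)  # unsupported operand type(s) for -: 'list' and 'list'
```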

Instead, you can do an element-wise operation using a for loop (here l is the list of lists):

res = []
for i in l:
    max_, min_ = max(i), min(i)
    res.append([(j - min_)/(max_ - min_) for j in i])
res

Or as a one-liner (less efficient, because min(i) and max(i) are recomputed for every element):

[[(j - min(i))/(max(i) - min(i)) for j in i] for i in l]

Output:

[[0.0,
  0.5884873389626358,
  0.6396995276767564,
  0.7329666273317014,
  0.4842313879485761,
  1.0],
 [0.0,
  0.3198112142984678,
  0.688061628145554,
  0.9057703742992778,
  0.4510016620800218,
  1.0],
 [0.0,
  0.8174664500409363,
  0.6534818573661288,
  0.7434609459640676,
  0.625429283659689,
  1.0]]

This is some code to normalize a single (flat) list in place:

a = [2,4,10,6,8,4]
amin, amax = min(a), max(a)
for i, val in enumerate(a):
    a[i] = (val-amin) / (amax-amin)

credit: https://scipython.com/book/chapter-2-the-core-python-language-i/questions/normalizing-a-list/

See if you can apply this logic to a list of lists.

Give it a go yourself and let me know if you get stuck :)
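For reference, one way the flat-list logic might carry over is to repeat it for each sublist, overwriting the values in place. This is a sketch with made-up data. It assumes no sublist is empty or constant.

```python
# Made-up nested data; each sublist is normalized independently, in place.
a = [[2, 4, 10, 6, 8, 4], [1, 3, 2]]
for row in a:
    rmin, rmax = min(row), max(row)  # extremes of this sublist only
    for i, val in enumerate(row):
        row[i] = (val - rmin) / (rmax - rmin)

print(a)  # [[0.0, 0.25, 1.0, 0.5, 0.75, 0.25], [0.0, 1.0, 0.5]]
```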

You are working with a list of lists, so you can use a nested list comprehension:

a = [[7.460143566, 9.373718262, 9.540244102, 9.843519211, 9.034710884, 10.71182728], [0.490880072, 0.637698293, 0.806753874, 0.906699121, 0.697924912, 0.949957848], [52.33952713, 69.05165863, 65.69918823, 67.53870392, 65.12568665, 72.78334045]]

b = [[(x-min(l))/(max(l)-min(l)) for x in l] for l in a]

print(b)

Result:

[[0.0, 0.5884873389626358, 0.6396995276767564, 0.7329666273317014, 0.4842313879485761, 1.0], 
[0.0, 0.3198112142984678, 0.688061628145554, 0.9057703742992778, 0.4510016620800218, 1.0],
[0.0, 0.8174664500409363, 0.6534818573661288, 0.7434609459640676, 0.625429283659689, 1.0]]

The built-in max function only looks at the outermost layer. In this case, it returns one of the sublists, not a numerical value.

I think using a NumPy array is much more straightforward.

import numpy as np

your_original_list = [...]
your_numpy_list = np.array(your_original_list)
min_value = your_numpy_list.min()
max_value = your_numpy_list.max()

normalized = (your_numpy_list - min_value) / (max_value - min_value)

If you want to normalize row-wise (each sublist separately), you can specify the axis parameter.

batch = your_numpy_list.shape[0]
min_list = your_numpy_list.min(axis=1).reshape(batch, 1)
max_list = your_numpy_list.max(axis=1).reshape(batch, 1)
normalized = (your_numpy_list - min_list) / (max_list - min_list)
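As a possible variation on the row-wise version, NumPy's min and max accept keepdims=True, which keeps the reduced axis as size 1 so the result broadcasts against the original array without an explicit reshape. The sketch below uses made-up data, and the names data, row_min and row_max are hypothetical. It assumes all sublists have the same length.

```python
import numpy as np

# Made-up data: two rows of equal length.
data = np.array([[2.0, 4.0, 10.0], [1.0, 3.0, 2.0]])

# keepdims=True yields shape (2, 1) instead of (2,), so it broadcasts per row.
row_min = data.min(axis=1, keepdims=True)
row_max = data.max(axis=1, keepdims=True)

normalized = (data - row_min) / (row_max - row_min)
print(normalized.tolist())  # [[0.0, 0.25, 1.0], [0.0, 1.0, 0.5]]
```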

