简体   繁体   中英

How can I find the value with the minimum MSE with a numpy array?

My possible values are:

0: [0 0 0 0]
1: [1 0 0 0]
2: [1 1 0 0]
3: [1 1 1 0]
4: [1 1 1 1]

I have some values:

[[0.9539342  0.84090066 0.46451256 0.09715253],
 [0.9923432  0.01231235 0.19491441 0.09715253]
 ....

I want to figure out which of my possible values this is the closest to my new values. Ideally I want to avoid doing a for loop and wonder if there's some sort of vectorized way to search for the minimum mean squared error?

I want it to return an array that looks like: [2, 1 ....

Let's assume your input data is a dictionary. You can then use NumPy for a vectorized solution. You first convert your input lists to a NumPy array and the use axis=1 argument to get the RMSE.

# Input data
dicts = {0: [0, 0, 0, 0], 1: [1, 0, 0, 0], 2: [1, 1, 0, 0], 3: [1, 1, 1, 0],4: [1, 1, 1, 1]}
new_value = np.array([0.9539342, 0.84090066, 0.46451256, 0.09715253])

# Convert values to array
values = np.array(list(dicts.values()))

# Compute the RMSE and get the index for the least RMSE 
rmse = np.mean((values-new_value)**2, axis=1)**0.5
index = np.argmin(rmse)    

print ("The closest value is %s" %(values[index]))
# The closest value is [1 1 0 0]

You can use np.argmin to get the lowest index of the rmse value which can be calculated using np.linalg.norm

import numpy as np
a = np.array([[0, 0, 0, 0], [1, 0, 0, 0], [1, 1, 0, 0],[1, 1, 1, 0], [1, 1, 1, 1]])
b = np.array([0.9539342, 0.84090066, 0.46451256, 0.09715253])
np.argmin(np.linalg.norm(a-b, axis=1))
#outputs 2 which corresponds to the value [1, 1, 0, 0]

As mentioned in the edit, b can have multiple rows. The op wants to avoid for loop, but I can't seem to find a way to avoid the for loop. Here is a list comp way, but there could be a better way

[np.argmin(np.linalg.norm(a-i, axis=1)) for i in b] 
#Outputs [2, 1]

Pure numpy:

val1 = np.array ([
   [0, 0, 0, 0],
   [1, 0, 0, 0],
   [1, 1, 0, 0],
   [1, 1, 1, 0],
   [1, 1, 1, 1]
  ])

print val1
val2 = np.array ([0.9539342, 0.84090066, 0.46451256, 0.09715253], float)
val3 = np.round(val2, 0)
print val3

print np.where((val1 == val3).all(axis=1)) # show a match on row 2 (array([2]),)

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM