
How can I compare two matrices for similarity using Python?

Python 3: How can I compare two matrices of similar shape to one another?

For example, let's say we have matrix x:

1 0 1
0 0 1 
1 1 0

I would like to compare this to matrix y:

1 0 1
0 0 1
1 1 1

This should give a score of, for example, 8/9, since 8 of the 9 elements are the same; the only difference is the last element, which changed from 0 to 1. The matrices I am dealing with are much larger, but the two being compared always have the same dimensions.

There must be a library of some sort that can do this. Any thoughts?

You can do this easily with numpy arrays.

import numpy as np

a = np.array([
    [1, 0, 1],
    [0, 0, 1], 
    [1, 1, 0],
])

b = np.array([
    [1, 0, 1],
    [0, 0, 1],
    [1, 1, 1],
])

print(np.sum(a == b) / a.size)

This prints 0.8888888888888888 (about 0.889).

If your matrices are represented using the third-party library NumPy (which provides many other useful tools for working with matrices and with rectangular, multi-dimensional arrays in general):

>>> import numpy as np
>>> x = np.array([[1,0,1],[0,0,1],[1,1,0]])
>>> y = np.array([[1,0,1],[0,0,1],[1,1,1]])

Then finding the fraction of corresponding elements that are equal is as simple as:

>>> (x == y).sum() / x.size
0.8888888888888888

This works because x == y compares the two arrays element by element, producing a boolean array of the same shape:

>>> x == y
array([[ True,  True,  True],
       [ True,  True,  True],
       [ True,  True, False]])

We then add up the boolean values (when converted to integers, True counts as 1 and False as 0) and divide by the total number of elements.
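The steps described above can be wrapped in a small helper. This is only a minimal sketch: the function name match_fraction and the explicit shape check are assumptions added for illustration and are not part of the original answer.

```python
import numpy as np

def match_fraction(x, y):
    """Return the fraction of positions at which x and y hold equal values.

    Hypothetical helper: the name and the shape check are illustrative.
    """
    x = np.asarray(x)
    y = np.asarray(y)
    # Fail loudly on mismatched shapes instead of letting NumPy broadcast.
    if x.shape != y.shape:
        raise ValueError(f"shape mismatch: {x.shape} vs {y.shape}")
    # x == y is a boolean array; its mean is the share of True entries.
    return float(np.mean(x == y))

x = np.array([[1, 0, 1], [0, 0, 1], [1, 1, 0]])
y = np.array([[1, 0, 1], [0, 0, 1], [1, 1, 1]])
print(match_fraction(x, y))
```

The shape check matters because NumPy would otherwise try to broadcast arrays of different but compatible shapes, which could produce a score that does not mean what you expect.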

If you are using NumPy you can compare them and get the following output:

import numpy as np
a = np.array([[1,0,1],[0,0,1],[1,1,0]])
b = np.array([[1,0,1],[0,0,1],[1,1,1]])

print(a == b)
Out: [[ True  True  True]
      [ True  True  True]
      [ True  True False]]

To count the matches, you can flatten the boolean comparison result into a list and count the True values:

import numpy as np
a = np.array([[1,0,1],[0,0,1],[1,1,0]])
b = np.array([[1,0,1],[0,0,1],[1,1,1]])

res = (a == b).ravel().tolist()  # flatten the boolean array into a plain Python list
print(f'{res.count(True)}/{len(res)}')

Out: 8/9
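The same "matches/total" output can probably be produced without building a Python list, which may matter for the much larger matrices mentioned in the question. A minimal sketch, assuming np.count_nonzero on the boolean array is acceptable; the variable names matches and total are illustrative:

```python
import numpy as np

a = np.array([[1, 0, 1], [0, 0, 1], [1, 1, 0]])
b = np.array([[1, 0, 1], [0, 0, 1], [1, 1, 1]])

# Count the True entries directly on the boolean array.
matches = int(np.count_nonzero(a == b))
total = a.size  # total number of elements in the matrix

print(f"{matches}/{total}")  # prints 8/9
```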

If you are using NumPy, you can simply call np.mean() on the boolean array produced by the comparison, as follows.

import numpy as np

m1 = np.array([
    [1, 0, 1],
    [0, 0, 1], 
    [1, 1, 0],
])

m2 = np.array([
    [1, 0, 1],
    [0, 0, 1],
    [1, 1, 1],
])

score = np.mean(m1 == m2)
print(score)  # prints 0.8888888888888888
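Because the score comes from a boolean comparison array, the same kind of array can also show where the two matrices differ. The following is a sketch under the assumption that you want the positions of the mismatches as well as the score; it uses np.argwhere, and the variable name mismatches is illustrative.

```python
import numpy as np

m1 = np.array([[1, 0, 1], [0, 0, 1], [1, 1, 0]])
m2 = np.array([[1, 0, 1], [0, 0, 1], [1, 1, 1]])

# Each row of the result is the (row, column) index of one differing element.
mismatches = np.argwhere(m1 != m2)

for row, col in mismatches:
    print(f"({row}, {col}): {m1[row, col]} != {m2[row, col]}")
```

For the example matrices this reports a single mismatch at row 2, column 2 (zero-based), which is the last element that changed from 0 to 1.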


 