Adjacency Matrix from Numpy array using Euclidean Distance
Can someone please help me generate a weighted adjacency matrix from a numpy array, based on the Euclidean distance between all pairs of rows, i.e. 0 and 1, 0 and 2, ..., 1 and 2, ...?
Given the following example with an input matrix of shape (5, 4):
matrix = [[2,10,9,6],
[5,1,4,7],
[3,2,1,0],
[10, 20, 1, 4],
[17, 3, 5, 18]]
I would like to obtain a weighted adjacency matrix of shape (5, 5) containing only the smallest distances between nodes, i.e.,
if dist(row0, row1) = 10.77 and dist(row0, row2) = 12.84,
--> the output matrix will take the first distance as a column value.
I have already solved the first part, generating the full distance matrix, with the following code:
from scipy.spatial.distance import cdist
dist = cdist( matrix, matrix, metric='euclidean')
and I get the following result:
array([[ 0. , 10.77032961, 12.84523258, 15.23154621, 20.83266666],
[10.77032961, 0. , 7.93725393, 20.09975124, 16.43167673],
[12.84523258, 7.93725393, 0. , 19.72308292, 23.17326045],
[15.23154621, 20.09975124, 19.72308292, 0. , 23.4520788 ],
[20.83266666, 16.43167673, 23.17326045, 23.4520788 , 0. ]])
But I don't know yet how to specify the number of neighbors to keep for each node. For example, if we define the number of neighbors N = 2, then for each row we keep only the two neighbors with the two smallest distances, and we get as a result:
[[ 0. , 10.77032961, 12.84523258, 0, 0],
[10.77032961, 0. , 7.93725393, 0, 0],
[12.84523258, 7.93725393, 0. , 0, 0],
[15.23154621, 0, 19.72308292, 0. , 0 ],
[20.83266666, 16.43167673, 0, 0 , 0. ]]
Assuming a is your Euclidean distance matrix, you can use np.argpartition to choose the n min/max values per row. Keep in mind that the diagonal is always 0 and Euclidean distances are non-negative, so to keep the two closest points in each row, you need to keep the three smallest values per row (including the 0s on the diagonal). This does not hold if you want the largest values, however.
a[np.arange(a.shape[0])[:,None],np.argpartition(a, 3, axis=1)[:,3:]] = 0
Output:
array([[ 0. , 10.77032961, 12.84523258, 0. , 0. ],
[10.77032961, 0. , 7.93725393, 0. , 0. ],
[12.84523258, 7.93725393, 0. , 0. , 0. ],
[15.23154621, 0. , 19.72308292, 0. , 0. ],
[20.83266666, 16.43167673, 0. , 0. , 0. ]])
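The same idea can be wrapped in a small helper so that the number of neighbors becomes a parameter. This is only a sketch under a few assumptions: the name keep_nearest and its parameter n are made up for illustration, the helper returns a copy instead of modifying the matrix in place, and it relies on the diagonal being the zero self-distance. The distance matrix is computed with NumPy broadcasting here, standing in for cdist, so that the example depends on NumPy only.

```python
import numpy as np

def keep_nearest(dist, n):
    """Hypothetical helper: keep the n nearest neighbors per row, zero the rest."""
    out = np.array(dist, dtype=float)  # work on a copy, leave the input untouched
    k = n + 1                          # n neighbors plus the 0 on the diagonal
    if k >= out.shape[1]:
        return out                     # nothing to drop
    # Columns k.. of the partition hold the indices of everything but the k smallest.
    drop = np.argpartition(out, k, axis=1)[:, k:]
    out[np.arange(out.shape[0])[:, None], drop] = 0
    return out

matrix = np.array([[2, 10, 9, 6],
                   [5, 1, 4, 7],
                   [3, 2, 1, 0],
                   [10, 20, 1, 4],
                   [17, 3, 5, 18]], dtype=float)

# Pairwise Euclidean distances via broadcasting (assumed stand-in for cdist).
dist = np.linalg.norm(matrix[:, None, :] - matrix[None, :, :], axis=-1)

result = keep_nearest(dist, 2)
print(result)
```

With n = 2 each row should end up with exactly two non-zero entries, as in the output above.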
You can use this cleaner solution to get the n smallest values from a matrix. dist.argsort(1).argsort(1) creates a rank order over axis=1 (the smallest value gets rank 0 and the largest gets rank 4), and the <= 2 decides how many of the smallest values you keep from that rank order. np.where keeps those values and replaces everything else with 0. Try the following:
np.where(dist.argsort(1).argsort(1) <= 2, dist, 0)
array([[ 0. , 10.77032961, 12.84523258, 0. , 0. ],
[10.77032961, 0. , 7.93725393, 0. , 0. ],
[12.84523258, 7.93725393, 0. , 0. , 0. ],
[15.23154621, 0. , 19.72308292, 0. , 0. ],
[20.83266666, 16.43167673, 0. , 0. , 0. ]])
This works for any axis, and also if you want the n largest or n smallest values from a matrix.
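As a rough illustration of that last point, the rank trick can be parameterized by axis and direction. The helper name keep_by_rank and its arguments n, largest and axis are assumptions introduced for this sketch, not part of NumPy. Note that when selecting the smallest values, n counts the 0 on the diagonal, so two neighbors means n = 3. The distance matrix is again built with NumPy broadcasting in place of cdist.

```python
import numpy as np

def keep_by_rank(dist, n, largest=False, axis=1):
    """Hypothetical helper: keep the n smallest (or largest) values along axis."""
    dist = np.asarray(dist, dtype=float)
    # Double argsort turns values into ranks: 0 = smallest along the axis.
    ranks = dist.argsort(axis=axis).argsort(axis=axis)
    if largest:
        mask = ranks >= dist.shape[axis] - n
    else:
        mask = ranks < n
    return np.where(mask, dist, 0)

matrix = np.array([[2, 10, 9, 6],
                   [5, 1, 4, 7],
                   [3, 2, 1, 0],
                   [10, 20, 1, 4],
                   [17, 3, 5, 18]], dtype=float)
dist = np.linalg.norm(matrix[:, None, :] - matrix[None, :, :], axis=-1)

smallest = keep_by_rank(dist, 3)               # diagonal 0 plus two nearest neighbors
largest = keep_by_rank(dist, 2, largest=True)  # two farthest nodes per row
print(smallest)
print(largest)
```

The double argsort sorts each row twice, so for large matrices the np.argpartition approach is likely to be cheaper.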