How do I create a similarity matrix based on the code below?
I'm trying to use the gower function from this link: https://sourceforge.net/projects/gower-distance-4python/files/ . I'm applying it to my dataframe of categorical variables. However, when I use the gower_distances function, I get some non-zero values on the diagonal (I need them all to be 0).
I've been trying to debug the code. I think the problem occurs in the _gower_distance_row function. There is this line of code which I don't understand: sij_cat = np.where(xi_cat == xj_cat,np.zeros_like(xi_cat),np.ones_like(xi_cat)). I will present it in a format that is easier to understand.
Say I have:
import numpy as np
xi=np.array(['cat','dog','monkey'])
xj=np.array([['cat','dog','monkey'],['horse','dog','hairy']])
sij_cat = np.where(xi == xj,np.zeros_like(xi),np.ones_like(xi))
I get this as my result:
array([['', '', ''],
['1', '', '1']], dtype='<U6')
Since I am comparing cat with cat, I want to assign zero, and where the values differ (e.g. cat vs horse and monkey vs hairy) it should be 1. I don't understand why I am getting '' in the above result; I want zeroes there. How do I fix this?
np.logical_not(xi == xj).astype(int)
The output will be:
array([[0, 0, 0],
[1, 0, 1]])
Explanation: np.logical_not changes True to False and False to True, and astype(int) converts the booleans to 0 and 1.
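As for why the original line produces '': np.zeros_like and np.ones_like copy the dtype of the array they are given, and for a string array the "zero" value is the empty string while the "one" value is the string '1'. The sketch below reproduces that with the question's example data and shows two numeric alternatives. The helper categorical_mismatch_matrix is a hypothetical name introduced here for illustration; it is not part of the linked gower code, and it only covers unweighted categorical features with no missing values.

```python
import numpy as np

xi = np.array(['cat', 'dog', 'monkey'])
xj = np.array([['cat', 'dog', 'monkey'],
               ['horse', 'dog', 'hairy']])

# zeros_like / ones_like inherit the string dtype of xi, so they return
# strings ('' and '1') rather than the numbers 0 and 1.
str_zeros = np.zeros_like(xi)
str_ones = np.ones_like(xi)
print(str_zeros.dtype, str_ones.dtype)

# Option 1: compare with != and cast the boolean result to int.
sij_cat = (xi != xj).astype(int)
print(sij_cat)

# Option 2: keep np.where, but pass numeric scalars instead of *_like arrays.
sij_where = np.where(xi == xj, 0, 1)
print(sij_where)


def categorical_mismatch_matrix(X):
    """Hypothetical helper: pairwise fraction of mismatching features.

    X is an (n_rows, n_features) array of categorical values. The result is
    an (n_rows, n_rows) matrix whose diagonal is 0, because every row matches
    itself on all features.
    """
    X = np.asarray(X)
    # Broadcast (n, 1, p) against (1, n, p) to compare every pair of rows.
    return (X[:, None, :] != X[None, :, :]).mean(axis=2)


dist = categorical_mismatch_matrix(xj)
print(dist)
```

If a similarity rather than a distance is needed, 1 - dist gives a matrix with ones on the diagonal under the same assumptions.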