How to apply Cramér's V to a 2x2 matrix
I want to measure the association between variables, and Cramér's V works like a treat for matrices larger than 2x2. However, for matrices with low frequencies, it does not work well. For the following contingency matrix, I get 0.5 as the result. How can I account for this?
   1  2
a  2  0
b  0  2
Here is my code:
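import numpy as np
import scipy.stats as ss

def cramers_stat(confusion_matrix):
    # Chi-square statistic of the contingency table
    chi2 = ss.chi2_contingency(confusion_matrix)[0]
    # Total number of observations
    n = confusion_matrix.sum().sum()
    return np.sqrt(chi2 / (n * (min(confusion_matrix.shape) - 1)))

result = cramers_stat(confusion_matrix)
print(result)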
confusion_matrix is my input, in this case the matrix I mentioned above. I understand that for good results I need cell frequencies above 5, but for a perfect association like the one above I expected the result to be 1.
When you compute Cramér's V, you must compute chi2 without continuity correction. For a 2x2 matrix, chi2_contingency uses continuity correction by default, so you must tell it not to by passing the argument correction=False:
chi2 = ss.chi2_contingency(confusion_matrix, correction=False)[0]
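To see the effect on the table from the question, here is a minimal sketch that computes the statistic both ways. The names `cramers_v` and `table` are illustrative, and a plain NumPy array stands in for the original input (which may have been a pandas DataFrame), so treat it as one possible way to write the fix rather than the definitive one.

```python
import numpy as np
import scipy.stats as ss


def cramers_v(table, correction=False):
    """Cramér's V for a contingency table given as a 2-D array of counts."""
    table = np.asarray(table)
    # The first element of the result is the chi-square statistic.
    chi2 = ss.chi2_contingency(table, correction=correction)[0]
    # Total number of observations in the table.
    n = table.sum()
    return np.sqrt(chi2 / (n * (min(table.shape) - 1)))


# The 2x2 table from the question (perfect association).
table = np.array([[2, 0],
                  [0, 2]])

v_uncorrected = cramers_v(table)                 # correction=False
v_corrected = cramers_v(table, correction=True)  # SciPy's default for 2x2 tables
print(v_uncorrected, v_corrected)
```

Every expected count in this table is 1, so each |observed - expected| is 1. The continuity correction shrinks that to 0.5, giving chi2 = 4 × 0.25 = 1 and V = sqrt(1/4) = 0.5. Without the correction, chi2 = 4 and V = sqrt(4/4) = 1, which is the value expected for a perfect association.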