How do I build a Cramér's V matrix using the following code?

Question

I'm trying to create a heatmap/correlation matrix using Cramér's V. I found the code below to help with this, but itertools.combinations doesn't return the combination of a column with itself (e.g. 0,0 or 1,1), so my matrix is wrong: when a column is compared with itself the diagonal entries should be 1, but they are 0. All but 2 of my 20 columns are categorical, which is why I'm using Cramér's V.

import itertools

import numpy as np
import pandas as pd
import scipy.stats as ss


def cramers_corrected_stat(confusion_matrix):
    """Calculate Cramér's V statistic for categorical-categorical association.

    Uses the bias correction from Bergsma, Wicher,
    Journal of the Korean Statistical Society 42 (2013): 323-328.
    """
    chi2 = ss.chi2_contingency(confusion_matrix)[0]
    n = confusion_matrix.sum().sum()
    phi2 = chi2 / n
    r, k = confusion_matrix.shape
    phi2corr = max(0, phi2 - ((k - 1) * (r - 1)) / (n - 1))
    rcorr = r - ((r - 1) ** 2) / (n - 1)
    kcorr = k - ((k - 1) ** 2) / (n - 1)
    return np.sqrt(phi2corr / min((kcorr - 1), (rcorr - 1)))


cols = df.columns.to_list()
corrM = np.zeros((len(cols),len(cols)))
# there's probably a nice pandas way to do this
for col1, col2 in itertools.combinations(cols, 2):
    idx1, idx2 = cols.index(col1), cols.index(col2)
    corrM[idx1, idx2] = cramers_corrected_stat(pd.crosstab(df[col1], df[col2]))
    corrM[idx2, idx1] = corrM[idx1, idx2]
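
To illustrate why the diagonal stays at zero, here is a small sketch using three hypothetical column names (not from my real data). itertools.combinations only yields pairs of distinct elements, so the loop above never visits corrM[i, i]:

```python
import itertools

cols = ["a", "b", "c"]  # hypothetical column names, for illustration only

# combinations(cols, 2) yields each unordered pair of *distinct* columns once
pairs = list(itertools.combinations(cols, 2))
print(pairs)

# No (x, x) pair is ever produced, so the diagonal cells are never assigned
self_pairs = [(c1, c2) for c1, c2 in pairs if c1 == c2]
print(self_pairs)
```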

How do I fix this?

Answer 1

I wrote something that does just that: github.com/shakedzy/dython .

Look for associations under nominal.
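
If you would rather keep the code from the question, one possible fix is to start from an identity matrix instead of a zero matrix, so the diagonal is 1 by definition and only the off-diagonal pairs are computed. The following is a minimal sketch under that assumption, not dython's implementation; the helper name cramers_v_matrix and the example DataFrame are hypothetical.

```python
import itertools

import numpy as np
import pandas as pd
import scipy.stats as ss


def cramers_corrected_stat(confusion_matrix):
    """Bias-corrected Cramér's V, same formula as in the question."""
    chi2 = ss.chi2_contingency(confusion_matrix)[0]
    n = confusion_matrix.sum().sum()
    phi2 = chi2 / n
    r, k = confusion_matrix.shape
    phi2corr = max(0, phi2 - ((k - 1) * (r - 1)) / (n - 1))
    rcorr = r - ((r - 1) ** 2) / (n - 1)
    kcorr = k - ((k - 1) ** 2) / (n - 1)
    return np.sqrt(phi2corr / min(kcorr - 1, rcorr - 1))


def cramers_v_matrix(df):
    """Hypothetical helper: symmetric Cramér's V matrix with a unit diagonal."""
    cols = df.columns.to_list()
    # np.eye puts 1.0 on the diagonal; a column is fully associated with itself
    corr = pd.DataFrame(np.eye(len(cols)), index=cols, columns=cols)
    for col1, col2 in itertools.combinations(cols, 2):
        v = cramers_corrected_stat(pd.crosstab(df[col1], df[col2]))
        corr.loc[col1, col2] = v
        corr.loc[col2, col1] = v  # mirror into the lower triangle
    return corr


# Made-up example data; "color" and "size" are perfectly associated
df = pd.DataFrame({
    "color": ["red", "blue", "red", "green", "blue", "red", "green", "blue"],
    "size": ["S", "M", "S", "L", "M", "S", "L", "M"],
    "shape": ["round", "square", "round", "square",
              "round", "square", "round", "square"],
})
corr = cramers_v_matrix(df)
print(corr)
```

An alternative is itertools.combinations_with_replacement(cols, 2), which also yields each column paired with itself. Setting the diagonal directly avoids recomputing a known value, and it sidesteps the fact that scipy's chi2_contingency applies Yates' continuity correction to 2x2 tables by default, which can make the computed self-association of a two-category column differ from 1.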

0 2019-06-14 23:39:30
