Representing p-values from chi-square tests between multiple columns as a cross table in Python

Question

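My dataframe has 10 features. I applied a chi-square test and generated p-values for every pair of columns in the dataframe. I want to present the p-values as a cross grid over the features.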

Example: A, B and C are my features, and the p-values between them are (A,B) = 0.0001, (A,C) = 0.5, (B,C) = 0.0.

So I would like to see this as:

      A       B       C
  A   1       0.0001  0.5
  B   0.0001  1       0.0
  C   0.5     0.0     1

Please let me know if any other details are needed.

Answer 1

Assume you have the list of features as features = ['A','B','C',...] and the p-values as
p_values = {('A','B'):0.0001,('A','C'):0.5,...}

import pandas as pd

p_values = {('A','B'):0.0001,('A','C'):0.5}
features = ['A','B','C']
df = pd.DataFrame(columns=features)

for row in features:
    rowdf = [] # prepare a row for df
    for col in features:
        if row == col:
            rowdf.append(1) # (A,A) taken as 1
            continue
        try:
            rowdf.append(p_values[(row,col)]) # add the value from dictionary
        except KeyError:
            try:
                rowdf.append(p_values[(col, row)]) # look for pair like (B,A) if (A,B) not found
            except KeyError: # still not found, append None
                rowdf.append(None)

    df.loc[len(df)] = rowdf # write row in df


df.index = features # to make row names as A,B,C ...
print(df)
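
The same grid can also be filled without the nested try/except lookups. The following is only a minimal sketch of that idea, not the answerer's code: the helper name `p_value_matrix` is hypothetical, and it assumes each unordered pair appears at most once in the dictionary. Every entry is written to both (row, col) and (col, row), the diagonal is set to 1 as in the answer above, and pairs missing from the dictionary are left as NaN.

```python
import numpy as np
import pandas as pd


def p_value_matrix(p_values, features):
    """Build a symmetric features-by-features grid of p-values (hypothetical helper)."""
    # Start with NaN everywhere so pairs missing from the dict stay visibly empty.
    matrix = pd.DataFrame(np.nan, index=features, columns=features, dtype=float)
    for (a, b), p in p_values.items():
        # Mirror each value so that (A, B) and (B, A) agree.
        matrix.loc[a, b] = p
        matrix.loc[b, a] = p
    for f in features:
        # A feature compared with itself is taken as 1, as in the answer above.
        matrix.loc[f, f] = 1.0
    return matrix


p_values = {('A', 'B'): 0.0001, ('A', 'C'): 0.5, ('B', 'C'): 0.0}
features = ['A', 'B', 'C']
matrix = p_value_matrix(p_values, features)
print(matrix)
```

Because the result holds floats rather than a mix of numbers and None, it can be passed directly to rounding or plotting functions.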

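The question does not show how the p-values were computed. One possible way to obtain a dictionary of the shape used above is sketched below, under several assumptions: SciPy is installed, every column is categorical, and `scipy.stats.chi2_contingency` with its default settings (which apply Yates' continuity correction to 2x2 tables) is an acceptable test. The helper name `pairwise_chi2_p_values` and the toy data are invented for illustration only.

```python
from itertools import combinations

import pandas as pd
from scipy.stats import chi2_contingency


def pairwise_chi2_p_values(data):
    """Return {(col_a, col_b): p_value} for every pair of columns (hypothetical helper)."""
    p_values = {}
    for a, b in combinations(data.columns, 2):
        # Contingency table of observed counts for this pair of columns.
        table = pd.crosstab(data[a], data[b])
        # The second element of the result is the p-value.
        p_values[(a, b)] = chi2_contingency(table)[1]
    return p_values


# Toy categorical data, made up purely for this example.
data = pd.DataFrame({
    'A': ['x', 'x', 'y', 'y', 'x', 'y', 'x', 'y'],
    'B': ['u', 'u', 'v', 'v', 'u', 'v', 'u', 'v'],
    'C': ['m', 'n', 'm', 'n', 'n', 'm', 'm', 'n'],
})
p_values = pairwise_chi2_p_values(data)
print(p_values)
```

Each pair is computed once, as (A,B) but not (B,A), which matches the dictionary layout the answer assumes.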
Solution 1
0 Accepted 2020-10-28 07:48:32