Python: chi-squared contingency test (how to interpret the result)

Question

I've been practicing a chi-squared contingency test, shown below, but I'm having trouble interpreting the result. The test returns a p-value of 0. Does that mean the two variables are not independent? Since it's a small dataset, I was fairly sure the variables would be independent, and a p-value of exactly 0 seems strange. Did I do something wrong?

import pandas as pd
df = pd.DataFrame({
    "~60m2" : [54, 577, 143, 782],
    "60~85m2" : [2, 735, 1437, 1],
    "85m2~" : [0, 142, 44, 0],
    })
df.index = ["A", "B", "C", "D"]
df.columns.names = ["size"]
df.index.names = ["city"]

from scipy import stats
stats.chi2_contingency(df)

The output:

(2064.576731417199,
 0.0,
 6,
 array([[ 22.24559612,  31.09522594,   2.65917794],
        [577.59101353, 807.36533061,  69.04365586],
        [645.12228746, 901.76155221,  77.11616033],
        [311.04110288, 434.77789124,  37.18100587]]))
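For reference, the four values in that tuple are, in order, the chi-squared statistic, the p-value, the degrees of freedom, and the table of counts expected under independence. A minimal sketch of unpacking them by name follows; the variable names `chi2_stat`, `p_value`, `dof` and `expected` are illustrative choices, not part of the original code.

```python
import pandas as pd
from scipy import stats

# Same table as in the question: rows are cities, columns are size classes
df = pd.DataFrame(
    {
        "~60m2": [54, 577, 143, 782],
        "60~85m2": [2, 735, 1437, 1],
        "85m2~": [0, 142, 44, 0],
    },
    index=["A", "B", "C", "D"],
)

# The result unpacks into four values (names here are illustrative)
chi2_stat, p_value, dof, expected = stats.chi2_contingency(df)

# dof = (rows - 1) * (columns - 1) = 3 * 2 = 6
print(f"chi2 = {chi2_stat:.2f}, dof = {dof}, p = {p_value}")

# Expected counts under independence, with the original labels restored
print(pd.DataFrame(expected, index=df.index, columns=df.columns).round(2))
```

Comparing the observed table with the expected one cell by cell shows where the departure from independence comes from.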

Answer 1

I think the result is correct. Your cities are very different from one another. Try normalizing each row so that it sums to 1:

(df.T / df.sum(axis=1)).T                                             

size     ~60m2   60~85m2     85m2~
city                              
A     0.964286  0.035714  0.000000
B     0.396836  0.505502  0.097662
C     0.088054  0.884852  0.027094
D     0.998723  0.001277  0.000000

Each row is very different from the others, so yes, the cities do seem to differ, i.e. they appear to be sampled from different populations.
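On the p-value printing as exactly 0: the table has only 12 cells, but it holds several thousand observations, and with a statistic this large the true p-value is most likely just too small for a double-precision float, so it underflows to 0.0. The sketch below is a rough check under stated assumptions, not part of the original answer. It uses the closed form of the chi-squared survival function that holds for even degrees of freedom (dof is 6 here) to get the logarithm of the p-value, and it computes Cramér's V as a measure of how strong the association is. The names `observed`, `log10_p` and `cramers_v` are illustrative.

```python
import math

import numpy as np
from scipy import stats

# Observed counts: rows are cities A-D, columns are the three size classes
observed = np.array([
    [54, 2, 0],
    [577, 735, 142],
    [143, 1437, 44],
    [782, 1, 0],
])

chi2_stat, p_value, dof, expected = stats.chi2_contingency(observed)

# For even dof = 2m the survival function has a closed form:
#   sf(x) = exp(-x/2) * sum_{k=0}^{m-1} (x/2)^k / k!
# Taking the log keeps the exp(-x/2) factor from underflowing.
assert dof % 2 == 0, "this closed form only applies to even dof"
half = chi2_stat / 2
series = sum(half**k / math.factorial(k) for k in range(dof // 2))
log10_p = (-half + math.log(series)) / math.log(10)

# Cramér's V: sqrt(chi2 / (n * (min(rows, cols) - 1))), ranges from 0 to 1
n = observed.sum()
cramers_v = math.sqrt(chi2_stat / (n * (min(observed.shape) - 1)))

print(f"total observations n = {n}")
print(f"p-value as a float   = {p_value}")
print(f"log10(p-value)       = {log10_p:.1f}")
print(f"Cramér's V           = {cramers_v:.3f}")
```

A log10 value far below about -324 (the smallest positive double is roughly 5e-324) explains why the float prints as 0.0. Cramér's V is worth reporting next to the test because, with a sample this large, even a weak association would come out as significant.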

Answer 1 was accepted on 2021-05-14 12:52:12.
