
Statsmodels Z-test not working as intended (statsmodels.stats.weightstats.CompareMeans.ztest_ind)

Everything is formatted like on the Statsmodels website, but Spyder returns this error:

TypeError: ztest_ind() got multiple values for argument 'alternative'

My relevant input is this (the data frames are working fine):

ztest = statsmodels.stats.weightstats.CompareMeans.ztest_ind(df1['TOTAL'], df2['TOTAL'], alternative = 'two-sided', usevar = 'unequal', value = 0)

I am following the formatting on this website: https://www.statsmodels.org/devel/generated/statsmodels.stats.weightstats.CompareMeans.ztest_ind.html

The API documentation does not make it clear how to use this method. Below is the method syntax in the documentation (link provided at the end).

CompareMeans.ztest_ind(alternative='two-sided', usevar='pooled', value=0)
z-test for the null hypothesis of identical means

Parameters
x1 : array_like, 1-D or 2-D
first of the two independent samples, see notes for 2-D case

x2 : array_like, 1-D or 2-D
second of the two independent samples, see notes for 2-D case

At first glance, there is no option to pass the data on which we conduct the z-test. Although two parameters, x1 and x2, are mentioned, the method signature has no placeholders for them. It took some digging through the source code to figure out how to use it.

In the source code (link provided at the end), the docstring of ztest_ind() also lists the parameters x1 and x2, even though the signature itself does not accept them.

def ztest_ind(self, alternative="two-sided", usevar="pooled", value=0):
        """z-test for the null hypothesis of identical means

        Parameters
        ----------
        x1 : array_like, 1-D or 2-D
            first of the two independent samples, see notes for 2-D case
        x2 : array_like, 1-D or 2-D
            second of the two independent samples, see notes for 2-D case

The biggest hint here was the 'self' argument, which made it clear that ztest_ind() has to be invoked on an instance of a class that holds two array-like attributes, i.e. the two columns of data on which we wish to conduct the z-test.
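To see why the original call produces that TypeError, here is a minimal sketch using a made-up class, ToyCompareMeans, that only copies the ztest_ind signature (it is not the statsmodels class and computes nothing). When the method is called through the class, the first positional argument is bound to self and the second lands in the alternative slot, so the alternative keyword then arrives a second time.

```python
class ToyCompareMeans:
    """Hypothetical stand-in that only mirrors the ztest_ind signature."""

    def __init__(self, d1, d2):
        self.d1 = d1
        self.d2 = d2

    def ztest_ind(self, alternative="two-sided", usevar="pooled", value=0):
        # No statistics here; just echo the arguments that were received.
        return alternative, usevar, value


sample1 = [1.0, 2.0, 3.0]
sample2 = [2.0, 3.0, 4.0]

# Calling through the class: sample1 is bound to `self`, sample2 fills the
# first remaining positional slot (`alternative`), and the keyword argument
# then supplies `alternative` a second time.
try:
    ToyCompareMeans.ztest_ind(sample1, sample2, alternative="two-sided")
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

# Calling on an instance: `self` is supplied automatically, so the keyword
# arguments line up with the signature.
result = ToyCompareMeans(sample1, sample2).ztest_ind(alternative="two-sided")
print(result)
```

The exact wording of the message may vary slightly between Python versions, but it should end with "got multiple values for argument 'alternative'", which is the same complaint Spyder reported.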

If we look at the hierarchy leading up to ztest_ind(), we see that ztest_ind() needs to be invoked on an instance of the CompareMeans class:

statsmodels.stats.weightstats.CompareMeans.ztest_ind

So we need to instantiate an object of the CompareMeans class.

Now if we go to the CompareMeans class signature, it expects two parameters, which in turn are instances of the DescrStatsW class!

class CompareMeans(object):
    """class for two sample comparison

    The tests and the confidence interval work for multi-endpoint comparison:
    If d1 and d2 have the same number of rows, then each column of the data
    in d1 is compared with the corresponding column in d2.

    Parameters
    ----------
    d1, d2 : instances of DescrStatsW

Taking a look at the DescrStatsW class definition, we see that it expects a 1-D or 2-D array-like dataset.
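As a quick check of that step, the sketch below wraps a small made-up NumPy array in DescrStatsW and reads back two of its summary attributes. The array values are purely illustrative, and the snippet assumes numpy and statsmodels are installed.

```python
import numpy as np
import statsmodels.stats.weightstats as ws

# Made-up 1-D sample, for illustration only.
sample = np.array([1.0, 2.0, 3.0, 4.0])

# Wrap the raw data; with no weights given, every observation counts once.
d = ws.DescrStatsW(sample)

# Number of observations and their mean, as seen by the wrapper.
print(d.nobs, d.mean)
```

Two objects built this way, one per sample, are what CompareMeans takes as d1 and d2.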

Finally, putting this all together, we get a successful run of the z-test on a sample dataset, as shown below!

    import statsmodels.stats.weightstats as ws

    col1 = ws.DescrStatsW(df1['amount'])
    col2 = ws.DescrStatsW(df2['amount'])

    cm_obj = ws.CompareMeans(col1, col2)

    zstat, z_pval = cm_obj.ztest_ind(usevar='unequal')

    print(zstat.round(3), z_pval.round(3)) # --> 2.381 0.017
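For a sanity check on the numbers, the statistic can also be worked out by hand. The sketch below uses only the standard library and assumes the textbook unequal-variance formula, z = (mean1 - mean2 - value) / sqrt(s1^2/n1 + s2^2/n2) with sample variances, and a two-sided p-value from the standard normal distribution. The helper name ztest_unequal and the sample lists are made up for illustration. I would expect it to agree with ztest_ind(usevar='unequal') on unweighted 1-D data, but treat it as a cross-check rather than as the library's implementation.

```python
import math


def ztest_unequal(x1, x2, value=0.0):
    """Two-sided z-test for a difference in means, unequal variances."""
    n1, n2 = len(x1), len(x2)
    mean1 = sum(x1) / n1
    mean2 = sum(x2) / n2
    # Sample variances (divisor n - 1).
    var1 = sum((v - mean1) ** 2 for v in x1) / (n1 - 1)
    var2 = sum((v - mean2) ** 2 for v in x2) / (n2 - 1)
    # Standard error of the difference in means without pooling.
    std_diff = math.sqrt(var1 / n1 + var2 / n2)
    zstat = (mean1 - mean2 - value) / std_diff
    # Two-sided p-value: 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)).
    pvalue = math.erfc(abs(zstat) / math.sqrt(2))
    return zstat, pvalue


# Made-up samples, for illustration only.
a = [1.0, 2.0, 3.0, 4.0, 5.0]
b = [2.0, 3.0, 4.0, 5.0, 6.0]

zstat, pvalue = ztest_unequal(a, b)
print(round(zstat, 3), round(pvalue, 3))  # --> -1.0 0.317
```

Here both samples have variance 2.5 and five observations, so the standard error is exactly 1 and the statistic is just the difference in means.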

documentation

source code

