Python: using the Wald test to test the significance of predictors

I need to run logistic regression on some data. I have collected user features such as post types, number of friends, number of posts, number of uploaded photos, etc., and have clustered the users into several clusters. Now I want to use the Wald test to find out which predictors (among these user features) are significant for predicting the cluster a user belongs to, using binary logistic regression. For example, for cluster 1, cluster_label is 1 if the user belongs to cluster 1 and 0 for all other users.

I need to use wald_test to choose which predictors are significant for predicting the cluster label. For example, when predicting users in cluster 1, the Wald test might show that the number of friends and the number of uploaded photos have the highest Wald scores, so these two features are significant for predicting membership in cluster 1. For users in cluster 2, the Wald test might instead show that the number of posts and the number of shared news items are significant for predicting the cluster label.

The pandas DataFrame for predicting users in user cluster 1 looks like this:

NoPosts... Friends ...  postCluster0_ratio... postCluster4_ratio  cluster_label
 24     ...   89    ...       0.35         ...        0.3              1
 ...
 ...
 81     ...  161    ...       0.2          ...        0.15              0
 ...
 ...

When cluster_label is 1, the user belongs to user cluster 1; when cluster_label is 0, the user does not belong to cluster 1. I would like to use the Wald test to decide which predictors (NoPosts, Friends, ..., postCluster0_ratio, ...) are significant for predicting the user's cluster label. However, the documentation at

http://www.statsmodels.org/dev/generated/statsmodels.regression.linear_model.OLSResults.wald_test.html#statsmodels.regression.linear_model.OLSResults.wald_test

has no examples for wald_test in Python. I do not know what input wald_test requires or how to fit the model; in short, I do not know how to use wald_test for my case. Could you please show me how to use wald_test, ideally with code?

For individual tests (not joint hypotheses) you can use t_test, which is a special case of the Wald test: http://www.statsmodels.org/dev/generated/statsmodels.regression.linear_model.OLSResults.t_test.html

The p-values for testing whether each parameter is statistically significantly different from zero are precomputed and shown in summary(); see pvalues in http://www.statsmodels.org/dev/generated/statsmodels.regression.linear_model.RegressionResults.html

wald_test is the more general version of f_test for joint hypotheses; the f_test documentation has some examples that work the same way for wald_test: http://www.statsmodels.org/dev/generated/statsmodels.regression.linear_model.OLSResults.f_test.html
