
How to calculate the p-value for a one-tailed test in Python?

One Population Proportion

Research Question: In previous years 52% of parents believed that electronics and social media were the cause of their teenager's lack of sleep. Do more parents today believe that their teenager's lack of sleep is caused by electronics and social media?

Population: Parents with a teenager (age 13-18)
Parameter of Interest: p
Null Hypothesis: p = 0.52
Alternative Hypothesis: p > 0.52 (note that this is a one-sided test)

1018 Parents

56% believe that their teenager's lack of sleep is caused by electronics and social media

This is a one-tailed test, and according to the professor the p-value should be 0.0053. But when I calculate the p-value for z_statistic = 2.5545334262132955 in Python:

p_value=stats.distributions.norm.cdf(1-z_statistic)

this code gives 0.06 as output.

I know that stats.distributions.norm.cdf gives the probability to the left of the statistic, but the code above gives the wrong p-value.

But when I type: stats.distributions.norm.cdf(-z_statistic)

it gives 0.0053 as output.

How is this possible? Please help!

You can approximate the binomial distribution with the normal distribution since n*p > 30, and the z-score for a one-proportion test is:

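Z = (p - p0) / sqrt(p0 * (1 - p0) / n)

where p is the sample proportion, p0 is the proportion under the null hypothesis, and n is the sample size.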

So the calculation is:

import numpy as np
from scipy import stats
p0 = 0.52
p = 0.56
n = 1018
Z = (p-p0)/np.sqrt(p0*(1-p0)/n)

Z
2.5545334262132955

Your Z is correct. stats.norm.cdf(Z) gives you the cumulative probability up to Z, and since you need the probability of observing something more extreme than this, it is:

1-stats.norm.cdf(Z)
0.0053165109918223985

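The probability density function of the normal distribution is symmetric about zero, so 1-stats.norm.cdf(Z) is the same as stats.norm.cdf(-Z). Your original code, stats.distributions.norm.cdf(1-z_statistic), does the subtraction inside the function: it evaluates the CDF at about -1.55 instead of taking the complement of the probability, which is where the 0.06 comes from.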
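To make the difference concrete, here is a minimal sketch that puts the expression from the question next to three equivalent ways of getting the upper-tail probability. The variable names are only illustrative; stats.norm.sf is SciPy's survival function, defined as 1 - cdf.

```python
from scipy import stats

z_statistic = 2.5545334262132955  # value from the question

# What the question computed: the CDF evaluated at the point 1 - z (about -1.55).
wrong = stats.norm.cdf(1 - z_statistic)

# Upper-tail probability P(Z > z), written three equivalent ways.
upper_tail = 1 - stats.norm.cdf(z_statistic)   # complement of the left tail
by_symmetry = stats.norm.cdf(-z_statistic)     # symmetry of the normal density
by_sf = stats.norm.sf(z_statistic)             # survival function, 1 - cdf

print(wrong)                           # roughly 0.06
print(upper_tail, by_symmetry, by_sf)  # all roughly 0.0053
```

For very large Z the survival function tends to be the safer choice, since 1 - cdf(Z) loses precision once cdf(Z) gets close to 1.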

The question is formulated as a binomial problem: 1018 people make a yes/no decision with constant probability. In your case 570 out of 1018 people hold that belief, and that proportion is to be compared to 52%.

I do not know about Python, but I can confirm your teacher's result in R:

> binom.test(570, 1018, p = .52, alternative = "greater")

    Exact binomial test

data:  570 and 1018
number of successes = 570, number of trials = 1018, p-value = 0.005843
alternative hypothesis: true probability of success is greater than 0.52
95 percent confidence interval:
 0.533735 1.000000
sample estimates:
probability of success 
             0.5599214 

The fact that you are handling z-values leads me to believe that you had no Python problem but were using the wrong test, which is why I believe I can answer using R. You can find a binomial test implemented in Python here: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.binom_test.html
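For reference, a Python counterpart of the R call might look like the sketch below. It assumes SciPy 1.7 or later, where the function is called scipy.stats.binomtest; the older binom_test linked above has since been deprecated in favor of it. The count of 570 is taken from the R example.

```python
from scipy import stats

# Exact one-sided binomial test: 570 successes out of 1018 trials,
# null hypothesis p = 0.52, alternative p > 0.52.
result = stats.binomtest(570, 1018, p=0.52, alternative="greater")

print(result.pvalue)  # should agree with the R output, about 0.0058
```

The exact p-value differs slightly from the 0.0053 obtained with the normal approximation, because the z-test replaces the discrete binomial distribution with a continuous one.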
