Not having enough evidence to reject the null hypothesis doesn't mean the null hypothesis is necessarily true. Here I explain why, using an example.
Suppose we suspect that students in a certain college are more inclined to use drugs than U.S. college students in general. The proportion of drug users among college students in general is 0.157. We take two random samples, of 100 and 400 students, from the college. The proportion of drug users in both samples is 0.19 (19/100 and 76/400). Since this proportion is higher than the population proportion (0.157), can we declare that students in this college are more inclined to use drugs?
Hypothesis testing
Step 1: State the null hypothesis (H0) and the alternative hypothesis (Ha).
Step 2: Collect relevant data from a random sample and summarize them (using a test statistic)
2.1 - Check that the conditions under which the test can be reliably used (n*p >= 10 and n*(1-p) >= 10) are met.
2.2 - Calculate the test statistic
The test statistic describes how far the observed sample proportion (p^) is from the population proportion (p), measured in standard deviations: z = (p^ - p) / sqrt(p*(1-p)/n).
Note: When we draw random samples of size n from a population with proportion p, the possible values of the sample proportion (p^) follow a sampling distribution that is approximately normal, with mean p and standard deviation sqrt(p*(1-p)/n).
Step 3: Find the p-value, the probability of observing data at least as extreme as those observed, assuming that Ho is true.
Step 4: Based on the p-value, decide whether we have enough evidence to reject Ho (and accept Ha), and draw our conclusions in context.
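The four steps above can be sketched in R roughly as follows. This is only an illustration: the function name one_prop_z_test is hypothetical (it is not part of R or of this post), and it uses the plain normal approximation without a continuity correction, so its p-value will differ somewhat from the prop.test output used later.

```r
# Hypothetical helper: one-sided z-test for a single proportion (Ha: p > p0).
one_prop_z_test <- function(x, n, p0) {
  # Step 2.1: large-sample conditions, evaluated at the null proportion p0
  conditions_met <- (n * p0 >= 10) && (n * (1 - p0) >= 10)

  # Step 2.2: distance of the sample proportion from p0, in standard deviations
  p_hat <- x / n
  se <- sqrt(p0 * (1 - p0) / n)
  z <- (p_hat - p0) / se

  # Step 3: probability of a z at least this large if Ho is true
  p_value <- pnorm(z, lower.tail = FALSE)

  list(conditions_met = conditions_met, p_hat = p_hat,
       se = se, z = z, p_value = p_value)
}

# Step 4: compare the p-value with the chosen significance level (0.05 here)
res <- one_prop_z_test(x = 19, n = 100, p0 = 0.157)
res$p_value < 0.05
```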
Applying the test to our example
Step 1:
Ho - Proportion of drug users in the college is the same as the population proportion (p = p0)
Ha - Proportion of drug users in the college is higher than the population proportion (p > p0)
Steps 2 and 3:
Sample 1: n = 100; population proportion (p) = 0.157; standard deviation = 0.036; observed proportion (p^) = 0.19
np = 100 * 0.157 = 15.7
n(1-p) = 100 * (1 - 0.157) = 84.3
Sample 2: n = 400; population proportion (p) = 0.157; standard deviation = 0.018; observed proportion (p^) = 0.19
np = 400 * 0.157 = 62.8
n(1-p) = 400 * (1 - 0.157) = 337.2
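As a quick check on the figures for the two samples, the standard deviations and the large-sample conditions can be computed directly. This sketch assumes the conditions are evaluated at the null proportion 0.157; the variable names are only illustrative.

```r
p0 <- 0.157
n <- c(100, 400)

# Standard deviation of the sampling distribution of p^ if Ho is true
se <- sqrt(p0 * (1 - p0) / n)
round(se, 3)   # 0.036 for n = 100, 0.018 for n = 400

# Both conditions should hold for each sample size
n * p0 >= 10
n * (1 - p0) >= 10
```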
Calculate the test statistic and the p-value in R as below.
> # Sample 1
> p_1 <- prop.test(x=19, n=100, p=0.157, alternative = "greater", conf.level = 0.95, correct = TRUE)
> p_1
1-sample proportions test with continuity correction
data: 19 out of 100, null probability 0.157
X-squared = 0.59236, df = 1, p-value = 0.2208
alternative hypothesis: true p is greater than 0.157
95 percent confidence interval:
0.1297316 1.0000000
sample estimates:
p
0.19
> # Sample 2
> p_2 <- prop.test(x=76, n=400, p=0.157, alternative = "greater", conf.level = 0.95, correct = TRUE)
> p_2
1-sample proportions test with continuity correction
data: 76 out of 400, null probability 0.157
X-squared = 3.0466, df = 1, p-value = 0.04045
alternative hypothesis: true p is greater than 0.157
95 percent confidence interval:
0.1586989 1.0000000
sample estimates:
p
0.19
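To see where the X-squared and p-value figures in this output come from, they can be reproduced by hand. The sketch below assumes the usual continuity-corrected statistic for a one-sample proportion test and a one-sided p-value from the upper tail of the standard normal distribution, which applies here because p^ is above 0.157 in both samples.

```r
x <- c(19, 76)
n <- c(100, 400)
p0 <- 0.157

# Continuity-corrected chi-squared statistic: subtract 0.5 from |observed - expected|
chisq <- (abs(x - n * p0) - 0.5)^2 / (n * p0 * (1 - p0))

# One-sided p-value: upper tail at z = sqrt(chisq)
p_value <- pnorm(sqrt(chisq), lower.tail = FALSE)

round(chisq, 4)
round(p_value, 4)

# Cross-check against prop.test for sample 1
all.equal(p_value[1],
          prop.test(19, 100, p = p0, alternative = "greater")$p.value)
```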
For sample 1 (n = 100, p-value = 0.22 > 0.05), if the true proportion of drug users in the college were 0.157, we would still have about a 22% chance of drawing a sample of 100 students in which 19% or more are drug users. Thus, we do not have enough evidence to reject Ho, or to state that the 'proportion of drug users in the college is higher than the population proportion'. Does that mean we can accept the null hypothesis?
For sample 2 (n = 400, p-value = 0.04 < 0.05), if the true proportion were 0.157, there would be only about a 4% chance of drawing a sample of 400 students in which 19% or more are drug users. Now we have enough evidence to reject Ho and state that the 'proportion of drug users in the college is higher than the population proportion'.
Therefore, when the p-value is higher than 0.05, we can never accept Ho; we can only state that we do not have enough evidence to reject it. The sample may simply have been too small to detect a statistically significant difference: a larger sample with the same observed proportion can provide enough evidence to reject Ho. For a fixed observed difference, the standard deviation of the sampling distribution shrinks as the sample size grows, so the p-value gets smaller.
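The effect of sample size can be illustrated by holding the observed proportion at 0.19 and varying n. In the sketch below, the sample sizes 200 and 800 (with 38 and 152 drug users) are hypothetical values added only for illustration; they are not data from the example.

```r
p0 <- 0.157
n <- c(100, 200, 400, 800)
x <- c(19, 38, 76, 152)   # each gives an observed proportion of 0.19

# Same one-sided test as in the example, applied to each sample size
p_values <- mapply(function(x_i, n_i) {
  prop.test(x = x_i, n = n_i, p = p0,
            alternative = "greater", correct = TRUE)$p.value
}, x, n)

data.frame(n = n, p_hat = x / n, p_value = round(p_values, 4))
```

With the observed proportion fixed, the p-value falls steadily as n grows, crossing 0.05 somewhere between n = 200 and n = 400.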