Hypothesis Testing: Two Proportions (aka Difference of Proportions)


Introduction

Two-proportion hypothesis tests are used when:

  • You are comparing two different populations
  • You have TWO proportions from TWO INDEPENDENT random samples

For example, as a researcher, you might want to know if there is a difference in the proportion of males who use Facebook and the proportion of females who use Facebook. A quality control specialist might also want to know if there is a difference in the percentage of defective items produced by two different machines.

A few symbols need to be defined before we dive in:

  • \(\hat{p_1}\) and \(\hat{p_2}\) refer to the sample proportions from the two samples, which are used to test the null hypothesis.
  • \(\hat{p_c}\) refers to the combined proportion (formula given in Step 4)
  • \(\hat{q_c}\) refers to 1 minus the combined proportion, i.e. \(1 - \hat{p_c}\)
  • \(n_1\) and \(n_2\) refer to the sample sizes.

 

Example

A columnist claims that women are more safety-conscious than men when it comes to driving. A recent survey on use of seatbelts was done among a random sample of 150 men and 250 women. Based on the results, 105 men said they always wear seatbelts when driving and 186 women said the same. Using a 0.05 level of significance, do the results of the survey support the columnist’s claim?

 

Step 1: Name Test: 2-Proportions / Difference of Proportions

Step 2: Define Test:

The null hypothesis assumes that the proportions are equal  (\(H_0 : p_1 = p_2\))

With this null hypothesis, the options for the alternative hypothesis are as follows: 

  • Left-Sided Test: \(H_0: p_1 = p_2\) versus \(H_A: p_1 < p_2\)
  • Two-Sided Test: \(H_0: p_1 = p_2\) versus \(H_A: p_1 \neq p_2\)
  • Right-Sided Test: \(H_0: p_1 = p_2\) versus \(H_A: p_1 > p_2\)

 

In this case, let's call the proportion of men who always wear seatbelts \(p_M\) and the proportion of women who always wear seatbelts \(p_W\). The columnist's claim is that women are more safety-conscious than men, so under the alternative hypothesis women should have higher seatbelt usage: \(p_W > p_M\).

\(H_0 : p_W = p_M\)

\(H_A : p_W > p_M\)

 

Step 3: Assume \(H_0\) is true and define its normal distribution. Then check the conditions.

1. The data is drawn from TWO independent random samples.

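2a.  From Sample 1: \(N_1 \geq 10n_1\), where \(N_1\) is the size of the population the sample was drawn from

2b.  From Sample 2: \(N_2 \geq 10n_2\), where \(N_2\) is the size of the population the sample was drawn from

3a.  From Sample 1: \(n_1 \hat{p_1} \geq 10\) and \(n_1 \hat{q_1} \geq 10\), where \(\hat{q_1} = 1 - \hat{p_1}\)

3b.  From Sample 2: \(n_2 \hat{p_2} \geq 10\) and \(n_2 \hat{q_2} \geq 10\), where \(\hat{q_2} = 1 - \hat{p_2}\)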
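
As a rough illustration, conditions 3a and 3b can be checked for the seatbelt survey with a few lines of Python. The helper name `success_failure_ok` is hypothetical, introduced only for this sketch. Since \(n \hat{p}\) is simply the number of "yes" answers and \(n \hat{q}\) the number of "no" answers, the check works directly on the counts from the Example.

```python
def success_failure_ok(successes, n, threshold=10):
    """Check n * p_hat >= threshold and n * q_hat >= threshold.

    n * p_hat equals the success count and n * q_hat equals the
    failure count, so the raw counts can be compared directly.
    """
    failures = n - successes
    return successes >= threshold and failures >= threshold

# Counts taken from the seatbelt Example
men_ok = success_failure_ok(105, 150)     # 105 always wear, 45 do not
women_ok = success_failure_ok(186, 250)   # 186 always wear, 64 do not

print("Men:", men_ok, "Women:", women_ok)
```

Conditions 2a and 2b are not checked here because the Example does not state the population sizes; the sketch assumes they are far larger than the samples.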

 

Step 4: Using the normal distribution, calculate the test statistic and p-value.

Although the full formula is  \(z = {( \hat{p_1} - \hat{p_2} ) - ( p_1 - p_2) \over \sqrt { {\hat{p_c}\hat{q_c} \over n_1} + {\hat{p_c}\hat{q_c} \over n_2} } }\) , it can be simplified. Recall that the null is \(H_0 : p_1 = p_2\). Thus, \(p_1 - p_2 = 0\) . This leaves us with the formula below:  

Test Statistic (Difference of 2 Proportions):  \(z = {( \hat{p_1} - \hat{p_2} ) \over \sqrt { {\hat{p_c}\hat{q_c} \over n_1} + {\hat{p_c}\hat{q_c} \over n_2} } }\)  

Now, let's consider how to calculate the combined proportion \(\hat{p_c}\). Recall that the proportion \(\widehat{p}\) of a sample having a certain attribute is given by \(\widehat{p} = {x \over n}\) , where \(x\) is the number of elements in the sample possessing that certain attribute and \(n\) is the sample size. Thus, the combined proportion \(\hat{p_c}\) is calculated as follows: 

Combined Proportion:  \(\hat{p_c} = {x_1 + x_2 \over n_1 + n_2}\) 
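
The combined proportion and test statistic formulas above can be sketched in Python as follows. The function name `two_proportion_z` is hypothetical (it does not come from any library), and the inputs are the counts from the seatbelt Example, with the women as Sample 1 so that the difference matches \(H_A : p_W > p_M\).

```python
from math import sqrt

def two_proportion_z(x1, n1, x2, n2):
    """Return (combined proportion, z) for the null hypothesis p1 = p2."""
    p1_hat = x1 / n1                      # sample proportion, Sample 1
    p2_hat = x2 / n2                      # sample proportion, Sample 2
    p_c = (x1 + x2) / (n1 + n2)           # combined (pooled) proportion
    q_c = 1 - p_c
    standard_error = sqrt(p_c * q_c / n1 + p_c * q_c / n2)
    z = (p1_hat - p2_hat) / standard_error
    return p_c, z

# Sample 1 = women (186 of 250), Sample 2 = men (105 of 150)
p_c, z = two_proportion_z(186, 250, 105, 150)
print(f"combined proportion = {p_c:.4f}, z = {z:.3f}")
```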

Test Statistic:

\(\hat{p_W} = {186 \over 250} = 0.744\)  and  \(\hat{p_M} = {105 \over 150} = 0.70\)

\(\hat{p_c} = {186 + 105 \over 250 + 150} = 0.7275\)

→    \(z = {( 0.744 - 0.70 ) \over \sqrt { {(0.7275)(0.2725) \over 250} + {(0.7275)(0.2725) \over 150} } }\)  →  \(z = {0.044 \over 0.04598} \approx 0.957\)

P-Value:

The p-value will be found by using the normal cdf function on your calculator: 

  • lower limit: \(z\)
  • upper limit: 999
  • distribution center: 0
  • standard deviation: 1
  • All together, it looks like this: normalcdf (\(z\), 999, 0, 1)

*Note: These limits are for a right-sided test. For a left-sided test, the lower limit would be -999 and the upper limit would be the test statistic (\(z\)). For a two-sided test, find the tail area beyond \(|z|\) and double it.

In this case, we do normalcdf (0.957, 999, 0, 1) to get a p-value of about 0.17.
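
The calculator step can also be reproduced in Python as a rough sketch. `NormalDist` comes from the standard-library `statistics` module (Python 3.8 or later); the helper name `normal_p_value` and its `tail` argument are hypothetical, introduced only for illustration. The test statistic is recomputed from the survey counts so the block runs on its own.

```python
from math import sqrt
from statistics import NormalDist

def normal_p_value(z, tail="right"):
    """Tail area under the standard normal curve for test statistic z."""
    standard_normal = NormalDist(mu=0, sigma=1)
    if tail == "right":
        return 1 - standard_normal.cdf(z)             # area above z
    if tail == "left":
        return standard_normal.cdf(z)                 # area below z
    if tail == "two-sided":
        return 2 * (1 - standard_normal.cdf(abs(z)))  # both tails beyond |z|
    raise ValueError("tail must be 'right', 'left', or 'two-sided'")

# Recompute z from the seatbelt counts (women = Sample 1, men = Sample 2)
p_c = (186 + 105) / (250 + 150)
z = (186 / 250 - 105 / 150) / sqrt(p_c * (1 - p_c) * (1 / 250 + 1 / 150))

p_value = normal_p_value(z, tail="right")
print(f"z = {z:.3f}, p-value = {p_value:.3f}")
```

The right tail is used because the alternative hypothesis is \(p_W > p_M\); using the exact cdf avoids having to pick a stand-in such as 999 for infinity.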

 

Step 5: Analyze your results and determine if they are statistically significant. 

The p-value calculated in Step 4 is greater than the significance level of 0.05. Therefore, we FAIL to reject the null hypothesis. The data does NOT support the columnist’s claim that women are more likely than men to always wear a seatbelt when driving.