STATS4STEM

Hypothesis Testing: Two Proportions (aka Difference of Proportions)

Introduction

Two Proportions hypothesis tests are used when...

You are comparing two different populations
You have TWO proportions from TWO INDEPENDENT random samples

For example, as a researcher, you might want to know if there is a difference in the proportion of males who use Facebook and the proportion of females who use Facebook. A quality control specialist might also want to know if there is a difference in the percentage of defective items produced by two different machines.

A few symbols need to be defined before we dive in:

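\(\hat{p_1}\) and \(\hat{p_2}\) refer to the sample proportions. They estimate the population proportions \(p_1\) and \(p_2\) and are the evidence you will use to test the null hypothesis.
\(\hat{p_c}\) refers to the combined (pooled) proportion (its formula is given in Step 4 of the example)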
\(\hat{q_c}\) refers to 1 minus the combined proportion, i.e. \(1 - \hat{p_c}\)
\(n_1\) and \(n_2\) refer to the sample sizes.

Example

A columnist claims that women are more safety-conscious than men when it comes to driving. A recent survey on use of seatbelts was done among a random sample of 150 men and 250 women. Based on the results, 105 men said they always wear seatbelts when driving and 186 women said the same. Using a 0.05 level of significance, do the results of the survey support the columnist’s claim?

Step 1: Name Test: 2-Proportions / Difference of Proportions

Step 2: Define Test:

The null hypothesis assumes that the proportions are equal (\(H_0 : p_1 = p_2\))

With this null hypothesis, the options for the alternative hypothesis are as follows:

Left-Sided Test: \(H_0: p_1 = p_2\) versus \(H_A: p_1 < p_2\)

Two-Sided Test: \(H_0: p_1 = p_2\) versus \(H_A: p_1 \neq p_2\)

Right-Sided Test: \(H_0: p_1 = p_2\) versus \(H_A: p_1 > p_2\)

In this case, let's call the proportion of men who wear seatbelts \(p_M\) and the proportion of women who wear seatbelts \(p_W\). If the alternative hypothesis is that women are more safety-conscious than men, then women should have a higher seatbelt usage and \(p_W > p_M\).

\(H_0 : p_W = p_M\)

\(H_A : p_W > p_M\)

Step 3: Assume \(H_0\) is true and define its normal distribution. Then check the conditions.

1. The data is drawn from TWO independent random samples.

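2a. From Sample 1: \(N_1 \geq 10n_1\), where \(N_1\) is the size of the population that Sample 1 was drawn from

2b. From Sample 2: \(N_2 \geq 10n_2\), where \(N_2\) is the size of the population that Sample 2 was drawn from

3a. From Sample 1: \(n_1 \hat{p_1} \geq 10\) and \(n_1 \hat{q_1} \geq 10\), where \(\hat{q_1} = 1 - \hat{p_1}\)

3b. From Sample 2: \(n_2 \hat{p_2} \geq 10\) and \(n_2 \hat{q_2} \geq 10\), where \(\hat{q_2} = 1 - \hat{p_2}\)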
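The example never checks these conditions explicitly, so here is a minimal Python sketch that does so for the seatbelt survey. The helper name `check_conditions` and the population size of 1,000,000 drivers per group are assumptions made for illustration only; the survey description does not state the population sizes. Note that \(n\hat{p}\) is simply the number of "yes" answers and \(n\hat{q}\) is the number of "no" answers.

```python
def check_conditions(successes, n, population_size):
    """Check the Step 3 conditions for one sample (hypothetical helper)."""
    failures = n - successes  # n * q_hat is the count of "no" answers
    return {
        "10% condition (N >= 10n)": population_size >= 10 * n,
        "n * p_hat >= 10": successes >= 10,
        "n * q_hat >= 10": failures >= 10,
    }


# ASSUMPTION: the population of drivers in each group is taken to be
# 1,000,000. The source does not give this number.
ASSUMED_POPULATION = 1_000_000

men = check_conditions(successes=105, n=150, population_size=ASSUMED_POPULATION)
women = check_conditions(successes=186, n=250, population_size=ASSUMED_POPULATION)

for label, checks in (("men", men), ("women", women)):
    print(label, checks)
```

Under that population assumption, every check comes back `True` for both samples. Whether the two samples are truly independent and random (condition 1) has to be judged from how the survey was run, not from the numbers.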

Step 4: Using the normal distribution, calculate the test statistic and p-value.

Although the full formula is \(z = {( \hat{p_1} - \hat{p_2} ) - ( p_1 - p_2) \over \sqrt { {\hat{p_c}\hat{q_c} \over n_1} + {\hat{p_c}\hat{q_c} \over n_2} } }\) , it can be simplified. Recall that the null is \(H_0 : p_1 = p_2\). Thus, \(p_1 - p_2 = 0\) . This leaves us with the formula below:

Test Statistic (Difference of 2 Proportions): \(z = {( \hat{p_1} - \hat{p_2} ) \over \sqrt { {\hat{p_c}\hat{q_c} \over n_1} + {\hat{p_c}\hat{q_c} \over n_2} } }\)

Now, let's consider how to calculate the combined proportion \(\hat{p_c}\). Recall that the proportion \(\widehat{p}\) of a sample having a certain attribute is given by \(\widehat{p} = {x \over n}\) , where \(x\) is the number of elements in the sample possessing that certain attribute and \(n\) is the sample size. Thus, the combined proportion \(\hat{p_c}\) is calculated as follows:

Combined Proportion: \(\hat{p_c} = {x_1 + x_2 \over n_1 + n_2}\)
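As a rough sketch, the two formulas above translate into Python as follows. The function names `pooled_proportion` and `two_proportion_z` are illustrative, and the counts in the usage lines (60 of 100 versus 50 of 100) are made-up numbers chosen only to show the call, not data from the seatbelt survey.

```python
from math import sqrt


def pooled_proportion(x1, n1, x2, n2):
    """Combined proportion: total successes over total sample size."""
    return (x1 + x2) / (n1 + n2)


def two_proportion_z(x1, n1, x2, n2):
    """Test statistic for H0: p1 = p2, using the combined proportion."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    p_c = pooled_proportion(x1, n1, x2, n2)
    q_c = 1 - p_c
    standard_error = sqrt(p_c * q_c / n1 + p_c * q_c / n2)
    return (p1_hat - p2_hat) / standard_error


# Hypothetical counts, for illustration only.
z_demo = two_proportion_z(x1=60, n1=100, x2=50, n2=100)
print(round(z_demo, 2))  # 1.42
```

The order of the arguments matters: the sample passed first plays the role of \(\hat{p_1}\), so it should match the order used in the alternative hypothesis.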

Test Statistic:

\(\hat{p_W} = {186 \over 250} = 0.744\) and \(\hat{p_M} = {105 \over 150} = 0.70\)

\(\hat{p_c} = {186 + 105 \over 250 + 150} = 0.7275\)

→ \(z = {( 0.744 - 0.70 ) \over \sqrt { {(0.7275)(0.2725) \over 250} + {(0.7275)(0.2725) \over 150} } }\) → \(z \approx 0.957\)

P-Value:

The p-value will be found by using the normal cdf function on your calculator:

lower limit: \(z\)
upper limit: 999
distribution center: 0
standard deviation: 1
All together, it looks like this: normalcdf (\(z\), 999, 0, 1)

*Note: If it were a left-sided test, your lower limit would be -999 and your upper limit would be the test statistic (\(z\)).

In this case, we do normalcdf (0.957, 999, 0, 1) to get a p-value of about 0.17.

Step 5: Analyze your results and determine if they are statistically significant.

We calculated a p-value of about 0.17. This p-value is greater than the significance level of 0.05. Therefore, we FAIL to reject the null hypothesis. The data do NOT support the columnist’s claim that a higher proportion of women than men always wear seatbelts when driving.
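For readers without a graphing calculator, the sketch below reproduces Steps 4 and 5 for the seatbelt survey using only the Python standard library. It assumes that the calculator's normalcdf with center 0 and standard deviation 1 can be replaced by the standard normal tail area computed from `math.erfc`. The names `upper_tail_p_value`, `ALPHA` and `reject_null` are illustrative.

```python
from math import erfc, sqrt


def upper_tail_p_value(z):
    """Area under the standard normal curve to the right of z.

    Plays the role of normalcdf(z, 999, 0, 1) for a right-sided test.
    """
    return 0.5 * erfc(z / sqrt(2))


# Survey counts from the example: women are sample 1 because H_A is p_W > p_M.
x_w, n_w = 186, 250
x_m, n_m = 105, 150
ALPHA = 0.05  # significance level stated in the example

p_w_hat = x_w / n_w
p_m_hat = x_m / n_m
p_c = (x_w + x_m) / (n_w + n_m)  # combined proportion
q_c = 1 - p_c

z = (p_w_hat - p_m_hat) / sqrt(p_c * q_c / n_w + p_c * q_c / n_m)
p_value = upper_tail_p_value(z)
reject_null = p_value < ALPHA

print(f"z = {z:.3f}, p-value = {p_value:.3f}, reject H0: {reject_null}")
```

For a left-sided test the matching tail area would be `0.5 * erfc(-z / sqrt(2))`, which mirrors swapping the calculator limits to -999 and \(z\).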