STATS4STEM

Confidence Intervals: 2 Sample Mean (aka Difference of Means)

Introduction

2 Sample Mean confidence intervals are used when...

You take TWO random samples and compare their sample means.
The data from each sample is INDEPENDENT of the other.

The goal is to take two groups and compare them with respect to some outcome. For example, you might compare cholesterol levels between children and adults, or BMI between football players and soccer players.

Unlike with the Matched Pairs confidence interval, there is not a "pairing" of data. There does not have to be an equal amount of data from each sample, and the order does not matter. The two samples are completely independent, and one has no bearing on the other.

Example

You are interested in the time it takes high school boys to get ready in the morning versus the time it takes high school girls. You take a random sample of 48 boys and another random sample of 43 girls, and then record their times. You find that it takes boys 22.6 minutes on average with a standard deviation of 7.8 minutes. It takes girls 38.2 minutes on average with a standard deviation of 8.7 minutes. Construct a 90% confidence interval for the difference in mean times (boys - girls).

Step 1: Name the Confidence Interval: 2 Sample Mean / Difference of Means

Step 2: Check the Conditions

1. The data is drawn from TWO independent random samples.

2. Both sampling distributions are approximately normal.

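Note: For small samples, this can be checked with a normal probability plot of each sample. Here both samples are large (48 and 43), so the Central Limit Theorem makes the sampling distributions of the sample means approximately normal.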

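3a. \(N_1 \geq 10n_1\) (the first population is at least 10 times as large as the first sample)

3b. \(N_2 \geq 10n_2\) (the second population is at least 10 times as large as the second sample)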

Step 3: Construct the Interval (Apply the Formula)

2 Sample Mean / Difference of Means Confidence Interval Formula: \((\overline{x}_1-\overline{x}_2) \pm t^{*} \sqrt{\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}}\)
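To see the pieces of the formula with the example's numbers, here is a minimal Python sketch that computes the point estimate \(\overline{x}_1-\overline{x}_2\) and the standard error under the square root. The variable names are made up for illustration; only the summary statistics come from the example.

```python
import math

# Summary statistics from the example (boys are sample 1, girls are sample 2)
mean_boys, sd_boys, n_boys = 22.6, 7.8, 48
mean_girls, sd_girls, n_girls = 38.2, 8.7, 43

# Point estimate: difference of the sample means (boys - girls)
point_estimate = mean_boys - mean_girls

# Standard error: sqrt(s1^2/n1 + s2^2/n2)
standard_error = math.sqrt(sd_boys**2 / n_boys + sd_girls**2 / n_girls)

print(round(point_estimate, 2), round(standard_error, 3))
```

The margin of error is then \(t^*\) times this standard error, so the only missing piece is \(t^*\).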

Although it is important to know the formula and feel comfortable using it, you will typically use your calculator to construct these confidence intervals. Because each sample has its own size and its own standard deviation, the degrees of freedom (which you need in order to pick the correct \(t^*\) value) are tedious to calculate by hand. Luckily, there is a calculator function that will give you the degrees of freedom and the entire confidence interval!
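If you are curious what that calculation looks like, the sketch below uses the Welch-Satterthwaite approximation, which is the standard degrees-of-freedom formula for an unpooled two-sample t interval. This page does not say which formula the calculator uses, so treat that as an assumption; the function name `welch_df` is also just an illustrative choice.

```python
def welch_df(sd1, n1, sd2, n2):
    """Approximate degrees of freedom for an unpooled two-sample t procedure
    (Welch-Satterthwaite approximation)."""
    v1 = sd1**2 / n1  # estimated variance of the first sample mean
    v2 = sd2**2 / n2  # estimated variance of the second sample mean
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

# Example from this page: 48 boys (s = 7.8) and 43 girls (s = 8.7)
df = welch_df(7.8, 48, 8.7, 43)
print(round(df, 2))
```

Notice that the result is usually not a whole number, which is one reason a calculator or software is the practical choice here.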

Calculator Steps

1. Hit STAT

2. Scroll right to TESTS

3. Scroll down to 0: 2-SampTInterval

4. Input \(\overline{x}_1, s_1, n_1, \overline{x}_2, s_2, n_2\) and the confidence level. Select No for Pooled.

5. Hit Calculate, and voila!

After going through those steps, you should have found the degrees of freedom to equal 84.92 and gotten a confidence interval of (-18.49, -12.71).
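If you do not have a calculator handy, the same interval can be approximated in Python. The sketch below is one possible approach, not the calculator's actual algorithm: it assumes the Welch-Satterthwaite degrees of freedom and finds \(t^*\) numerically (Simpson's rule for the area under the t curve, then bisection), using only the standard library. All function names are hypothetical.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x) for x >= 0: Simpson's rule on [0, x] plus the 0.5 below zero."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_critical(confidence, df):
    """Find t* so that the central area under the t curve equals `confidence`."""
    target = 0.5 + confidence / 2  # e.g. 0.95 for a 90% interval
    lo, hi = 0.0, 50.0
    for _ in range(60):  # bisection on the CDF
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def two_sample_t_interval(mean1, sd1, n1, mean2, sd2, n2, confidence):
    """Unpooled two-sample t interval for (mean1 - mean2) from summary statistics."""
    v1, v2 = sd1**2 / n1, sd2**2 / n2
    se = math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    margin = t_critical(confidence, df) * se
    diff = mean1 - mean2
    return diff - margin, diff + margin, df

lower, upper, df = two_sample_t_interval(22.6, 7.8, 48, 38.2, 8.7, 43, 0.90)
print(round(lower, 2), round(upper, 2), round(df, 2))
```

The printed values should agree with the calculator results above to two decimal places.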

Step 4: State the Conclusion

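Based on the data, I am 90% confident that the difference between the mean time it takes high school boys to get ready and the mean time it takes high school girls (boys - girls) is between -18.49 and -12.71 minutes. In other words, boys take between 12.71 and 18.49 minutes less than girls, on average.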