Chapter 3: Q45 (page 112)

Why Divide by n − 1? Let a population consist of the values 9 cigarettes, 10 cigarettes, and 20 cigarettes smoked in a day (based on data from the California Health Interview Survey). Assume that samples of two values are randomly selected with replacement from this population. (That is, a selected value is replaced before the second selection is made.)
a. Find the variance $σ^{2}$ of the population {9 cigarettes, 10 cigarettes, 20 cigarettes}.
b. After listing the nine different possible samples of two values selected with replacement, find the sample variance $s^{2}$ (which includes division by n - 1) for each of them; then find the mean of the nine sample variances $s^{2}$ .
c. For each of the nine different possible samples of two values selected with replacement, find the variance by treating each sample as if it is a population (using the formula for population variance, which includes division by n); then find the mean of those nine population variances.
d. Which approach results in values that are better estimates of $σ^{2}$: part (b) or part (c)? Why? When computing variances of samples, should you use division by n or n - 1?
e. The preceding parts show that $s^{2}$ is an unbiased estimator of $σ^{2}$ . Is s an unbiased estimator of $σ$ ? Explain.

Short Answer

(a) Population variance $(σ^{2})$ is equal to 24.7.

(b) ${s_{1}}^{2}$ = 0.5, ${s_{2}}^{2}$ = 60.5, ${s_{3}}^{2}$ = 50.0, ${s_{4}}^{2}$ = 0.0, ${s_{5}}^{2}$ = 0.0, ${s_{6}}^{2}$ = 0.0, ${s_{7}}^{2}$ = 0.5, ${s_{8}}^{2}$ = 60.5, and ${s_{9}}^{2}$ = 50.0. The mean of the nine sample variances is 24.7.

(c) ${σ_{1}}^{2}$ = 0.25, ${σ_{2}}^{2}$ = 30.25, ${σ_{3}}^{2}$ = 25.0, ${σ_{4}}^{2}$ = 0.0, ${σ_{5}}^{2}$ = 0.0, ${σ_{6}}^{2}$ = 0.0, ${σ_{7}}^{2}$ = 0.25, ${σ_{8}}^{2}$ = 30.25, and ${σ_{9}}^{2}$ = 25.0. The mean of the nine variances computed with division by n is 12.3.

(d) The method in part (b) gives the better estimates. The mean of the nine sample variances (24.7) equals the population variance, so $s^{2}$ is an unbiased estimator of $σ^{2}$ . The mean in part (c) is only 12.3, so division by n systematically underestimates $σ^{2}$ . When computing the variance of a sample, divide by n - 1.

(e) No, s is not an unbiased estimator of $σ$ as the mean of the sample standard deviations is not equal to the population standard deviation.

Step by step solution

Given information

A population of three values (number of cigarettes) is given.

Samples of size two are selected from this population with replacement, which gives nine possible samples.

Population variance and sample variance

Population variance $(σ^{2})$ is calculated by dividing the sum of the squared differences of the population observations (from the mean) by the count of observations.

Mathematically,

$σ^{2} = \frac{\sum_{i = 1}^{n} {(x_{i} - μ)}^{2}}{n}$

Here, n is the total number of observations.

Sample variance $(s^{2})$ is calculated by dividing the sum of the squared differences of the sample observations from the mean by n–1.

Mathematically,

$s^{2} = \frac{\sum_{i = 1}^{n} {(x_{i} - \bar{x})}^{2}}{n - 1}$

Compute the population variance

(a)

To compute the value of the population variance, find the population mean $(μ)$ as shown below.

$μ = \frac{9 + 10 + 20}{3} = 13.0$

The population variance is computed as follows:

$σ^{2} = \frac{\sum_{i = 1}^{n} {(x_{i} - μ)}^{2}}{n} = \frac{{(9 - 13.0)}^{2} + {(10 - 13.0)}^{2} + {(20 - 13.0)}^{2}}{3} = 24.7$

The population variance is 24.7.
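The calculation above can be checked with a short script. This is a minimal sketch using Python's standard `statistics` module; the variable names `population`, `mu`, and `sigma_sq` are illustrative choices, not part of the exercise.

```python
from statistics import mean, pvariance

population = [9, 10, 20]  # cigarettes per day, from the exercise

mu = mean(population)             # population mean
sigma_sq = pvariance(population)  # pvariance divides by n (population formula)

print(mu)
print(round(sigma_sq, 1))  # 24.7
```

The unrounded value is 74/3, which is why later comparisons are best made after rounding to one decimal place.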

Describe the feasible samples of size two from the collection

(b)

The nine different samples selected with replacement are shown below:

Sample 1	Sample 2	Sample 3
9	9	10
10	20	20

Sample 4	Sample 5	Sample 6
9	10	20
9	10	20

Sample 7	Sample 8	Sample 9
10	20	20
9	9	10

The mean of each sample is computed using the formula $\bar{x} = \frac{\sum x}{n}$ .

The mean for each sample is stated in the brackets in the following table.

Sample 1	Sample 2	Sample 3
9	9	10
10	20	20
$({\bar{x}}_{1} = 9.5)$	$({\bar{x}}_{2} = 14.5)$	$({\bar{x}}_{3} = 15)$

Sample 4	Sample 5	Sample 6
9	10	20
9	10	20
$({\bar{x}}_{4} = 9)$	$({\bar{x}}_{5} = 10)$	$({\bar{x}}_{6} = 20)$

Sample 7	Sample 8	Sample 9
10	20	20
9	9	10
$({\bar{x}}_{7} = 9.5)$	$({\bar{x}}_{8} = 14.5)$	$({\bar{x}}_{9} = 15)$

The sample variances are computed as shown below.

${s_{1}}^{2} = \frac{{(9 - 9.5)}^{2} + {(10 - 9.5)}^{2}}{2 - 1} = 0.5$

${s_{2}}^{2} = \frac{{(9 - 14.5)}^{2} + {(20 - 14.5)}^{2}}{2 - 1} = 60.5$

${s_{3}}^{2} = \frac{{(10 - 15)}^{2} + {(20 - 15)}^{2}}{2 - 1} = 50.0$

${s_{4}}^{2} = \frac{{(9 - 9)}^{2} + {(9 - 9)}^{2}}{2 - 1} = 0.0$

${s_{5}}^{2} = \frac{{(10 - 10)}^{2} + {(10 - 10)}^{2}}{2 - 1} = 0.0$

${s_{6}}^{2} = \frac{{(20 - 20)}^{2} + {(20 - 20)}^{2}}{2 - 1} = 0.0$

${s_{7}}^{2} = \frac{{(10 - 9.5)}^{2} + {(9 - 9.5)}^{2}}{2 - 1} = 0.5$

${s_{8}}^{2} = \frac{{(20 - 14.5)}^{2} + {(9 - 14.5)}^{2}}{2 - 1} = 60.5$

${s_{9}}^{2} = \frac{{(20 - 15)}^{2} + {(10 - 15)}^{2}}{2 - 1} = 50.0$

The mean of the nine sample variances is

$\overline{s^{2}} = \frac{\sum_{i = 1}^{9} {s_{i}}^{2}}{9} = \frac{0.5 + 60.5 + 50.0 + 0.0 + 0.0 + 0.0 + 0.5 + 60.5 + 50.0}{9} = \frac{222}{9} = 24.7$

Thus, the mean of the sample variances is 24.7.
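Part (b) can be reproduced programmatically. The sketch below assumes Python's standard library: `itertools.product` enumerates the ordered samples, and `statistics.variance` applies the n - 1 formula. It lists the samples in a different order from the tables above, which does not affect the mean.

```python
from itertools import product
from statistics import mean, variance

population = [9, 10, 20]

# All ordered samples of size two drawn with replacement: 3 * 3 = 9 samples.
samples = list(product(population, repeat=2))

# statistics.variance divides by n - 1 (the sample-variance formula).
sample_variances = [variance(s) for s in samples]

mean_s_sq = mean(sample_variances)
print(len(samples))         # 9
print(round(mean_s_sq, 1))  # 24.7
```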

Describe the variances for each sample using the population variance

(c)

Treating each of the nine samples above as a population, compute its variance with the population formula (division by n = 2), as shown below.

${σ_{1}}^{2} = \frac{{(9 - 9.5)}^{2} + {(10 - 9.5)}^{2}}{2} = 0.25$

${σ_{2}}^{2} = \frac{{(9 - 14.5)}^{2} + {(20 - 14.5)}^{2}}{2} = 30.25$

${σ_{3}}^{2} = \frac{{(10 - 15)}^{2} + {(20 - 15)}^{2}}{2} = 25.0$

${σ_{4}}^{2} = \frac{{(9 - 9)}^{2} + {(9 - 9)}^{2}}{2} = 0.0$

${σ_{5}}^{2} = \frac{{(10 - 10)}^{2} + {(10 - 10)}^{2}}{2} = 0.0$

${σ_{6}}^{2} = \frac{{(20 - 20)}^{2} + {(20 - 20)}^{2}}{2} = 0.0$

${σ_{7}}^{2} = \frac{{(10 - 9.5)}^{2} + {(9 - 9.5)}^{2}}{2} = 0.25$

${σ_{8}}^{2} = \frac{{(20 - 14.5)}^{2} + {(9 - 14.5)}^{2}}{2} = 30.25$

${σ_{9}}^{2} = \frac{{(20 - 15)}^{2} + {(10 - 15)}^{2}}{2} = 25.0$

The mean of the nine population variances is

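$\overline{σ^{2}} = \frac{\sum_{i = 1}^{9} {σ_{i}}^{2}}{9} = \frac{0.25 + 30.25 + 25.0 + 0.0 + 0.0 + 0.0 + 0.25 + 30.25 + 25.0}{9} = \frac{111}{9} = 12.3$

Thus, the mean of the nine variances computed with division by n is 12.3.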
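The same enumeration works for part (c) by swapping in the divide-by-n formula. This is a sketch under the same assumptions as before (standard-library Python, illustrative variable names); `statistics.pvariance` treats each two-value sample as if it were a complete population.

```python
from itertools import product
from statistics import mean, pvariance

population = [9, 10, 20]
samples = list(product(population, repeat=2))

# statistics.pvariance divides by n, treating each sample as a full population.
pop_style_variances = [pvariance(s) for s in samples]

mean_div_n = mean(pop_style_variances)
print(round(mean_div_n, 1))             # 12.3
print(round(pvariance(population), 1))  # 24.7
```

Printing both numbers side by side makes the shortfall visible: the divide-by-n average is about half the true population variance.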

Compare the results of parts (b) and (c)

(d)

Part (b) gives the better estimates. With division by n - 1, the mean of the nine sample variances is 24.7, which equals the population variance, so $s^{2}$ is an unbiased estimator of $σ^{2}$ .

Averaged over all possible samples, the sample variances center on the population variance only when division by n - 1 is used. With division by n, the mean is 12.3, so the result systematically underestimates the population variance. Therefore, when computing the variance of a sample, divide by n - 1 rather than by n.
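The size of the gap between parts (b) and (c) matches a standard result for independent draws, stated here without proof: when samples of size n are drawn with replacement, the variances computed with division by n average to $\frac{n - 1}{n} σ^{2}$ . As a check with n = 2 and the unrounded population variance $\frac{74}{3}$ :

$\frac{n - 1}{n} σ^{2} = \frac{2 - 1}{2} \cdot \frac{74}{3} = \frac{37}{3} \approx 12.3$

This agrees with the mean found in part (c). Multiplying by $\frac{n}{n - 1}$ , which is the same as dividing by n - 1 instead of n, removes the shortfall.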

Explain if the sample standard deviation is an unbiased estimator of the population standard deviation

(e)

An estimator is unbiased if the mean of its values over all possible samples equals the population parameter it estimates.

The standard deviations for the nine samples are calculated below:

$s_{1} = \sqrt{{s_{1}}^{2}} = 0.7$

$s_{2} = \sqrt{{s_{2}}^{2}} = 7.8$

$s_{3} = \sqrt{{s_{3}}^{2}} = 7.1$

$s_{4} = \sqrt{{s_{4}}^{2}} = 0.0$

$s_{5} = \sqrt{{s_{5}}^{2}} = 0.0$

$s_{6} = \sqrt{{s_{6}}^{2}} = 0.0$

$s_{7} = \sqrt{{s_{7}}^{2}} = 0.7$

$s_{8} = \sqrt{{s_{8}}^{2}} = 7.8$

$s_{9} = \sqrt{{s_{9}}^{2}} = 7.1$

The mean of these nine sample standard deviations is

$\bar{s} = \frac{\sum_{i = 1}^{9} s_{i}}{9} = 3.5$

Therefore, the mean of the sample standard deviations is 3.5.

The population standard deviation is

$σ = \sqrt{σ^{2}} = \sqrt{24.7} = 5.0$

Thus, the value of the population standard deviation is 5.0.

Here, the mean of the sample standard deviations (3.5) is not equal to the population standard deviation (5.0); on average, s underestimates $σ$ .

Therefore, the sample standard deviation $(s)$ is not an unbiased estimator of the population standard deviation $(σ)$ .
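The comparison in part (e) can also be verified numerically. The sketch below uses `statistics.stdev` (square root of the n - 1 variance) and `statistics.pstdev` (square root of the divide-by-n variance) from Python's standard library; the variable names are illustrative.

```python
from itertools import product
from statistics import mean, pstdev, stdev

population = [9, 10, 20]
samples = list(product(population, repeat=2))

# statistics.stdev is the square root of the n - 1 sample variance.
sample_sds = [stdev(s) for s in samples]

mean_s = mean(sample_sds)   # average of the nine sample standard deviations
sigma = pstdev(population)  # population standard deviation

print(round(mean_s, 1))  # 3.5
print(round(sigma, 1))   # 5.0
```

Even though each s comes from an unbiased $s^{2}$ , taking the square root before averaging pulls the mean below $σ$ , which is the bias the exercise asks about.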
