Variation and Prediction Intervals. In Exercises 17–20, find the (a) explained variation, (b) unexplained variation, and (c) indicated prediction interval. In each case, there is sufficient evidence to support a claim of a linear correlation, so it is reasonable to use the regression equation when making predictions.

Town Courts Listed below are amounts of court income and salaries paid to the town justices (based on data from the Poughkeepsie Journal). All amounts are in thousands of dollars, and all of the towns are in Dutchess County, New York. For the prediction interval, use a 99% confidence level with a court income of $800,000.

Court Income | 65 | 404 | 1567 | 1131 | 272 | 252 | 111 | 154 | 32
Justice Salary | 30 | 44 | 92 | 56 | 46 | 61 | 25 | 26 | 18

Short Answer

(a) Explained Variation: 3210.364

(b) Unexplained Variation: 1087.191

(c) 99% Prediction Interval: (10.4, 104.6)

Step by step solution

01

Given information

Data are given for two variables, “Court Income” and “Justice Salary”.

02

Regression equation

Let x denote the variable “Court Income.”

Let y denote the variable “Justice Salary.”

The regression equation of y on x has the following notation:

\(\hat y = {b_0} + {b_1}x\), where

\({b_0}\) is the intercept term and \({b_1}\) is the slope coefficient.

The following calculations are done to compute the intercept and the slope coefficient:

The y-intercept is computed below:

\(\begin{array}{c}{b_0} = \frac{{\left( {\sum y } \right)\left( {\sum {{x^2}} } \right) - \left( {\sum x } \right)\left( {\sum {xy} } \right)}}{{n\left( {\sum {{x^2}} } \right) - {{\left( {\sum x } \right)}^2}}}\\ = \frac{{\left( {398} \right)\left( {4076640} \right) - \left( {3988} \right)\left( {262465} \right)}}{{9\left( {4076640} \right) - {{\left( {3988} \right)}^2}}}\\ = 27.701478\\ \approx 27.70\end{array}\)

The slope coefficient is computed below:

\(\begin{array}{c}{b_1} = \frac{{n\left( {\sum {xy} } \right) - \left( {\sum x } \right)\left( {\sum y } \right)}}{{n\left( {\sum {{x^2}} } \right) - {{\left( {\sum x } \right)}^2}}}\\ = \frac{{\left( 9 \right)\left( {262465} \right) - \left( {3988} \right)\left( {398} \right)}}{{9\left( {4076640} \right) - {{\left( {3988} \right)}^2}}}\\ = 0.0372835\\ \approx 0.04\end{array}\)

Thus, the regression equation becomes as shown:

\(\begin{array}{l}\hat y = 27.701478 + 0.0372835x\\\hat y \approx 27.70 + 0.04x\end{array}\)
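The summation formulas for the intercept and slope can be checked with a short script. This is a minimal sketch in plain Python; the names `court_income`, `justice_salary`, and `fit_line` are illustrative and not part of the original solution.

```python
# Paired data from the exercise (all amounts in thousands of dollars).
court_income = [65, 404, 1567, 1131, 272, 252, 111, 154, 32]
justice_salary = [30, 44, 92, 56, 46, 61, 25, 26, 18]


def fit_line(x, y):
    """Return (b0, b1) for y-hat = b0 + b1*x using the summation formulas."""
    n = len(x)
    sum_x = sum(x)
    sum_y = sum(y)
    sum_x2 = sum(v * v for v in x)
    sum_xy = sum(a * b for a, b in zip(x, y))
    denom = n * sum_x2 - sum_x ** 2  # shared denominator of both formulas
    b0 = (sum_y * sum_x2 - sum_x * sum_xy) / denom
    b1 = (n * sum_xy - sum_x * sum_y) / denom
    return b0, b1


b0, b1 = fit_line(court_income, justice_salary)
print(f"intercept b0 = {b0:.4f}")
print(f"slope     b1 = {b1:.7f}")
```

The slope is positive, which matches the sign used when the equation is evaluated at \({x_0} = 800\).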

03

Predicted values

The mean value of observed y is computed below:

\(\begin{array}{c}\bar y = \frac{{\sum y }}{n}\\ = \frac{{398}}{9}\\ = 44.222\end{array}\)

Substituting each value of x into the regression equation gives the predicted values \(\hat y\). The deviations \(\hat y - \bar y\) and \(y - \hat y\) are then squared and summed to obtain the two variations below.

The value of the explained variation is shown below:

\(\sum {{{\left( {\hat y - \bar y} \right)}^2}} = 3210.364\)

Thus, the explained variation is 3210.364.

The value of the unexplained variation is shown below:

\(\sum {{{\left( {y - \hat y} \right)}^2}} = 1087.191\)

Thus, the unexplained variation is 1087.191.
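The predicted values and both sums of squares can be reproduced as follows. This is a sketch that takes the unrounded coefficients 27.701478 and 0.0372835 from the step above as given; the variable names are illustrative. It also checks that explained plus unexplained variation equals the total variation \(\sum {{{\left( {y - \bar y} \right)}^2}}\), up to the rounding of the coefficients.

```python
# Data from the exercise (thousands of dollars).
x = [65, 404, 1567, 1131, 272, 252, 111, 154, 32]
y = [30, 44, 92, 56, 46, 61, 25, 26, 18]

# Unrounded coefficients taken from the regression step of the solution.
b0, b1 = 27.701478, 0.0372835

y_bar = sum(y) / len(y)
y_hat = [b0 + b1 * xi for xi in x]

# Explained variation: squared deviations of predictions from the mean.
explained = sum((yh - y_bar) ** 2 for yh in y_hat)
# Unexplained variation: squared residuals.
unexplained = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))
# Total variation: squared deviations of observations from the mean.
total = sum((yi - y_bar) ** 2 for yi in y)

for xi, yi, yh in zip(x, y, y_hat):
    print(f"x = {xi:5d}  y = {yi:3d}  y_hat = {yh:8.4f}")
print(f"explained   = {explained:.3f}")
print(f"unexplained = {unexplained:.3f}")
print(f"total       = {total:.3f}")
```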

04

Predicted value at \(\left( {{x_0}} \right)\)

Substitute \({x_0} = 800\) in the regression equation to obtain the predicted value. The unrounded coefficients are used to avoid rounding error.

\(\begin{array}{c}\hat y = 27.701478 + 0.0372835x\\ = 27.701478 + 0.0372835\left( {800} \right)\\ = 57.5283\\ \approx 58\end{array}\)

05

Formula of prediction interval

The prediction interval is obtained using the formula shown below:

\(\begin{array}{c}PI = \hat y \pm E\\ = \hat y \pm {t_{\frac{\alpha }{2}}}{s_e}\sqrt {1 + \frac{1}{n} + \frac{{n{{\left( {{x_0} - \bar x} \right)}^2}}}{{n\left( {\sum {{x^2}} } \right) - {{\left( {\sum x } \right)}^2}}}} \end{array}\)

06

Degrees of freedom and critical value

The level of significance \(\alpha \) is obtained from the confidence level as follows:

\(\begin{array}{c}Confidence\;Level = 99\% \\100\left( {1 - \alpha } \right) = 99\\1 - \alpha = 0.99\\\alpha = 0.01\end{array}\)

The degrees of freedom for computing the t-multiplier are shown below:

\(\begin{array}{c}df = n - 2\\ = 9 - 2\\ = 7\end{array}\)

The two-tailed value of the t-multiplier for 0.01 level of significance and 7 degrees of freedom is 3.4995.
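The critical value can be sanity-checked without a t-table. The sketch below integrates the Student t density with 7 degrees of freedom between -3.4995 and 3.4995 using Simpson's rule and confirms that the central area is about 0.99. The helper names `t_pdf` and `central_area` are illustrative, and the numerical integration is a stand-in for a table lookup, not the method used in the solution.

```python
import math


def t_pdf(t, df):
    """Density of the Student t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)


def central_area(t_crit, df, steps=2000):
    """Area between -t_crit and t_crit via Simpson's rule (steps must be even)."""
    h = t_crit / steps
    total = t_pdf(0.0, df) + t_pdf(t_crit, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_pdf(i * h, df)
    half_area = total * h / 3  # integral from 0 to t_crit
    return 2 * half_area       # the density is symmetric about 0


area = central_area(3.4995, 7)
print(f"central area = {area:.5f}")
```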

07

Standard error of the estimate

The standard error of the estimate is computed below:

\(\begin{array}{c}{s_e} = \sqrt {\frac{{\sum {{{\left( {y - \hat y} \right)}^2}} }}{{n - 2}}} \\ = \sqrt {\frac{{1087.191}}{{9 - 2}}} \\ = 12.46247\end{array}\)

08

Value of \(\bar x\)

The value of \(\bar x\) is computed as follows:

\(\begin{array}{c}\bar x = \frac{{\sum x }}{n}\\ = \frac{{3988}}{9}\\ = 443.111\end{array}\)

09

Prediction interval

Substitute the values obtained above to calculate the margin of error (E).

\(\begin{array}{c}E = {t_{\frac{\alpha }{2}}}{s_e}\sqrt {1 + \frac{1}{n} + \frac{{n{{\left( {{x_0} - \bar x} \right)}^2}}}{{n\left( {\sum {{x^2}} } \right) - {{\left( {\sum x } \right)}^2}}}} \\ = \left( {3.4995} \right)\left( {12.46247} \right)\sqrt {1 + \frac{1}{9} + \frac{{9{{\left( {800 - 443.111} \right)}^2}}}{{9\left( {4076640} \right) - {{\left( {3988} \right)}^2}}}} \\ = 47.0986\end{array}\)

Thus, the prediction interval becomes as shown:

\(\begin{array}{c}PI = \left( {\hat y - E,\hat y + E} \right)\\ = \left( {57.5283 - 47.0986,57.5283 + 47.0986} \right)\\ = \left( {10.4297,104.6269} \right)\\ \approx \left( {10.4,104.6} \right)\end{array}\)

Therefore, the 99% prediction interval for the justice salary for the court income of $800,000 is (10.4, 104.6), in thousands of dollars.
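The whole interval calculation can be reproduced in a few lines. This is a sketch that takes the coefficients, the unexplained variation (1087.191), and the critical value (3.4995) from the steps above as given; the variable names are illustrative.

```python
import math

# Predictor values from the exercise (thousands of dollars).
x = [65, 404, 1567, 1131, 272, 252, 111, 154, 32]
n = len(x)
x0 = 800  # court income of $800,000, expressed in thousands

# Values taken from the earlier steps of the solution.
b0, b1 = 27.701478, 0.0372835
unexplained = 1087.191
t_crit = 3.4995  # two-tailed, alpha = 0.01, df = n - 2 = 7

y_hat = b0 + b1 * x0
s_e = math.sqrt(unexplained / (n - 2))  # standard error of the estimate
x_bar = sum(x) / n
denom = n * sum(v * v for v in x) - sum(x) ** 2

# Margin of error for predicting a single y at x0.
margin = t_crit * s_e * math.sqrt(1 + 1 / n + n * (x0 - x_bar) ** 2 / denom)

lower, upper = y_hat - margin, y_hat + margin
print(f"y_hat = {y_hat:.4f}, s_e = {s_e:.5f}, E = {margin:.4f}")
print(f"99% prediction interval: ({lower:.1f}, {upper:.1f})")
```

The interval is wide because the sample has only nine towns and \({x_0} = 800\) lies well above the mean court income of about 443.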

Most popular questions from this chapter

Adjusted Coefficient of Determination For Exercise 2, why is it better to use values of adjusted \({R^2}\) instead of simply using values of \({R^2}\)?

Cigarette Tar and Nicotine The table below lists measured amounts (mg) of tar, carbon monoxide (CO), and nicotine in king size cigarettes of different brands (from Data Set 13 “Cigarette Contents” in Appendix B).

a. Is there sufficient evidence to support a claim of a linear correlation between tar and nicotine?

b. What percentage of the variation in nicotine can be explained by the linear correlation between nicotine and tar?

c. Letting y represent the amount of nicotine and letting x represent the amount of tar, identify the regression equation.

d. The Raleigh brand king size cigarette is not included in the table, and it has 23 mg of tar. What is the best predicted amount of nicotine? How does the predicted amount compare to the actual amount of 1.3 mg of nicotine?

Tar | 25 | 27 | 20 | 24 | 20 | 20 | 21 | 24
CO | 18 | 16 | 16 | 16 | 16 | 16 | 14 | 17
Nicotine | 1.5 | 1.7 | 1.1 | 1.6 | 1.1 | 1.0 | 1.2 | 1.4

Explore! Exercises 9 and 10 provide two data sets from “Graphs in Statistical Analysis,” by F. J. Anscombe, the American Statistician, Vol. 27. For each exercise,

a. Construct a scatterplot.

b. Find the value of the linear correlation coefficient r, then determine whether there is sufficient evidence to support the claim of a linear correlation between the two variables.

c. Identify the feature of the data that would be missed if part (b) was completed without constructing the scatterplot.

x | 10 | 8 | 13 | 9 | 11 | 14 | 6 | 4 | 12 | 7 | 5
y | 9.14 | 8.14 | 8.74 | 8.77 | 9.26 | 8.10 | 6.13 | 3.10 | 9.13 | 7.26 | 4.74

Time and Motion In a physics experiment at Doane College, a soccer ball was thrown upward from the bed of a moving truck. The table below lists the time (sec) that has lapsed from the throw and the height (m) of the soccer ball. What do you conclude about the relationship between time and height? What horrible mistake would be easy to make if the analysis is conducted without a scatterplot?

Time (sec) | 0.0 | 0.2 | 0.4 | 0.6 | 0.8 | 1.0 | 1.2 | 1.4 | 1.6 | 1.8
Height (m) | 0.0 | 1.7 | 3.1 | 3.9 | 4.5 | 4.7 | 4.6 | 4.1 | 3.3 | 2.1

Exercises 13–28 use the same data sets as Exercises 13–28 in Section 10-1. In each case, find the regression equation, letting the first variable be the predictor (x) variable. Find the indicated predicted value by following the prediction procedure summarized in Figure 10-5 on page 493.

Using the listed lemon/crash data, find the best predicted crash fatality rate for a year in which there are 500 metric tons of lemon imports. Is the prediction worthwhile?
