Regression and Predictions. Exercises 13–28 use the same data sets as Exercises 13–28 in Section 10-1. In each case, find the regression equation, letting the first variable be the predictor (x) variable. Find the indicated predicted value by following the prediction procedure summarized in Figure 10-5 on page 493.

Using the president/opponent heights, find the best predicted height of an opponent of a president who is 190 cm tall. Does it appear that heights of opponents can be predicted from the heights of the presidents?

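| President (cm) | Opponent (cm) |
|---|---|
| 178 | 180 |
| 182 | 180 |
| 188 | 182 |
| 175 | 173 |
| 179 | 178 |
| 183 | 182 |
| 192 | 180 |
| 182 | 180 |
| 177 | 183 |
| 185 | 177 |
| 188 | 173 |
| 188 | 188 |
| 183 | 185 |
| 188 | 175 |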

Short Answer


The regression equation is \(\hat y = 161.838 + 0.097x\).

The best predicted height of an opponent of a president who is 190 cm tall is 180 cm, which is the mean opponent height.

No, the heights of opponents cannot be predicted from the heights of presidents, because the linear correlation between them is not significant.

Step by step solution

01

Given information

Paired values are given for two variables: the president’s height and the opponent’s height, both in centimeters, for n = 14 pairs.

02

Calculate the mean values

Let x represent the president’s height.

Let y represent the opponent’s height.

The mean value of x is given as,

\(\begin{array}{c}\bar x = \frac{{\sum\limits_{i = 1}^n {{x_i}} }}{n}\\ = \frac{{178 + 182 + \cdots + 188}}{{14}}\\ = 183.429\end{array}\)

Therefore, the mean value of x is 183.429.

The mean value of y is given as,

\(\begin{array}{c}\bar y = \frac{{\sum\limits_{i = 1}^n {{y_i}} }}{n}\\ = \frac{{180 + 180 + \cdots + 175}}{{14}}\\ = 179.714\end{array}\)

Therefore, the mean value of y is 179.714.
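The two means can be checked with a few lines of Python. This is only a sketch: the lists hold the 14 pairs implied by the sums in this solution (n = 14, with the last terms 188 and 175), and the variable names are illustrative.

```python
# Heights in cm. The lists contain the 14 pairs used in the solution's sums;
# the final pair (188, 175) is taken from the last terms shown in the
# mean calculations above.
president = [178, 182, 188, 175, 179, 183, 192, 182, 177, 185, 188, 188, 183, 188]
opponent = [180, 180, 182, 173, 178, 182, 180, 180, 183, 177, 173, 188, 185, 175]

n = len(president)
x_bar = sum(president) / n  # mean president height
y_bar = sum(opponent) / n   # mean opponent height

print(n, round(x_bar, 3), round(y_bar, 3))
```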

03

Calculate the standard deviation of x and y

The standard deviation of x is given as,

\(\begin{array}{c}{s_x} = \sqrt {\frac{{\sum\limits_{i = 1}^n {{{({x_i} - \bar x)}^2}} }}{{n - 1}}} \\ = \sqrt {\frac{{{{\left( {178 - 183.429} \right)}^2} + {{\left( {182 - 183.429} \right)}^2} + \cdots + {{\left( {188 - 183.429} \right)}^2}}}{{14 - 1}}} \\ = 5.003\end{array}\)

Therefore, the standard deviation of x is 5.003.

The standard deviation of y is given as,

\(\begin{array}{c}{s_y} = \sqrt {\frac{{\sum\limits_{i = 1}^n {{{({y_i} - \bar y)}^2}} }}{{n - 1}}} \\ = \sqrt {\frac{{{{\left( {180 - 179.714} \right)}^2} + {{\left( {180 - 179.714} \right)}^2} + \cdots + {{\left( {175 - 179.714} \right)}^2}}}{{14 - 1}}} \\ = 4.304\end{array}\)

Therefore, the standard deviation of y is 4.304.
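As a quick check on the two standard deviations, the sketch below uses `statistics.stdev` from the Python standard library, which divides by n − 1 as in the formulas above. The data lists are the 14 pairs implied by the solution's sums; the names are illustrative.

```python
import statistics

# Heights in cm: the 14 pairs used in the solution's sums.
president = [178, 182, 188, 175, 179, 183, 192, 182, 177, 185, 188, 188, 183, 188]
opponent = [180, 180, 182, 173, 178, 182, 180, 180, 183, 177, 173, 188, 185, 175]

# statistics.stdev is the sample standard deviation (denominator n - 1).
s_x = statistics.stdev(president)
s_y = statistics.stdev(opponent)

print(round(s_x, 3), round(s_y, 3))
```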

04

Calculate the correlation coefficient

The correlation coefficient is given as,

\(r = \frac{{n\left( {\sum {xy} } \right) - \left( {\sum x } \right)\left( {\sum y } \right)}}{{\sqrt {\left( {\left( {n\sum {{x^2}} } \right) - {{\left( {\sum x } \right)}^2}} \right)\left( {\left( {n\sum {{y^2}} } \right) - {{\left( {\sum y } \right)}^2}} \right)} }}\)

The sums required to compute the correlation coefficient, with n = 14, are as follows:

\(\sum x = 2568,\quad \sum y = 2516,\quad \sum {xy} = 461538,\quad \sum {{x^2}} = 471370,\quad \sum {{y^2}} = 452402\)

Substituting these values gives,

\(\begin{array}{c}r = \frac{{14\left( {461538} \right) - \left( {2568} \right)\left( {2516} \right)}}{{\sqrt {\left( {\left( {14 \times 471370} \right) - {{\left( {2568} \right)}^2}} \right)\left( {\left( {14 \times 452402} \right) - {{\left( {2516} \right)}^2}} \right)} }}\\ = \frac{{444}}{{\sqrt {\left( {4556} \right)\left( {3372} \right)} }}\\ = 0.1133\end{array}\)

Therefore, the correlation coefficient is 0.1133.
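The sums and the value of r can be reproduced with the short sketch below, which follows the computational formula for r term by term. The data lists are the 14 pairs implied by the solution's sums, and the variable names are illustrative.

```python
import math

# Heights in cm: the 14 pairs used in the solution's sums.
x = [178, 182, 188, 175, 179, 183, 192, 182, 177, 185, 188, 188, 183, 188]
y = [180, 180, 182, 173, 178, 182, 180, 180, 183, 177, 173, 188, 185, 175]

n = len(x)
sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(a * b for a, b in zip(x, y))
sum_x2 = sum(a * a for a in x)
sum_y2 = sum(b * b for b in y)

# Numerator and denominator of the computational formula for r.
numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator

print(sum_xy, sum_x2, sum_y2, round(r, 4))
```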

05

Calculate the slope of the regression line

The slope of the regression line is given as,

\(\begin{array}{c}{b_1} = r\frac{{{s_y}}}{{{s_x}}}\\ = 0.1133 \times \frac{{4.304}}{{5.003}}\\ = 0.097\end{array}\)

Therefore, the value of slope is 0.097.

06

Calculate the intercept of the regression line

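The intercept is computed as,

\(\begin{array}{c}{b_0} = \bar y - {b_1}\bar x\\ = 179.714 - \left( {0.097454 \times 183.429} \right)\\ = 161.838\end{array}\)

Here the slope is carried with extra digits (\({b_1} \approx 0.097454\)) to avoid rounding error in the intercept.

Therefore, the value of intercept is 161.838.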

07

Form a regression equation

The regression equation is given as,

\(\begin{array}{c}\hat y = {b_0} + {b_1}x\\ = 161.838 + 0.097x\end{array}\)

Thus, the regression equation is \(\hat y = 161.838 + 0.097x\).
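The slope and intercept can be checked with a minimal least-squares sketch. It uses the equivalent formula b1 = Sxy / Sxx rather than r·s_y/s_x, which gives the same slope without intermediate rounding. The data lists are the 14 pairs implied by the solution's sums, and the names are illustrative.

```python
# Heights in cm: the 14 pairs used in the solution's sums.
x = [178, 182, 188, 175, 179, 183, 192, 182, 177, 185, 188, 188, 183, 188]
y = [180, 180, 182, 173, 178, 182, 180, 180, 183, 177, 173, 188, 185, 175]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Sums of cross-deviations and squared deviations about the means.
s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
s_xx = sum((a - x_bar) ** 2 for a in x)

b1 = s_xy / s_xx           # slope
b0 = y_bar - b1 * x_bar    # intercept, using the unrounded slope

print(round(b0, 3), round(b1, 3))
```

Rounding the slope to 0.097 before computing the intercept moves the intercept to about 161.9, so the slope is best kept unrounded until the end.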

08

Analyze the regression model

Referring to Exercise 26 of Section 10-1:

1) The scatterplot does not show a linear relationship between the variables.

2) The P-value is 0.700.

Because the P-value is greater than the significance level (0.05), we fail to reject the null hypothesis of no linear correlation.

Therefore, the correlation is not significant.

Referring to Figure 10-5, the criteria for a good regression model are not satisfied.

Therefore, the regression equation should not be used to predict the value of y.

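When the regression equation is not a good model, the best predicted value of y is the sample mean \(\bar y\), whatever the value of x. The best predicted height of an opponent of a president who is 190 cm tall is therefore computed as,

\(\begin{array}{c}\hat y = \bar y\\ = 179.714\\ \approx 180\end{array}\)

Therefore, the best predicted height of an opponent of a president who is 190 cm tall is 180 cm.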
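The decision made in this step can be sketched in Python. The code tests the significance of r with the usual t statistic (n − 2 degrees of freedom), then returns the regression prediction only if the correlation is significant and the mean of y otherwise. The helper names (`pearson_r`, `two_sided_p`, `best_prediction`) are hypothetical. The P-value is approximated by Simpson's rule on the t density so that only the standard library is needed. The scatterplot check is not automated here.

```python
import math

# Heights in cm: the 14 pairs used in the solution's sums.
president = [178, 182, 188, 175, 179, 183, 192, 182, 177, 185, 188, 188, 183, 188]
opponent = [180, 180, 182, 173, 178, 182, 180, 180, 183, 177, 173, 188, 185, 175]

def pearson_r(x, y):
    """Linear correlation coefficient from deviations about the means."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_pdf(u, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + u * u / df) ** (-(df + 1) / 2)

def two_sided_p(r, n, steps=2000):
    """Approximate two-sided P-value for H0: no linear correlation (|r| < 1)."""
    df = n - 2
    t = abs(r) * math.sqrt(df / (1 - r * r))
    h = t / steps  # Simpson's rule on [0, t]; steps must be even
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 1 - 2 * (total * h / 3)

def best_prediction(x, y, x_new, alpha=0.05):
    """Use the regression line if r is significant, otherwise the mean of y."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    if two_sided_p(pearson_r(x, y), n) < alpha:
        b1 = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
        return (my - b1 * mx) + b1 * x_new
    return my

p_value = two_sided_p(pearson_r(president, opponent), len(president))
prediction = best_prediction(president, opponent, 190)
print(round(p_value, 3), round(prediction))
```

For these data the P-value is well above 0.05, so the function falls back to the mean opponent height, as the solution does.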

09

Discuss the prediction methods

No. The heights of opponents cannot be predicted from the heights of the presidents: the linear correlation is not significant, so the regression equation is not a good model.


Most popular questions from this chapter

In Exercises 9–12, refer to the accompanying table, which was obtained using the data from 21 cars listed in Data Set 20 “Car Measurements” in Appendix B. The response (y) variable is CITY (fuel consumption in mi/gal). The predictor (x) variables are WT (weight in pounds), DISP (engine displacement in liters), and HWY (highway fuel consumption in mi/gal).

If exactly two predictor (x) variables are to be used to predict the city fuel consumption, which two variables should be chosen? Why?

Interpreting a Computer Display. In Exercises 9–12, refer to the display obtained by using the paired data consisting of Florida registered boats (tens of thousands) and numbers of manatee deaths from encounters with boats in Florida for different recent years (from Data Set 10 in Appendix B). Along with the paired boat/manatee sample data, StatCrunch was also given the value of 85 (tens of thousands) boats to be used for predicting manatee fatalities.

Finding a Prediction Interval. For a year with 850,000 (x = 85) registered boats in Florida, identify the 95% prediction interval estimate of the number of manatee fatalities resulting from encounters with boats. Write a statement interpreting that interval.

Testing for a Linear Correlation. In Exercises 13–28, construct a scatterplot, and find the value of the linear correlation coefficient r. Also find the P-value or the critical values of r from Table A-6. Use a significance level of \(\alpha = 0.05\). Determine whether there is sufficient evidence to support a claim of a linear correlation between the two variables. (Save your work because the same data sets will be used in Section 10-2 exercises.)

Manatees. Listed below are numbers of registered pleasure boats in Florida (tens of thousands) and the numbers of manatee fatalities from encounters with boats in Florida for each of several recent years. The values are from Data Set 10 “Manatee Deaths” in Appendix B. Is there sufficient evidence to conclude that there is a linear correlation between numbers of registered pleasure boats and numbers of manatee boat fatalities?

| Pleasure Boats (tens of thousands) | Manatee Fatalities |
|---|---|
| 99 | 92 |
| 99 | 73 |
| 97 | 90 |
| 95 | 97 |
| 90 | 83 |
| 90 | 88 |
| 87 | 81 |
| 90 | 73 |
| 90 | 68 |

Ages of Moviegoers. Based on the data from Cumulative Review Exercise 7, assume that ages of moviegoers are normally distributed with a mean of 35 years and a standard deviation of 20 years.

a. What is the percentage of moviegoers who are younger than 30 years of age?

b. Find \({P_{25}}\), which is the 25th percentile.

c. Find the probability that a simple random sample of 25 moviegoers has a mean age that is less than 30 years.

d. Find the probability that for a simple random sample of 25 moviegoers, each of the moviegoers is younger than 30 years of age. For a particular movie and showtime, why might it not be unusual to have 25 moviegoers all under the age of 30?

Exercises 13–28 use the same data sets as Exercises 13–28 in Section 10-1. In each case, find the regression equation, letting the first variable be the predictor (x) variable. Find the indicated predicted value by following the prediction procedure summarized in Figure 10-5 on page 493.

Use the shoe print lengths and heights to find the best predicted height of a male who has a shoe print length of 31.3 cm. Would the result be helpful to police crime scene investigators in trying to describe the male?
