According to the least-squares property, the regression line minimizes the sum of the squares of the residuals. Refer to the data in Table 10-1 on page 469.

a. Find the sum of squares of the residuals.

b. Show that the regression equation \(\hat y = - 3 + 2.5x\) results in a larger sum of squares of residuals.

Short Answer

a. The sum of squares of the residuals from the best-fit regression line is 823.64.

b. The given equation \(\hat y = - 3 + 2.5x\) yields a sum of squares of residuals of 827.45, which is greater than the 823.64 obtained from the best-fit regression line.

Step by step solution

01

Given information

Values are given for two variables, namely, Chocolate and Nobel.

02

Calculate the mean values

Let x represent Chocolate.

Let y represent Nobel.

The mean value of x is given below:

\(\begin{array}{c}\bar x = \frac{{\sum\limits_{i = 1}^n {{x_i}} }}{n}\\ = \frac{{4.5 + 10.2 + \cdots + 5.3}}{{23}}\\ = 5.80435\end{array}\)

Therefore, the mean value of x is 5.80435.

The mean value of y is given below:

\(\begin{array}{c}\bar y = \frac{{\sum\limits_{i = 1}^n {{y_i}} }}{n}\\ = \frac{{5.5 + 24.3 + \cdots + 10.8}}{{23}}\\ = 11.10435\end{array}\)

Therefore, the mean value of y is 11.10435.

03

Calculate the standard deviation of x and y

The standard deviation of x is given below:

\(\begin{array}{c}{s_x} = \sqrt {\frac{{\sum\limits_{i = 1}^n {{{({x_i} - \bar x)}^2}} }}{{n - 1}}} \\ = \sqrt {\frac{{{{\left( {4.5 - 5.80435} \right)}^2} + {{\left( {10.2 - 5.80435} \right)}^2} + ... + {{\left( {5.3 - 5.80435} \right)}^2}}}{{23 - 1}}} \\ = 3.27920\end{array}\)

Therefore, the standard deviation of x is 3.27920.

The standard deviation of y is given below:

\(\begin{array}{c}{s_y} = \sqrt {\frac{{\sum\limits_{i = 1}^n {{{({y_i} - \bar y)}^2}} }}{{n - 1}}} \\ = \sqrt {\frac{{{{\left( {5.5 - 11.10435} \right)}^2} + {{\left( {24.3 - 11.10435} \right)}^2} + \cdots + {{\left( {10.8 - 11.10435} \right)}^2}}}{{23 - 1}}} \\ = 10.2116\end{array}\)

Therefore, the standard deviation of y is 10.2116.
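The means and standard deviations above can be cross-checked without the full table. The sketch below assumes only the totals quoted in the correlation step of this solution (\(\sum x = 133.5\), \(\sum y = 255.4\), \(\sum x^2 = 1011.45\), \(\sum y^2 = 5130.14\), \(n = 23\)) and uses the shortcut identity \(\sum (v - \bar v)^2 = \sum v^2 - (\sum v)^2 / n\). The helper name `sample_std` is introduced here for illustration only.

```python
import math

# Totals quoted in the correlation step of this solution (n = 23 pairs).
n = 23
sum_x, sum_y = 133.5, 255.4
sum_x2, sum_y2 = 1011.45, 5130.14


def sample_std(total, total_sq, count):
    """Sample standard deviation from a sum and a sum of squares."""
    # Shortcut form: sum((v - mean)^2) = sum(v^2) - (sum(v))^2 / count
    squared_deviations = total_sq - total ** 2 / count
    return math.sqrt(squared_deviations / (count - 1))


mean_x = sum_x / n
mean_y = sum_y / n
s_x = sample_std(sum_x, sum_x2, n)
s_y = sample_std(sum_y, sum_y2, n)

print(f"mean x = {mean_x:.5f}, mean y = {mean_y:.5f}")
print(f"s_x = {s_x:.5f}, s_y = {s_y:.4f}")
```

The printed values should agree with the hand calculations above to the number of decimals shown.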

04

Calculate the correlation coefficient

The correlation coefficient is given below:

\(r = \frac{{n\left( {\sum {xy} } \right) - \left( {\sum x } \right)\left( {\sum y } \right)}}{{\sqrt {\left( {\left( {n\sum {{x^2}} } \right) - {{\left( {\sum x } \right)}^2}} \right)\left( {\left( {n\sum {{y^2}} } \right) - {{\left( {\sum y } \right)}^2}} \right)} }}\)

The totals required by this formula are \(n = 23\), \(\sum x = 133.5\), \(\sum y = 255.4\), \(\sum xy = 2072.23\), \(\sum {x^2} = 1011.45\), and \(\sum {y^2} = 5130.14\).

Substituting these totals gives:

\(\begin{array}{c}r = \frac{{n\left( {\sum {xy} } \right) - \left( {\sum x } \right)\left( {\sum y } \right)}}{{\sqrt {\left( {\left( {n\sum {{x^2}} } \right) - {{\left( {\sum x } \right)}^2}} \right)\left( {\left( {n\sum {{y^2}} } \right) - {{\left( {\sum y } \right)}^2}} \right)} }}\\ = \frac{{23\left( {2072.23} \right) - \left( {133.5} \right)\left( {255.4} \right)}}{{\sqrt {\left( {\left( {23 \times 1011.45} \right) - {{\left( {133.5} \right)}^2}} \right)\left( {\left( {23 \times 5130.14} \right) - {{\left( {255.4} \right)}^2}} \right)} }}\\ = 0.80061\end{array}\)

Therefore, the correlation coefficient is 0.80061.
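As a quick check of the substitution, the same formula can be evaluated in a few lines. This is a minimal sketch that assumes only the totals shown in the calculation above; the variable names are chosen here for readability.

```python
import math

# Totals taken from the substitution step above.
n = 23
sum_x, sum_y = 133.5, 255.4
sum_xy = 2072.23
sum_x2, sum_y2 = 1011.45, 5130.14

# Numerator and denominator of the computational formula for r.
numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)
r = numerator / denominator

print(f"r = {r:.5f}")
```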

05

Calculate the slope of the regression line

The slope of the regression line is given below:

\(\begin{array}{c}{b_1} = r\frac{{{s_y}}}{{{s_x}}}\\ = 0.80061 \times \frac{{10.2116}}{{3.27920}}\\ = 2.49313\\ \approx 2.49\end{array}\)

Therefore, the value of the slope is 2.49 when rounded to two decimal places.

06

Calculate the intercept of the regression line

The intercept is computed below, using the unrounded slope so that rounding error does not carry over:

\(\begin{array}{c}{b_0} = \bar y - {b_1}\bar x\\ = 11.10435 - \left( {2.49313 \times 5.80435} \right)\\ = - 3.37\end{array}\)

Therefore, the value of the intercept is -3.37.

07

Form a regression equation

The regression equation is given below:

\(\begin{array}{c}\hat y = {b_0} + {b_1}x\\ = - 3.37 + 2.49x\end{array}\)

Thus, the best-fit regression equation is \(\hat y = - 3.37 + 2.49x\).
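The slope and intercept can also be obtained directly from the totals, which avoids passing through rounded values of \(r\), \(s_x\), and \(s_y\). The sketch below assumes the totals quoted in the correlation step and uses the equivalent form \(b_1 = \frac{n\sum xy - \sum x\sum y}{n\sum x^2 - (\sum x)^2}\). It also shows how much the intercept moves if the slope is rounded to two decimals before it is used.

```python
# Totals quoted in the correlation step of this solution.
n = 23
sum_x, sum_y = 133.5, 255.4
sum_xy, sum_x2 = 2072.23, 1011.45

mean_x = sum_x / n
mean_y = sum_y / n

# Least-squares slope and intercept, kept unrounded.
b1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
b0 = mean_y - b1 * mean_x

# Same intercept formula, but with the slope rounded first (for comparison).
b0_rounded_slope = mean_y - round(b1, 2) * mean_x

print(f"slope = {b1:.5f}")
print(f"intercept (unrounded slope) = {b0:.2f}")
print(f"intercept (slope rounded first) = {b0_rounded_slope:.2f}")
```

Rounding only at the end is the safer habit: the two intercepts differ in the second decimal place.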

08

Compute the residuals

a.

The residual is computed below:

\(\begin{array}{c}{\mathop{\rm Residual}\nolimits} = {\rm{observed}}\;{\rm{y}} - \;{\rm{predicted}}\;{\rm{y}}\\ = y - \hat y\end{array}\)

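Applying this to each of the 23 data pairs with the best-fit line, then squaring the residuals and adding them, gives a sum of squares of residuals of 823.64.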

09

Compute the sum of squares of residuals

b.

For the given equation \(\hat y = - 3 + 2.5x\), the squared residuals of the 23 data pairs are added below:

\(7.56 + 3.24 + 0.36 + ... + 0.30 = 827.45\)

Therefore, the sum of squares of residuals is 827.45.
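The individual terms in this sum can be spot-checked for the data pairs that appear explicitly in this solution. The sketch below assumes the pairing (4.5, 5.5), (10.2, 24.3), and (5.3, 10.8), read from the first, second, and last values in the mean calculations; the other 20 rows of Table 10-1 are not reproduced here, so the full sum is not recomputed. The function name `predict` is illustrative.

```python
# Only the three (x, y) pairs visible in this solution; the remaining
# rows of Table 10-1 are not available here.
visible_pairs = [(4.5, 5.5), (10.2, 24.3), (5.3, 10.8)]


def predict(x, intercept=-3.0, slope=2.5):
    """Predicted y for the comparison line y-hat = -3 + 2.5x."""
    return intercept + slope * x


# Residual = observed y - predicted y, then squared.
residuals = [y - predict(x) for x, y in visible_pairs]
squared = [res ** 2 for res in residuals]

for (x, y), res, sq in zip(visible_pairs, residuals, squared):
    print(f"x = {x}, y = {y}, residual = {res:.2f}, squared = {sq:.4f}")
```

Rounded to two decimals, the three squared residuals match the first, second, and last terms of the sum above (7.56, 3.24, and 0.30).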

The sum of squares of residuals from the given equation, 827.45, is larger than the 823.64 obtained from the best-fit regression line, which is consistent with the least-squares property.
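Both sums of squares can be verified from the totals alone, without the row-by-row table. Expanding \(\sum (y - b_0 - b_1 x)^2\) gives \(\sum y^2 - 2b_0\sum y - 2b_1\sum xy + n b_0^2 + 2 b_0 b_1 \sum x + b_1^2 \sum x^2\). The sketch below evaluates that expression, assuming the totals quoted in the correlation step; `sse_from_sums` is a helper name introduced here, not part of the textbook solution.

```python
# Totals quoted in the correlation step of this solution.
n = 23
sum_x, sum_y = 133.5, 255.4
sum_xy = 2072.23
sum_x2, sum_y2 = 1011.45, 5130.14


def sse_from_sums(b0, b1):
    """Sum of squared residuals of y-hat = b0 + b1*x, expanded in totals."""
    return (sum_y2 - 2 * b0 * sum_y - 2 * b1 * sum_xy
            + n * b0 ** 2 + 2 * b0 * b1 * sum_x + b1 ** 2 * sum_x2)


# Least-squares coefficients, kept unrounded.
b1_fit = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
b0_fit = sum_y / n - b1_fit * sum_x / n

sse_fit = sse_from_sums(b0_fit, b1_fit)
sse_given = sse_from_sums(-3.0, 2.5)

print(f"best-fit line:     {sse_fit:.2f}")
print(f"y-hat = -3 + 2.5x: {sse_given:.2f}")
```

Any other choice of intercept and slope passed to `sse_from_sums` should return a value no smaller than the best-fit result, which is what the least-squares property asserts.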

Most popular questions from this chapter

Interpreting a Computer Display. In Exercises 9–12, refer to the display obtained by using the paired data consisting of Florida registered boats (tens of thousands) and numbers of manatee deaths from encounters with boats in Florida for different recent years (from Data Set 10 in Appendix B). Along with the paired boat, manatee sample data, StatCrunch was also given the value of 85 (tens of thousands) boats to be used for predicting manatee fatalities.

Testing for Correlation Use the information provided in the display to determine the value of the linear correlation coefficient. Is there sufficient evidence to support a claim of a linear correlation between numbers of registered boats and numbers of manatee deaths from encounters with boats?

Cigarette Nicotine and Carbon Monoxide Refer to the table of data given in Exercise 1 and use the amounts of nicotine and carbon monoxide (CO).

a. Construct a scatterplot using nicotine for the x scale, or horizontal axis. What does the scatterplot suggest about a linear correlation between amounts of nicotine and carbon monoxide?

b. Find the value of the linear correlation coefficient and determine whether there is sufficient evidence to support a claim of a linear correlation between amounts of nicotine and carbon monoxide.

c. Letting y represent the amount of carbon monoxide and letting x represent the amount of nicotine, find the regression equation.

d. The Raleigh brand king size cigarette is not included in the table, and it has 1.3 mg of nicotine. What is the best predicted amount of carbon monoxide?

| Tar      | 25  | 27  | 20  | 24  | 20  | 20  | 21  | 24  |
| CO       | 18  | 16  | 16  | 16  | 16  | 16  | 14  | 17  |
| Nicotine | 1.5 | 1.7 | 1.1 | 1.6 | 1.1 | 1.0 | 1.2 | 1.4 |

Explore! Exercises 9 and 10 provide two data sets from “Graphs in Statistical Analysis,” by F. J. Anscombe, The American Statistician, Vol. 27. For each exercise,

a. Construct a scatterplot.

b. Find the value of the linear correlation coefficient r, then determine whether there is sufficient evidence to support the claim of a linear correlation between the two variables.

c. Identify the feature of the data that would be missed if part (b) was completed without constructing the scatterplot.

| x | 10   | 8    | 13   | 9    | 11   | 14   | 6    | 4    | 12   | 7    | 5    |
| y | 9.14 | 8.14 | 8.74 | 8.77 | 9.26 | 8.10 | 6.13 | 3.10 | 9.13 | 7.26 | 4.74 |

a. What is a residual?

b. In what sense is the regression line the straight line that “best” fits the points in a scatterplot?

Exercise 12 (Section 10-1). Clusters. Refer to the following Minitab-generated scatterplot. The four points in the lower left corner are measurements from women, and the four points in the upper right corner are from men.

a. Examine the pattern of the four points in the lower left corner (from women) only, and subjectively determine whether there appears to be a correlation between x and y for women.

b. Examine the pattern of the four points in the upper right corner (from men) only, and subjectively determine whether there appears to be a correlation between x and y for men.

c. Find the linear correlation coefficient using only the four points in the lower left corner (for women). Will the four points in the upper right corner (for men) have the same linear correlation coefficient?

d. Find the value of the linear correlation coefficient using all eight points. What does that value suggest about the relationship between x and y?

e. Based on the preceding results, what do you conclude? Should the data from women and the data from men be considered together, or do they appear to represent two different and distinct populations that should be analyzed separately?
