In Exercises 9 and 10, use the given data to find the equation of the regression line. Examine the scatterplot and identify a characteristic of the data that is ignored by the regression line.

Short Answer

The regression equation is \(\hat y = 3.00 + 0.500x\).

The scatterplot shows that the data do not follow a straight-line pattern. This is the characteristic that the regression line ignores.

Step by step solution

01

Given information

Values are given for two variables, x and y, in n = 11 data pairs.

02

Calculate the mean of x and y

The mean value of x is given as,

\(\begin{array}{c}\bar x = \frac{{\sum\limits_{i = 1}^n {{x_i}} }}{n}\\ = \frac{{10 + 8 + .... + 5}}{{11}}\\ = 9\end{array}\)

Therefore, the mean value of x is 9.

The mean value of y is given as,

\(\begin{array}{c}\bar y = \frac{{\sum\limits_{i = 1}^n {{y_i}} }}{n}\\ = \frac{{9.14 + 8.14 + .... + 4.74}}{{11}}\\ = 7.5009\end{array}\)

Therefore, the mean value of y is 7.5009.
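The two means can be checked with a few lines of Python. This is a minimal sketch, not part of the original solution: the lists `x` and `y` hold the 11 data pairs listed in the exercise statement, and the variable names are chosen here for illustration.

```python
# Data pairs from the exercise statement (11 observations).
x = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y = [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]

n = len(x)

# Sample mean: sum of the values divided by the number of values.
x_bar = sum(x) / n
y_bar = sum(y) / n

print("n =", n)
print("mean of x =", round(x_bar, 4))
print("mean of y =", round(y_bar, 4))
```

The rounded results should agree with the values 9 and 7.5009 obtained above.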

03

Calculate the standard deviation of x and y

The standard deviation of x is given as,

\(\begin{array}{c}{s_x} = \sqrt {\frac{{\sum\limits_{i = 1}^n {{{({x_i} - \bar x)}^2}} }}{{n - 1}}} \\ = \sqrt {\frac{{{{\left( {10 - 9} \right)}^2} + {{\left( {8 - 9} \right)}^2} + ..... + {{\left( {5 - 9} \right)}^2}}}{{11 - 1}}} \\ = 3.3166\end{array}\)

Therefore, the standard deviation of x is 3.3166.

The standard deviation of y is given as,

\(\begin{array}{c}{s_y} = \sqrt {\frac{{\sum\limits_{i = 1}^n {{{({y_i} - \bar y)}^2}} }}{{n - 1}}} \\ = \sqrt {\frac{{{{\left( {9.14 - 7.5} \right)}^2} + {{\left( {8.14 - 7.5} \right)}^2} + ..... + {{\left( {4.74 - 7.5} \right)}^2}}}{{11 - 1}}} \\ = 2.0317\end{array}\)

Therefore, the standard deviation of y is 2.0317.
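As a cross-check, the sample standard deviations (with the n − 1 divisor used above) can be computed with Python's standard `statistics` module. This is an illustrative sketch; the variable names are assumptions made here, not part of the original solution.

```python
import statistics

# Data pairs from the exercise statement (11 observations).
x = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y = [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]

# statistics.stdev computes the sample standard deviation (divides by n - 1).
s_x = statistics.stdev(x)
s_y = statistics.stdev(y)

print("s_x =", round(s_x, 4))
print("s_y =", round(s_y, 4))
```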

04

Calculate the correlation coefficient

The correlation coefficient is given as,

\(r = \frac{{n\left( {\sum {xy} } \right) - \left( {\sum x } \right)\left( {\sum y } \right)}}{{\sqrt {\left( {\left( {n\sum {{x^2}} } \right) - {{\left( {\sum x } \right)}^2}} \right)\left( {\left( {n\sum {{y^2}} } \right) - {{\left( {\sum y } \right)}^2}} \right)} }}\)

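The sums required to compute the correlation coefficient are \(\sum x = 99\), \(\sum y = 82.51\), \(\sum xy = 797.59\), \(\sum {x^2} = 1001\), and \(\sum {y^2} = 660.1763\), with \(n = 11\).

Substituting these values gives,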

\(\begin{array}{l}r = \frac{{n\left( {\sum {xy} } \right) - \left( {\sum x } \right)\left( {\sum y } \right)}}{{\sqrt {\left( {\left( {n\sum {{x^2}} } \right) - {{\left( {\sum x } \right)}^2}} \right)\left( {\left( {n\sum {{y^2}} } \right) - {{\left( {\sum y } \right)}^2}} \right)} }}\\ = \frac{{11\left( {797.59} \right) - \left( {99} \right)\left( {82.51} \right)}}{{\sqrt {\left( {\left( {11 \times 1001} \right) - {{\left( {99} \right)}^2}} \right)\left( {\left( {11 \times 660.1763} \right) - {{\left( {82.51} \right)}^2}} \right)} }}\\ = 0.8162\end{array}\)

Therefore, the correlation coefficient is 0.8162.
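The intermediate sums and the resulting value of r can be reproduced as follows. This is a sketch under the assumption that the data are the 11 pairs from the exercise statement; the variable names are illustrative.

```python
import math

# Data pairs from the exercise statement (11 observations).
x = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y = [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]

n = len(x)

# Sums that appear in the computational formula for r.
sum_x = sum(x)
sum_y = sum(y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_x2 = sum(xi ** 2 for xi in x)
sum_y2 = sum(yi ** 2 for yi in y)

# r = [n*Sxy - Sx*Sy] / sqrt([n*Sxx - Sx^2] * [n*Syy - Sy^2])
numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator

print("sum x =", sum_x, " sum y =", round(sum_y, 2))
print("sum xy =", round(sum_xy, 2))
print("sum x^2 =", sum_x2, " sum y^2 =", round(sum_y2, 4))
print("r =", round(r, 4))
```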

05

Calculate the slope of the regression line

The slope of the regression line is given as,

\(\begin{array}{c}{b_1} = r\frac{{{s_y}}}{{{s_x}}}\\ = 0.8162 \times \frac{{2.0317}}{{3.3166}}\\ = 0.500\end{array}\)

Therefore, the slope is 0.500.

06

Calculate the intercept of the regression line

The intercept is computed as,

\(\begin{array}{c}{b_0} = \bar y - {b_1}\bar x\\ = 7.5009 - \left( {0.500 \times 9} \right)\\ = 3.0009\end{array}\)

Therefore, the intercept is approximately 3.00.

07

Form a regression equation

The regression equation is given as,

\(\begin{array}{c}\hat y = {b_0} + {b_1}x\\ = 3.0009 + 0.500x\end{array}\)

Thus, the regression equation is \(\hat y = 3.00 + 0.500x\).
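The slope and intercept can be verified numerically. The sketch below follows the same route as the solution (slope from r, s_y, and s_x; intercept from the means) and then cross-checks the slope with the direct least-squares formula, which is an addition made here and not part of the original solution. All variable names are illustrative.

```python
import math
import statistics

# Data pairs from the exercise statement (11 observations).
x = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y = [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]

x_bar = statistics.mean(x)
y_bar = statistics.mean(y)
s_x = statistics.stdev(x)
s_y = statistics.stdev(y)

# Sums of products of deviations from the means.
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_yy = sum((yi - y_bar) ** 2 for yi in y)

# Correlation coefficient (deviation form, equivalent to the formula above).
r = s_xy / math.sqrt(s_xx * s_yy)

# Slope and intercept as computed in the solution.
b1 = r * s_y / s_x
b0 = y_bar - b1 * x_bar

# Cross-check: direct least-squares slope.
b1_direct = s_xy / s_xx

print(f"y-hat = {b0:.2f} + {b1:.3f}x")
```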

08

Construct a scatter plot

Use the following steps to construct a scatterplot of y against x:

  • Take x as the horizontal variable and y as the vertical variable.
  • Mark the values 0, 5, and so on until 15 on the horizontal axis.
  • Mark the values 0, 1, and so on until 10 on the vertical axis.
  • Plot a point for each of the 11 pairs of values of the two variables.
  • Label the horizontal axis as “x” and the vertical axis as “y”.

Scatterplot: y (vertical axis, 0 to 10) against x (horizontal axis, 0 to 15) for the 11 data pairs.
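A rough text version of the scatterplot can be produced without any plotting library. This is only a coarse sketch: the helper `ascii_scatter` is a hypothetical function written for illustration, and each y value is rounded to the nearest integer so that it fits on a character grid using the axis ranges from the steps above.

```python
# Data pairs from the exercise statement (11 observations).
x = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y = [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]


def ascii_scatter(xs, ys, x_max=15, y_max=10):
    """Return a text scatterplot: one row per unit of y, one column per unit of x."""
    rows = []
    for level in range(y_max, -1, -1):
        cells = [" "] * (x_max + 1)
        for xi, yi in zip(xs, ys):
            # Place a marker in the row whose level is nearest to the y value.
            if round(yi) == level:
                cells[xi] = "*"
        rows.append(f"{level:>2} |" + "".join(cells))
    # Horizontal axis drawn under the grid.
    rows.append("   +" + "-" * (x_max + 1))
    return "\n".join(rows)


plot = ascii_scatter(x, y)
print(plot)
```

Even at this resolution the markers rise, level off, and then fall again as x increases, rather than lining up along a straight line.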

09

State the characteristic ignored in the data

It can be observed from the above scatterplot that the pattern of observations is not a straight line; the points follow a curve. The regression line ignores this characteristic of the data.
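One way to see the curvature numerically is to look at the residuals \(y - \hat y\) in order of increasing x. This check is a supplementary sketch, not part of the original solution; it assumes the rounded regression equation \(\hat y = 3.00 + 0.500x\) found above, and the variable names are illustrative.

```python
# Data pairs from the exercise statement (11 observations).
x = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y = [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]

# Rounded regression coefficients from the solution.
b0, b1 = 3.00, 0.500

# Residual = observed y minus predicted y, listed in order of increasing x.
pairs = sorted(zip(x, y))
residuals = [yi - (b0 + b1 * xi) for xi, yi in pairs]

# A string of signs makes the systematic pattern easy to see.
signs = "".join("+" if res > 0 else "-" for res in residuals)

for (xi, _), res in zip(pairs, residuals):
    print(f"x = {xi:>2}  residual = {res:+.2f}")
print("sign pattern:", signs)
```

If the straight line described the data well, positive and negative residuals would be scattered without a pattern. Here they are negative at both ends of the x range and positive in the middle, which is what a curved relationship produces.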

Most popular questions from this chapter

Stocks and Sunspots. Listed below are annual high values of the Dow Jones Industrial Average (DJIA) and annual mean sunspot numbers for eight recent years. Use the data for Exercises 1–5. A sunspot number is a measure of sunspots or groups of sunspots on the surface of the sun. The DJIA is a commonly used index that is a weighted mean calculated from different stock values.

DJIA | 14,198 | 13,338 | 10,606 | 11,625 | 12,929 | 13,589 | 16,577 | 18,054
Sunspot Number | 7.5 | 2.9 | 3.1 | 16.5 | 55.7 | 57.6 | 64.7 | 79.3

Hypothesis Test The mean sunspot number for the past three centuries is 49.7. Use a 0.05 significance level to test the claim that the eight listed sunspot numbers are from a population with a mean equal to 49.7.

Stocks and Sunspots. Use the DJIA and sunspot number data listed above.

Confidence Interval Construct a 95% confidence interval estimate of the mean sunspot number. Write a brief statement interpreting the confidence interval.

Confidence Interval for a Regression Coefficient. A confidence interval for the regression coefficient \({b_1}\) is expressed as

\({b_1} - E < {\beta _1} < {b_1} + E\)

where

\(E = {t_{\frac{\alpha }{2}}}{s_{{b_1}}}\)

The critical t score is found using n − (k + 1) degrees of freedom, where k, n, and \({s_{{b_1}}}\) are described in Exercise 17. Using the sample data from Example 1, n = 153 and k = 2, so df = 150 and the critical t scores are \( \pm \)1.976 for a 95% confidence level. Use the sample data for Example 1, the Statdisk display in Example 1 on page 513, and the StatCrunch display in Exercise 17 to construct 95% confidence interval estimates of \({\beta _1}\) (the coefficient for the variable representing height) and \({\beta _2}\) (the coefficient for the variable representing waist circumference). Does either confidence interval include 0, suggesting that the variable be eliminated from the regression equation?

Cigarette Nicotine and Carbon Monoxide Refer to the table of data given in Exercise 1 and use the amounts of nicotine and carbon monoxide (CO).

a. Construct a scatterplot using nicotine for the x scale, or horizontal axis. What does the scatterplot suggest about a linear correlation between amounts of nicotine and carbon monoxide?

b. Find the value of the linear correlation coefficient and determine whether there is sufficient evidence to support a claim of a linear correlation between amounts of nicotine and carbon monoxide.

c. Letting y represent the amount of carbon monoxide and letting x represent the amount of nicotine, find the regression equation.

d. The Raleigh brand king size cigarette is not included in the table, and it has 1.3 mg of nicotine. What is the best predicted amount of carbon monoxide?

Tar | 25 | 27 | 20 | 24 | 20 | 20 | 21 | 24
CO | 18 | 16 | 16 | 16 | 16 | 16 | 14 | 17
Nicotine | 1.5 | 1.7 | 1.1 | 1.6 | 1.1 | 1.0 | 1.2 | 1.4

Explore! Exercises 9 and 10 provide two data sets from “Graphs in Statistical Analysis,” by F. J. Anscombe, The American Statistician, Vol. 27. For each exercise,

a. Construct a scatterplot.

b. Find the value of the linear correlation coefficient r, then determine whether there is sufficient evidence to support the claim of a linear correlation between the two variables.

c. Identify the feature of the data that would be missed if part (b) was completed without constructing the scatterplot.

x | 10 | 8 | 13 | 9 | 11 | 14 | 6 | 4 | 12 | 7 | 5
y | 9.14 | 8.14 | 8.74 | 8.77 | 9.26 | 8.10 | 6.13 | 3.10 | 9.13 | 7.26 | 4.74
