Testing for a Linear Correlation. In Exercises 13–28, construct a scatterplot, and find the value of the linear correlation coefficient r. Also find the P-value or the critical values of r from Table A-6. Use a significance level of \(\alpha = 0.05\). Determine whether there is sufficient evidence to support a claim of a linear correlation between the two variables. (Save your work because the same data sets will be used in Section 10-2 exercises.)

Tips. Listed below are amounts of bills for dinner and the amounts of the tips that were left. The data were collected by students of the author. Is there sufficient evidence to conclude that there is a linear correlation between the bill amounts and the tip amounts? If everyone were to tip with the same percentage, what should be the value of r?

| Bill (dollars) | 33.46 | 50.68 | 87.92 | 98.84 | 63.6 | 107.34 |
|---|---|---|---|---|---|---|
| Tip (dollars) | 5.5 | 5 | 8.08 | 17 | 12 | 16 |

Short Answer

The scatterplot is shown below:

The value of the correlation coefficient is r = 0.828.

The p-value is 0.042.

Because the p-value is less than the significance level of 0.05, there is sufficient evidence to support the existence of a linear correlation between bill amounts and tip amounts.

If everyone tipped the same percentage of the bill, the value of r would be 1.

Step by step solution

Step 1: Given information

The data list the amounts of the bills and the tips given.

| Bill (dollars) | Tip (dollars) |
|---|---|
| 33.46 | 5.5 |
| 50.68 | 5 |
| 87.92 | 8.08 |
| 98.84 | 17 |
| 63.6 | 12 |
| 107.34 | 16 |

Step 2: Sketch a scatterplot

A scatterplot is a two-dimensional graph in which each pair of observations is plotted as a single point.

Steps to sketch a scatterplot:

  1. Label the axes: bill amount on the horizontal axis and tip amount on the vertical axis.
  2. Mark each (bill, tip) pair as a point on the graph.

The scatterplot is shown below.
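As a rough stand-in for a drawn graph, the two steps above can be sketched in plain Python. This is only an illustrative text plot: the function name `text_scatter` and the grid size are assumptions made here, not part of the original solution.

```python
# Paired data from the exercise: bill amounts (x) and tips (y), in dollars.
bills = [33.46, 50.68, 87.92, 98.84, 63.6, 107.34]
tips = [5.5, 5, 8.08, 17, 12, 16]


def text_scatter(xs, ys, width=40, height=12):
    """Return text rows with '*' marking each (x, y) pair.

    The grid size (width x height) is an arbitrary choice for illustration.
    """
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        # Scale each value to a column/row index inside the grid.
        col = round((x - x_min) / (x_max - x_min) * (width - 1))
        row = round((y - y_min) / (y_max - y_min) * (height - 1))
        # Row 0 is printed first, so flip vertically to put large tips on top.
        grid[height - 1 - row][col] = "*"
    return ["".join(cells) for cells in grid]


plot_rows = text_scatter(bills, tips)
for line in plot_rows:
    print("|" + line)
print("+" + "-" * 40)
print(" horizontal axis: bill (dollars), vertical axis: tip (dollars)")
```

The points rise from lower left to upper right, which is the pattern expected when larger bills tend to come with larger tips.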

Step 3: Compute the correlation coefficient

The correlation coefficient formula is

\(r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{\sqrt{n\left(\sum x^2\right) - \left(\sum x\right)^2}\,\sqrt{n\left(\sum y^2\right) - \left(\sum y\right)^2}}\).

Let variable x be the bill amount, and y be the tip amount.

The values are listed in the table below:

| x | y | \(x^2\) | \(y^2\) | \(xy\) |
|---|---|---|---|---|
| 33.46 | 5.5 | 1119.57 | 30.25 | 184.03 |
| 50.68 | 5 | 2568.46 | 25 | 253.4 |
| 87.92 | 8.08 | 7729.93 | 65.2864 | 710.39 |
| 98.84 | 17 | 9769.35 | 289 | 1680.28 |
| 63.6 | 12 | 4044.96 | 144 | 763.2 |
| 107.34 | 16 | 11521.88 | 256 | 1717.44 |
| \(\sum x = 441.84\) | \(\sum y = 63.58\) | \(\sum x^2 = 36754.14\) | \(\sum y^2 = 809.5364\) | \(\sum xy = 5308.744\) |

Substitute the values in the formula:

\(\begin{aligned} r &= \frac{{6\left( {5308.744} \right) - \left( {441.84} \right)\left( {63.58} \right)}}{{\sqrt {6\left( {36754.14} \right) - {{\left( {441.84} \right)}^2}} \sqrt {6\left( {809.5364} \right) - {{\left( {63.58} \right)}^2}} }}\\ &= 0.828\end{aligned}\)

Thus, the correlation coefficient is 0.828.
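The arithmetic can be checked with a short script. The sketch below applies the same sums-based formula to the six data pairs; the function name `pearson_r` is chosen here for illustration only.

```python
import math

# Paired data from the exercise: bill amounts (x) and tips (y), in dollars.
bills = [33.46, 50.68, 87.92, 98.84, 63.6, 107.34]
tips = [5.5, 5, 8.08, 17, 12, 16]


def pearson_r(xs, ys):
    """Linear correlation coefficient computed from the five sums."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    numerator = n * sum_xy - sum_x * sum_y
    denominator = (math.sqrt(n * sum_x2 - sum_x ** 2)
                   * math.sqrt(n * sum_y2 - sum_y ** 2))
    return numerator / denominator


r = pearson_r(bills, tips)
print(f"r = {r:.3f}")  # r = 0.828
```

Working from the unrounded sums avoids the small rounding differences that can creep in when the tabulated two-decimal values are used.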

Step 4: Conduct a hypothesis test for correlation

Let \(\rho\) denote the true (population) linear correlation coefficient for the two variables.

To test the claim, form the hypotheses:

\(\begin{array}{l}{H_0}:\rho = 0\\{H_a}:\rho \ne 0\end{array}\)

The sample size is n = 6.

The test statistic is computed as follows:

\(\begin{aligned} t &= \frac{r}{\sqrt{\frac{1 - r^2}{n - 2}}}\\ &= \frac{0.828}{\sqrt{\frac{1 - \left(0.828\right)^2}{6 - 2}}}\\ &= 2.953\end{aligned}\)

Thus, the test statistic is 2.953.

The degrees of freedom are calculated below:

\(\begin{aligned} df &= n - 2\\ &= 6 - 2\\ &= 4\end{aligned}\)

The p-value for this two-tailed test is obtained from the t distribution with 4 degrees of freedom:

\(\begin{aligned} \text{p-value} &= 2P\left(T > 2.953\right)\\ &= 0.042\end{aligned}\)

Thus, the p-value is 0.042.

Since the p-value is less than 0.05, the null hypothesis is rejected.

Therefore, there is sufficient evidence to support a linear correlation between the bill amount and the tip left.
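The test can be reproduced without a printed table. The sketch below is one possible approach, not the method used in the solution: it integrates the t density numerically with Simpson's rule to get the two-tailed p-value. The helper names are illustrative, and the critical t value 2.776 (df = 4, two tails, 0.05) is taken as a standard table value and converted to the matching critical value of r.

```python
import math


def t_statistic(r, n):
    """Test statistic for H0: rho = 0."""
    return r / math.sqrt((1 - r ** 2) / (n - 2))


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value via Simpson's rule on [0, |t|] (steps must be even)."""
    b = abs(t)
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3  # P(0 < T < |t|)
    return 1 - 2 * central


def critical_r(t_crit, df):
    """Convert a critical t value into the equivalent critical value of r."""
    return t_crit / math.sqrt(t_crit ** 2 + df)


n = 6
t_stat = t_statistic(0.828, n)
p_value = two_tailed_p(t_stat, n - 2)
r_crit = critical_r(2.776, n - 2)  # 2.776: assumed table value for df = 4

print(f"t = {t_stat:.3f}, p-value = {p_value:.3f}, critical r = +/-{r_crit:.3f}")
```

Both routes lead to the same decision: the p-value is below 0.05, and r = 0.828 lies beyond the critical value of roughly 0.811.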

Step 5: Determine the value of r when everyone tips the same percentage

Assume each person tips the same percentage (say c percent, with c > 0) of the bill amount.

Then,

\(\text{Tip}\left(y\right) = c\%\;\text{of bill}\left(x\right) \Rightarrow y = \frac{c}{100} \times x\).

This is the equation of a straight line, \(y = ax + b\), with slope \(a = \frac{c}{100}\) and intercept \(b = 0\).

Hence, all observations lie exactly on this line.

Because the points are collinear and the slope is positive, the value of r is 1. This means there is a perfect positive linear relationship between x and y.
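A quick numerical check of this claim: the sketch below keeps the six bill amounts from the exercise but replaces the tips with a fixed percentage of each bill. The 18 percent rate is a hypothetical value chosen only for illustration; any positive rate gives the same result.

```python
import math

# Bill amounts from the exercise, in dollars.
bills = [33.46, 50.68, 87.92, 98.84, 63.6, 107.34]

percent = 18  # hypothetical tipping rate, not from the original data
same_rate_tips = [percent / 100 * bill for bill in bills]


def pearson_r(xs, ys):
    """Linear correlation coefficient computed from the five sums."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    numerator = n * sum_xy - sum_x * sum_y
    denominator = (math.sqrt(n * sum_x2 - sum_x ** 2)
                   * math.sqrt(n * sum_y2 - sum_y ** 2))
    return numerator / denominator


r_same_rate = pearson_r(bills, same_rate_tips)
print(round(r_same_rate, 6))
```

Up to floating-point rounding, the result is 1, matching the argument that points on a line with positive slope are perfectly correlated.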


