Managing diabetes. People with diabetes measure their fasting plasma glucose (FPG, measured in milligrams per milliliter) after fasting for at least 8 hours. Another measurement, made at regular medical checkups, is called HbA. This is roughly the percent of red blood cells that have a glucose molecule attached. It measures average exposure to glucose over a period of several months. The table gives data on both HbA and FPG for 18 diabetics five months after they had completed a diabetes education class.

a. Make a scatterplot with HbA as the explanatory variable. Describe what you see.

b. Subject 18 is an outlier in the x-direction. What effect do you think this subject has on the correlation? What effect do you think this subject has on the equation of the least-squares regression line? Calculate the correlation and equation of the least-squares regression line with and without this subject to confirm your answer.

c. Subject 15 is an outlier in the y-direction. What effect do you think this subject has on the correlation? What effect do you think this subject has on the equation of the least-squares regression line? Calculate the correlation and equation of the least-squares regression line with and without this subject to confirm your answer.

Short Answer

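Part (a) The scatterplot shows a weak, positive linear relationship: the points trend upward but are widely scattered.

Part (b) Subject 18 increases the correlation (0.4506 with the subject, 0.3837 without) but has little effect on the regression line.

Part (c) Subject 15 decreases the correlation (0.4506 with the subject, 0.5684 without) and makes the regression line steeper.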

Step by step solution

01

Part (a) Step 1: Given information

02

Part (a) Step 2: Explanation

In the scatterplot with HbA as the explanatory variable (horizontal axis) and FPG as the response variable (vertical axis), the points trend upward from left to right, so the scatterplot shows a positive linear association. Because the points are widely scattered about that trend, the association is weak.

03

Part (b) Step 1: Calculation

We use Excel functions to calculate the correlation.

First, enter the data into an Excel worksheet, then apply the following functions.

The CORREL function returns the correlation coefficient of the array1 and array2 cell ranges. Its syntax is:

CORREL(array1,array2)

The AVERAGE function returns the arithmetic mean of its arguments, here the cells of a single range. Its syntax is:

AVERAGE(number1,[number2],...)
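For readers working outside Excel, the two functions can be sketched in Python. This is a minimal illustration, assuming plain lists of numbers; the function names `average` and `correl` and the sample values are hypothetical stand-ins, not the table from the exercise.

```python
import math

def average(values):
    """Arithmetic mean, like Excel's AVERAGE over one range."""
    return sum(values) / len(values)

def correl(xs, ys):
    """Pearson correlation coefficient, like Excel's CORREL(array1, array2)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    x_bar, y_bar = average(xs), average(ys)
    # Sums of cross-products and squared deviations about the means.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical values for illustration only (not the exercise's data).
hba = [6.0, 7.5, 8.0, 9.5, 11.0]
fpg = [120.0, 150.0, 140.0, 190.0, 200.0]
r = correl(hba, fpg)
print(f"r = {r:.4f}")
```

Dropping one subject from both lists and calling `correl` again gives the "without outlier" correlation in the same way that shrinking the cell ranges does in Excel.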

For the case with the outlier (subject 18 included):

Thus, the calculation will be as:

Correlation=CORREL(H1:H18,I1:I18)

And the result will be as:

Correlation=0.4506

Thus, the slope will be,

$$b = r\,\frac{s_y}{s_x} = 0.4506 \times \frac{81.4827}{3.2619} = 11.2569$$

And the y-intercept will be,

$$a = \bar{y} - b\,\bar{x} = 172.3846 - 11.2569 \times 10.3769 = 55.5726$$

The regression line will be:

$$\hat{y} = 55.5726 + 11.2569x$$
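The arithmetic in this step can be checked with a short Python sketch. It takes the summary statistics quoted in the solution (the correlation, the two standard deviations, and the two means) as given and applies the slope and intercept formulas; the function name `line_from_summary` is a hypothetical helper. Because the inputs are rounded to four decimals, the results agree with the values above only to about two decimal places.

```python
def line_from_summary(r, s_x, s_y, x_bar, y_bar):
    """Return (slope, intercept) of the least-squares line from summary statistics."""
    slope = r * s_y / s_x              # b = r * s_y / s_x
    intercept = y_bar - slope * x_bar  # a = y-bar - b * x-bar
    return slope, intercept

# Summary statistics quoted in the solution for the data including subject 18.
b, a = line_from_summary(r=0.4506, s_x=3.2619, s_y=81.4827,
                         x_bar=10.3769, y_bar=172.3846)
print(f"slope = {b:.3f}, intercept = {a:.3f}")
```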

For the case without the outlier (subject 18 removed):

Thus, the calculation will be as:

Correlation=CORREL(H1:H17,I1:I17)

And the result will be as:

Correlation=0.3837

Thus, the slope will be,

$$b = r\,\frac{s_y}{s_x} = 0.3837 \times \frac{68.9020}{2.1821} = 12.1158$$

And the y-intercept will be,

$$a = \bar{y} - b\,\bar{x} = 157.8824 - 12.1158 \times 8.7176 = 52.2615$$

04

Part (b) Step 2: Calculation

The regression line will be:

$$\hat{y} = 52.2615 + 12.1158x$$

As a result, the correlation with the outlier (0.4506) is greater than the correlation without it (0.3837). Because subject 18 follows the same linear trend as the other points in the scatterplot, the outlier strengthens the correlation. The two regression lines are similar: the slope changes only from 11.2569 to 12.1158 and the y-intercept from 55.5726 to 52.2615, so the outlier has little effect on the regression line.
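One way to judge how close the two lines are is to compare their predictions at a few HbA values. The sketch below uses the coefficients computed in this part (intercept 52.2615 and slope 12.1158 for the fit without subject 18); the HbA values are illustrative assumptions, not data from the table.

```python
def predict(intercept, slope, x):
    """Predicted FPG from a fitted line y-hat = intercept + slope * x."""
    return intercept + slope * x

with_18 = (55.5726, 11.2569)     # (intercept, slope) including subject 18
without_18 = (52.2615, 12.1158)  # (intercept, slope) excluding subject 18

# Illustrative HbA values (assumed for this sketch, not taken from the table).
for hba in (6.0, 9.0, 12.0):
    y_with = predict(*with_18, hba)
    y_without = predict(*without_18, hba)
    print(f"HbA {hba:4.1f}: with = {y_with:6.1f}, without = {y_without:6.1f}, "
          f"gap = {y_without - y_with:+5.1f}")
```

At these values the two predictions differ by less than 10, which is small compared with the standard deviations of FPG used in the slope calculations (about 69 to 81).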

05

Part (c) Step 1: Explanation

As in part (b), enter the data into an Excel worksheet and use CORREL(array1,array2) for the correlation and AVERAGE(number1,[number2],...) for the means.

For the case with the outlier (subject 15 included):

Thus, the calculation will be as:

Correlation=CORREL(H1:H18,I1:I18)

And the result will be as:

Correlation=0.4506

Thus, the slope will be,

$$b = r\,\frac{s_y}{s_x} = 0.4506 \times \frac{81.4827}{3.2619} = 11.2569$$

And the y-intercept will be,

$$a = \bar{y} - b\,\bar{x} = 172.3846 - 11.2569 \times 10.3769 = 55.5726$$

The regression line will be:

$$\hat{y} = 55.5726 + 11.2569x$$

For the case without the outlier (subject 15 removed):

Thus, the calculation will be as:

Correlation=CORREL(H1:H17,I1:I17)

And the result will be as:

Correlation=0.5684

Thus, the slope will be,

$$b = r\,\frac{s_y}{s_x} = 0.5684 \times \frac{52.6231}{3.3531} = 8.9204$$

And the y-intercept will be,

$$a = \bar{y} - b\,\bar{x} = 151.7647 - 8.9204 \times 9.2235 = 69.4872$$

06

Part (c) Step 2: Explanation

The regression line will be:

$$\hat{y} = 69.4872 + 8.9204x$$

As a result, the correlation with the outlier (0.4506) is lower than the correlation without it (0.5684). The outlier reduces the correlation because subject 15 deviates from the general linear pattern of the other points. The regression line with the outlier is also steeper (slope 11.2569) than the line without it (slope 8.9204), so the outlier makes the regression line steeper.
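The with-and-without comparison used in parts (b) and (c) can be automated. The following Python sketch fits the line twice, once on all points and once with a chosen point dropped. The data are hypothetical (the exercise's table is not reproduced here), constructed so that the last point sits far above the pattern of the others, and `fit` and `drop` are illustrative helper names.

```python
import math

def fit(xs, ys):
    """Return (r, slope, intercept) for the least-squares line of ys on xs."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return sxy / math.sqrt(sxx * syy), slope, y_bar - slope * x_bar

def drop(values, index):
    """Copy of values with the item at position index removed."""
    return values[:index] + values[index + 1:]

# Hypothetical data: the last point is an outlier in the y-direction.
xs = [6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 10.5]
ys = [110.0, 125.0, 150.0, 160.0, 185.0, 200.0, 330.0]

r_all, b_all, a_all = fit(xs, ys)
r_cut, b_cut, a_cut = fit(drop(xs, 6), drop(ys, 6))
print(f"all points:      r = {r_all:.3f}, slope = {b_all:.2f}")
print(f"outlier dropped: r = {r_cut:.3f}, slope = {b_cut:.2f}")
```

In this made-up example, dropping the outlier raises the correlation and flattens the line, the same direction of change reported for subject 15 above.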
