Managing diabetes

People with diabetes measure their fasting plasma glucose (FPG; measured in units of milligrams per milliliter) after fasting for at least 8 hours. Another measurement, made at regular medical checkups, is called HbA. This is roughly the percent of red blood cells that have a glucose molecule attached. It measures average exposure to glucose over a period of several months. The table below gives data on both HbA and FPG for 18 diabetics five months after they had completed a diabetes education class.

(a) Make a scatterplot with HbA as the explanatory variable. There is a positive linear relationship, but it is surprisingly weak.

(b) Subject 15 is an outlier in the y-direction. Subject 18 is an outlier in the x-direction. Find the correlation for all 18 subjects, for all except Subject 15, and for all except Subject 18. Are either or both of these subjects influential for the correlation? Explain in simple language why r changes in opposite directions when we remove each of these points.

(c) Add three regression lines for predicting FPG from HbA to your scatterplot: for all 18 subjects, for all except Subject 15, and for all except Subject 18. Is either Subject 15 or Subject 18 strongly influential for the least-squares line? Explain in simple language what features of the scatterplot explain the degree of influence.

Short Answer


Part (b) The correlation r with all 18 subjects is r=0.482

The correlation r without subject 15 is r=0.568

The correlation r without subject 18 is r=0.384

Part (c) Subject 15 and subject 18 both are influential.


Step by step solution

01

Part (a) Step 1: Given information

Subject   HbA (%)   FPG (mg/ml)      Subject   HbA (%)   FPG (mg/ml)
1         6.1       141              10        8.7       172
2         6.3       158              11        9.4       200
3         6.4       112              12        10.4      271
4         6.8       153              13        10.6      103
5         7.0       134              14        10.7      172
6         7.1       95               15        10.7      359
7         7.5       96               16        11.2      145
8         7.7       78               17        13.7      147
9         7.9       148              18        19.3      255

02

Part (a) Step 2: Concept

Linear regression is commonly used for predictive analysis and modeling.

03

Part (a) Step 3: Explanation

Set the horizontal axis for HbA (the explanatory variable) and the vertical axis for FPG (the response variable).

The scatterplot for the supplied data was produced using Minitab.

The general pattern moves from the bottom left to the upper right of the graph. That is, people with a higher HbA tend to have a higher FPG. This is referred to as a positive relationship between the two variables. The relationship is roughly linear: the general pattern runs from bottom left to upper right along a straight line. Because the points deviate greatly from the line and there are some outliers, the relationship is weak. Therefore, the required scatterplot is drawn.

04

Part (b) Step 1: Calculation

The correlation r with all 18 subjects, computed using Minitab, is r=0.482

Without Subject 15 the correlation coefficient is r=0.568

Without Subject 18 the correlation coefficient is r=0.384

Without both Subject 15 and Subject 18 the correlation is r=0.324

The correlation increases by 0.086 when Subject 15 is removed. Subject 15 lies far above the other points in the FPG (y) direction, away from the linear pattern, so it adds scatter and weakens the correlation. Without it, the remaining points follow a line more closely, although the association is still positive. The correlation drops by 0.098 when Subject 18 is removed. Because of Subject 18's extreme position on the HbA scale, and because its high FPG is consistent with the positive trend, this point stretches the pattern toward the upper right and strengthens the correlation. One outlier of this kind can be wholly responsible for a high correlation value that would otherwise be quite low, so major decisions should never be made solely on the basis of the correlation coefficient's value; examining the scatterplot is always recommended. Both Subject 15 and Subject 18 are influential for the correlation, since r changes noticeably when either one is removed, and it changes in opposite directions because one point works against the linear pattern while the other reinforces it.

Therefore,

The correlation r with all 18 subjects is r=0.482

The correlation r without subject 15 is r=0.568

The correlation r without subject 18 is r=0.384
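
The solution reports Minitab output only. As a rough cross-check, the same correlations can be recomputed in plain Python from the table in Part (a). This is a minimal sketch, not the solution's own method: the helper names `pearson_r` and `r_without` are hypothetical, and the data lists are assumed to be copied correctly from the table.

```python
from math import sqrt

# HbA (x) and FPG (y) for Subjects 1..18, copied from the table in Part (a).
hba = [6.1, 6.3, 6.4, 6.8, 7.0, 7.1, 7.5, 7.7, 7.9,
       8.7, 9.4, 10.4, 10.6, 10.7, 10.7, 11.2, 13.7, 19.3]
fpg = [141, 158, 112, 153, 134, 95, 96, 78, 148,
       172, 200, 271, 103, 172, 359, 145, 147, 255]


def pearson_r(xs, ys):
    """Pearson correlation from sums of products of deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def r_without(excluded):
    """Correlation after dropping the given 1-based subject numbers."""
    keep = [i for i in range(len(hba)) if (i + 1) not in excluded]
    return pearson_r([hba[i] for i in keep], [fpg[i] for i in keep])


for label, excluded in [("all 18 subjects", set()),
                        ("without Subject 15", {15}),
                        ("without Subject 18", {18}),
                        ("without Subjects 15 and 18", {15, 18})]:
    print(f"{label}: r = {r_without(excluded):.3f}")
```

Run as written, the four printed values should agree with the Minitab figures quoted in this step to three decimal places.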

05

Part (c) Step 1: Explanation

The least-squares lines for all 18 subjects, for all except Subject 15, and for all except Subject 18 are added to the scatterplot.

The influence of Subject 18 can be seen here. This point extends the linear pattern toward the upper right, so when it is removed the correlation decreases because the remaining points show a much weaker pattern. Subject 15, by contrast, lies far above the regression line and therefore has a very large residual. A least-squares line minimizes the sum of the squares of the vertical distances between the points and the line, so a point that is extreme in the x-direction and has no other points nearby pulls the line toward itself. Such points are known as influential points. Here Subject 18 lowers the slope of the line. Therefore, Subject 15 and Subject 18 are both influential.
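
To make the comparison of the three lines concrete, the sketch below fits each least-squares line in plain Python and also computes Subject 15's residual from the line fitted to all 18 subjects. It is an illustration under stated assumptions rather than the solution's Minitab output: the helper names `least_squares` and `fit_without` are hypothetical, and the data are assumed to match the table in Part (a).

```python
# HbA (x) and FPG (y) for Subjects 1..18, copied from the table in Part (a).
hba = [6.1, 6.3, 6.4, 6.8, 7.0, 7.1, 7.5, 7.7, 7.9,
       8.7, 9.4, 10.4, 10.6, 10.7, 10.7, 11.2, 13.7, 19.3]
fpg = [141, 158, 112, 153, 134, 95, 96, 78, 148,
       172, 200, 271, 103, 172, 359, 145, 147, 255]


def least_squares(xs, ys):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def fit_without(excluded):
    """Fit the line after dropping the given 1-based subject numbers."""
    keep = [i for i in range(len(hba)) if (i + 1) not in excluded]
    return least_squares([hba[i] for i in keep], [fpg[i] for i in keep])


for label, excluded in [("all 18 subjects", set()),
                        ("without Subject 15", {15}),
                        ("without Subject 18", {18})]:
    a, b = fit_without(excluded)
    print(f"{label}: predicted FPG = {a:.1f} + {b:.2f} * HbA")

# Residual of Subject 15 (index 14) from the line fitted to all 18 subjects.
a_all, b_all = fit_without(set())
residual_15 = fpg[14] - (a_all + b_all * hba[14])
print(f"Subject 15 residual: {residual_15:.1f}")
```

With these data the slope comes out at roughly 10.4 for all 18 subjects, roughly 8.9 without Subject 15, and roughly 12.1 without Subject 18, so including Subject 18 gives a flatter line than fitting without it.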


