The following table gives data on the mean number of seeds produced in a year by several common tree species and the mean weight (in milligrams) of the seeds produced. (Some species appear twice because their seeds were counted in two locations.) We might expect that trees with heavy seeds produce fewer of them, but what mathematical model best describes the relationship?

(a) Based on the scatterplot below, is a linear model appropriate to describe the relationship between seed count and seed weight? Explain.

(b) Two alternative models based on transforming the original data are proposed to predict the seed weight from the seed count. Graphs and computer output from a least-squares regression analysis on the transformed data are shown below.

Model A:

Model B:

Which model, A or B, is more appropriate for predicting seed weight from seed count? Justify your answer.

(c) Using the model you chose in part (b), predict the seed weight if the seed count is 3700.

(d) Interpret the value of $r^2$ for your model.

Short Answer


(a) No, a linear model is not appropriate to describe the relationship between seed count and seed weight.

(b) Model B is more appropriate: its scatterplot shows a more linear pattern and its residual plot shows no visible pattern.

(c) The predicted seed weight when the seed count is 3700 is about 19.7760 mg.

(d) The value of $r^2$ for the model is 86.3%.

Step by step solution

01

Part (a) step 1: Given Information

We need to decide whether a linear model is appropriate to describe the relationship between seed count and seed weight.

02

Part (a) step 2: Explanation

No. The scatterplot shows a strong curved pattern, so a linear model is not appropriate.

03

Part (b) step 1: Given Information

We are given graphs and computer output from a least-squares regression analysis on the transformed data for two models, A and B, and need to choose the more appropriate one.

04

Part (b) step 2: Explanation

Model B is more appropriate: its scatterplot shows a more linear pattern and its residual plot shows no visible pattern.

05

Part (c) step 1: Given Information

We need to find the seed weight if the seed count is 3700.

06

Part (c) step 2: Explanation

On the scatterplot, we see that the variable on the horizontal axis is "ln(count)" and the variable on the vertical axis is "ln(weight)", so the x-variable is ln(count) while the y-variable is ln(weight). The general regression equation is then:

$\ln(\text{weight}) = a + b\,\ln(\text{count})$

The constant is given in the row "Constant" and in the column "Coef" of the computer output of model B:

$a = 15.491$

The slope is given in the row "ln(count)" and in the column "Coef" of the computer output of model B:

$b = -1.5222$

Replacing $a$ with 15.491 and $b$ with -1.5222 in the general regression equation, we have:

$\ln(\text{weight}) = 15.491 - 1.5222\,\ln(\text{count})$

Replace "count" by 3700 and evaluate:

$\ln(\text{weight}) = 15.491 - 1.5222\,\ln(3700) \approx 2.9845$

Take the exponential of each side:

$\text{weight} = e^{\ln(\text{weight})} = e^{2.9845} \approx 19.7760$

Thus the predicted weight is about 19.7760 mg.
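The arithmetic in this step can be checked with a short Python sketch. The names `A`, `B`, and `predict_seed_weight` are illustrative and do not come from the computer output. The coefficient values are the ones quoted from the Model B output, taking the slope as -1.5222, which is the sign consistent with ln(weight) ≈ 2.9845.

```python
import math

# Coefficients quoted from the Model B output (variable names are illustrative).
A = 15.491    # intercept, from the "Constant" row
B = -1.5222   # slope, from the "ln(count)" row; negative, as assumed in the lead-in

def predict_seed_weight(count: float) -> float:
    """Predict seed weight (mg) using ln(weight) = A + B * ln(count)."""
    ln_weight = A + B * math.log(count)   # prediction on the log scale
    return math.exp(ln_weight)            # undo the natural log

ln_w = A + B * math.log(3700)
print(round(ln_w, 4))                       # about 2.9845
print(round(predict_seed_weight(3700), 2))  # about 19.78 mg
```

Because the regression was fit on the log scale, the last step must exponentiate. Reporting 2.9845 as the weight would be a common mistake.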

07

Part (d) step 1: Given Information

We need to interpret the value of $r^2$ for the chosen model.

08

Part (d) step 2: Explanation

About 86.3% of the variation in ln(seed weight) is accounted for by the linear model relating ln(seed weight) to ln(seed count).
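To make "variation accounted for" concrete, here is a minimal Python sketch of how $r^2$ is computed for a least-squares line, as 1 - SSE/SST. The seed data are not reproduced in this solution, so the four points in `toy_x` and `toy_y` are made-up values used only to illustrate the formula. The function names are also hypothetical.

```python
def fit_line(x, y):
    """Return least-squares intercept a and slope b for y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

def r_squared(x, y):
    """Fraction of the variation in y accounted for by the least-squares line."""
    a, b = fit_line(x, y)
    mean_y = sum(y) / len(y)
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))  # residual variation
    sst = sum((yi - mean_y) ** 2 for yi in y)                    # total variation
    return 1 - sse / sst

# Made-up illustration data, NOT the seed measurements.
toy_x = [0.0, 1.0, 2.0, 3.0]
toy_y = [1.0, 3.0, 2.0, 4.0]
toy_r2 = r_squared(toy_x, toy_y)
print(round(toy_r2, 2))  # 0.64 for these toy points
```

For Model B, x would be ln(seed count) and y would be ln(seed weight), and the same calculation gives the reported 86.3%.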


