T12.12 Foresters are interested in predicting the amount of usable lumber they can harvest from various tree species. They collect data on the diameter at breast height (DBH) in inches and the yield in board feet of a random sample of 20 Ponderosa pine trees that have been harvested. (Note that a board foot is defined as a piece of lumber 12 inches by 12 inches by 1 inch.) Here is a scatterplot of the data.

a. Here is some computer output and a residual plot from a least-squares regression on these data. Explain why a linear model may not be appropriate in this case.

The foresters are considering two possible transformations of the original data: (1) cubing the diameter values or (2) taking the natural logarithm of the yield measurements. After transforming the data, a least-squares regression analysis is performed. Here is some computer output and a residual plot for each of the two possible regression models:

b. Use both models to predict the amount of usable lumber from a Ponderosa pine with diameter 30 inches.
c. Which of the predictions in part (b) seems more reliable? Give appropriate evidence to support your choice.

Short Answer

(a) The residual plot shows substantial curvature, so a linear model is not appropriate: the relationship between the variables is curved.

(b) The predicted yield is 117.0899 board feet for option 1 and 102.967 board feet for option 2.

(c) The prediction from option 1 is more reliable, because its residual plot shows no strong curvature.

Step by step solution

01

Part (a) Step 1: Given information

We need to explain why a linear model may not be appropriate in this case.

02

Part (a) Step 2: Explanation

Foresters want to know how much usable lumber they can harvest from different tree species.
They measured the diameter at breast height (in inches) and the yield (in board feet) for a random sample of Ponderosa pine trees.
The question provides computer output and a residual plot from a least-squares regression on these data.
The residual plot shows substantial curvature, which indicates that the relationship between the variables is curved, so a linear model is not appropriate.
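To illustrate what curvature in a residual plot looks like, here is a minimal Python sketch. The diameters and yields in it are made up (generated from an exact cubic) and are not the foresters' data; the helper names `fit_line` and `residuals` are hypothetical.

```python
# Hypothetical illustration only: these are NOT the foresters' data.
# Yields are generated from an exact cubic in diameter so the curvature is obvious.

def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line for ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def residuals(xs, ys):
    """Return observed minus fitted values for the least-squares line."""
    b0, b1 = fit_line(xs, ys)
    return [y - (b0 + b1 * x) for x, y in zip(xs, ys)]


diameters = [10, 15, 20, 25, 30, 35, 40]        # made-up DBH values (inches)
yields = [0.004 * d ** 3 for d in diameters]    # made-up yields (board feet)

res = residuals(diameters, yields)
# Print the sign of each residual in order of increasing diameter.
print(["+" if r > 0 else "-" for r in res])
```

For these made-up values, the residuals are positive at both ends of the diameter range and negative in the middle. That U-shaped pattern is the kind of curvature that signals a linear model is not appropriate.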

03

Part (b) Step 1: Given information

We need to use both models to predict the amount of usable lumber from a Ponderosa pine with a diameter of 30 inches.

04

Part (b) Step 2: Explanation

Foresters want to know how much usable lumber they can harvest from different tree species. They measured the diameter at breast height (in inches) and the yield (in board feet) for a random sample of Ponderosa pine trees. The foresters are considering two transformations of the original data: cubing the diameter values or taking the natural logarithm of the yield measurements. The question provides computer output and a residual plot for each.
For option 1, the explanatory variable is the cube of the diameter, so the least-squares regression line has the form:
$\hat{y}=b_{0}+b_{1}x^{3}$
where $x$ is the DBH in inches and $\hat{y}$ is the predicted yield in board feet.
The value of $b_{0}$ is given in the computer output in row "Constant" and column "Coef":
$b_{0}=2.078$
The value of $b_{1}$ is given in the computer output in row "DBH3" and column "Coef":
$b_{1}=0.0042597$

Substituting $b_{0}=2.078$ and $b_{1}=0.0042597$ into the general equation gives:

$\hat{y}=2.078+0.0042597x^{3}$

Substitute $x=30$:

$\hat{y}=2.078+0.0042597(30)^{3}$

$=2.078+0.0042597(27000)$

$=117.0899$

As a result, the predicted yield is 117.0899 board feet.
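The arithmetic for option 1 can be checked with a short Python sketch. The coefficients are the ones quoted from the computer output; the constant and function names are hypothetical, chosen only for this illustration.

```python
# Coefficients read from the "Coef" column of the computer output (option 1).
B0_CUBIC = 2.078        # row "Constant"
B1_CUBIC = 0.0042597    # row "DBH3"


def predict_yield_cubic(dbh_inches):
    """Predicted yield (board feet) from yield-hat = b0 + b1 * DBH^3."""
    return B0_CUBIC + B1_CUBIC * dbh_inches ** 3


# Prediction for a tree with a diameter of 30 inches.
print(round(predict_yield_cubic(30), 4))
```

Rounded to four decimal places, this reproduces the 117.0899 board feet found above.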

05

Part (b) Step 3: Explanation

For option 2, the response variable is the natural logarithm of the yield, so the least-squares regression line has the form:
$\widehat{\ln y}=b_{0}+b_{1}x$
where $x$ is the DBH in inches.
The value of $b_{0}$ is given in the computer output in row "Constant" and column "Coef":
$b_{0}=1.2319$

The value of $b_{1}$ is given in the computer output in row "DBH" and column "Coef":
$b_{1}=0.113417$
Substituting $b_{0}=1.2319$ and $b_{1}=0.113417$ into the general equation gives:
$\widehat{\ln y}=1.2319+0.113417x$
Substitute $x=30$:
$\widehat{\ln y}=1.2319+0.113417(30)$
$=4.63441$
Exponentiate both sides to return to the original units:
$\hat{y}=e^{\widehat{\ln y}}$
$=e^{4.63441}$
$\approx 102.967$

As a result, the predicted yield is 102.967 board feet.
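The option 2 calculation, including the back-transformation from the log scale, can be sketched in Python as follows. The coefficients are the ones quoted from the computer output; the constant and function names are hypothetical.

```python
import math

# Coefficients read from the "Coef" column of the computer output (option 2).
B0_LOG = 1.2319      # row "Constant"
B1_LOG = 0.113417    # row "DBH"


def predict_yield_log(dbh_inches):
    """Predicted yield (board feet) from ln(yield)-hat = b0 + b1 * DBH."""
    ln_yield = B0_LOG + B1_LOG * dbh_inches   # prediction on the log scale
    return math.exp(ln_yield)                 # undo the natural logarithm


# Prediction for a tree with a diameter of 30 inches.
print(round(predict_yield_log(30), 3))
```

The key step is the final `math.exp`: the model predicts the logarithm of the yield, so the prediction must be exponentiated before it is reported in board feet.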

06

Part (c) Step 1: Given information

We need to decide which of the predictions in part (b) seems more reliable and support the choice with appropriate evidence.

07

Part (c) Step 2: Explanation

Foresters want to know how much usable lumber they can harvest from different tree species. They measured the diameter at breast height (in inches) and the yield (in board feet) for a random sample of Ponderosa pine trees.
The question provides computer output and a residual plot for each of two transformations of the original data: cubing the diameter values or taking the natural logarithm of the yield measurements.
The residual plot for option 1 shows no strong curvature, whereas the residual plot for option 2 shows strong curvature.
So the model in option 1 is appropriate for making predictions, but the model in option 2 is not.
Therefore, the prediction from option 1 is expected to be more reliable.

Most popular questions from this chapter

The town council wants to estimate the proportion of all adults in their medium-sized town who favor a tax increase to support the local school system. Which of the following sampling plans is most appropriate for estimating this proportion?

a. A random sample of 250 names from the local phone book

b. A random sample of 200 parents whose children attend one of the local schools

c. A sample consisting of 500 people from the city who take an online survey about the issue

d. A random sample of 300 homeowners in the town

e. A random sample of 100 people from an alphabetical list of all adults who live in the town

Exercises T12.4–T12.8 refer to the following setting. An old saying in golf is “You drive for show and you putt for dough.” The point is that good putting is more important than long driving for shooting low scores and hence winning money. To see if this is the case, data from a random sample of 69 of the nearly 1000 players on the PGA Tour’s world money list are examined. The average number of putts per hole (fewer is better) and the player’s total winnings for the previous season are recorded and a least-squares regression line was fitted to the data. Assume the conditions for inference about the slope are met. Here is computer output from the regression analysis:

T12.6 The P-value for the test in Exercise T12.5 is 0.0087. Which of the following is a correct interpretation of this result?
a. The probability there is no linear relationship between average number of putts per hole and total winnings for these 69 players is 0.0087.
b. The probability there is no linear relationship between average number of putts per hole and total winnings for all players on the PGA Tour’s world money list is 0.0087.
c. If there is no linear relationship between average number of putts per hole and total winnings for the players in the sample, the probability of getting a random sample of 69 players that yields a least-squares regression line with a slope of −4,139,198 or less is 0.0087.
d. If there is no linear relationship between average number of putts per hole and total winnings for the players on the PGA Tour’s world money list, the probability of getting a random sample of 69 players that yields a least-squares regression line with a slope of −4,139,198 or less is 0.0087.
e. The probability of making a Type I error is 0.0087.

A scatterplot of $y$ versus $x$ shows a positive, nonlinear association. Two different transformations are attempted to try to linearize the association: using the logarithm of the y-values and using the square root of the y-values. Two least-squares regression lines are calculated, one that uses $x$ to predict $\log(y)$ and the other that uses $x$ to predict $\sqrt{y}$. Which of the following would be the best reason to prefer the least-squares regression line that uses $x$ to predict $\log(y)$?

a. The value of $r^{2}$ is smaller.

b. The standard deviation of the residuals is smaller.

c. The slope is greater.

d. The residual plot has more random scatter.

e. The distribution of residuals is more Normal.

Do taller students require fewer steps to walk a fixed distance? The scatterplot shows the relationship between x=height (in inches) and y=number of steps required to walk the length of a school hallway for a random sample of 36 students at a high school.

A least-squares regression analysis was performed on the data. Here is some computer output from the analysis:

Long legs Do these data provide convincing evidence at the α=0.05 level that taller students at this school require fewer steps to walk a fixed distance? Assume that the conditions for inference are met.

Could mud wrestling be the cause of a rash contracted by University of Washington students? Two physicians at the university’s student health center wondered about this when one male and six female students complained of rashes after participating in a mud-wrestling event. Questionnaires were sent to a random sample of students who participated in the event. The results, by gender, are summarized in the following table.

Here is some computer output for the preceding table. The output includes the observed counts, the expected counts, and the chi-square statistic.

The cell that contributes most to the chi-square statistic is

a. men who developed a rash.

b. men who did not develop a rash.

c. women who developed a rash.

d. women who did not develop a rash.

e. both (a) and (d).
