Question: Forecasting daily admission of a water park (cont’d). Refer to Exercise 12.165. The owners of the water adventure park are advised that the prediction model could probably be improved if interaction terms were added. In particular, it is thought that the rate at which mean attendance increases as predicted high temperature increases will be greater on weekends than on weekdays.

The following model is therefore proposed:

E(y) = β0 + β1x1 + β2x2 + β3x3 + β4x1x3

The same 30 days of data used in Exercise 12.165 are again used to obtain the least squares model, ŷ = 250 - 700x1 + 100x2 + 5x3 + 15x1x3, with sβ̂4 = 3 and R² = 0.96.

a. Graph the predicted day’s attendance, ŷ, against the day’s predicted high temperature, x3, for a sunny weekday and for a sunny weekend day. Plot both lines on the same graph over a common range of x3. Note the increase in slope for the weekend day. Interpret this.

b. Do the data indicate that the interaction term is a useful addition to the model? Use α = .05.

c. Use this model to predict the attendance for a sunny weekday with a predicted high temperature of 95°F.

d. Suppose the 90% prediction interval for part c is (800, 850). Compare this result with the prediction interval for the model without interaction in Exercise 12.165, part e. Do the relative widths of the confidence intervals support or refute your conclusion about the utility of the interaction term (part b)?

e. The owners, noting that the coefficient β̂1 = -700, conclude the model is ridiculous because it seems to imply that the mean attendance will be 700 less on weekends than on weekdays. Explain why this is not the case.

Short Answer

Answer

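a. For a sunny weekday the predicted line is ŷ = 350 + 5x3; for a sunny weekend day it is ŷ = -350 + 20x3. Predicted attendance rises by 5 per 1°F on weekdays and by 20 per 1°F on weekend days.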

b. Yes. Testing H0: β4 = 0 against Ha: β4 > 0 gives t = 15/3 = 5, which exceeds the critical value t.05 = 1.708 (25 df). At α = .05 there is sufficient evidence that the interaction term is a useful addition to the model.

c. The predicted attendance on a sunny weekday with a predicted high temperature of 95°F is 825.

d. The 90% prediction interval (800, 850) is centered on the part c prediction of 825 and is only 50 wide. If the interval from the model without interaction (Exercise 12.165, part e) is wider, the comparison supports the part b conclusion that the interaction term is useful.

e. Because the model contains the interaction term x1x3, β̂1 = -700 is not the weekend-versus-weekday difference. For a given x2 and x3, that difference is -700 + 15x3, which is positive whenever x3 exceeds about 46.7°F; at 95°F, for example, predicted weekend attendance is 725 higher than weekday attendance.

Step by step solution

01

Given Information

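The least squares prediction equation is:

ŷ = 250 - 700x1 + 100x2 + 5x3 + 15x1x3

Consistent with the question, x1 = 1 for a weekend day (0 for a weekday), x2 = 1 for a sunny day (0 otherwise), and x3 is the predicted daily high temperature in °F. The question also gives sβ̂4 = 3, R² = 0.96, and n = 30.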

02

Graph

a.

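Set x2 = 1 (sunny) and write the prediction equation as a function of x3 for each type of day.

Sunny weekday (x1 = 0): ŷ = 250 + 100 + 5x3 = 350 + 5x3

Sunny weekend day (x1 = 1): ŷ = 250 - 700 + 100 + 5x3 + 15x3 = -350 + 20x3

Plotting both lines against x3 on the same axes shows a slope of 5 for the weekday line and 20 for the weekend line. Each 1°F increase in the predicted high temperature raises predicted attendance by 5 on a sunny weekday and by 20 on a sunny weekend day, so attendance responds more strongly to temperature on weekends.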
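The two lines can be checked numerically before plotting. The following is a minimal sketch, assuming x1 = 1 marks a weekend day and x2 = 1 marks a sunny day; the function name and the sample temperatures are illustrative choices, not values from the exercise.

```python
# Coefficients of the fitted model given in the exercise:
# y_hat = 250 - 700*x1 + 100*x2 + 5*x3 + 15*x1*x3
B0, B1, B2, B3, B4 = 250, -700, 100, 5, 15


def line_for_day(weekend, sunny):
    """Return (intercept, slope) of y_hat as a function of temperature x3."""
    intercept = B0 + B1 * weekend + B2 * sunny
    slope = B3 + B4 * weekend
    return intercept, slope


weekday_line = line_for_day(weekend=0, sunny=1)
weekend_line = line_for_day(weekend=1, sunny=1)

# Illustrative temperatures only (assumed; not taken from the exercise).
for temp in (70, 80, 90, 100):
    weekday_pred = weekday_line[0] + weekday_line[1] * temp
    weekend_pred = weekend_line[0] + weekend_line[1] * temp
    print(f"{temp} F: weekday {weekday_pred}, weekend {weekend_pred}")
```

The pairs returned by line_for_day can be passed to any plotting tool to draw the two lines on one set of axes.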

03

Interaction term

b.

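A high R² (0.96) shows that the model as a whole fits well, but it does not by itself show that the interaction term contributes. That requires a test of β4.

H0: β4 = 0 versus Ha: β4 > 0 (the slope for temperature is greater on weekends)

Test statistic: t = β̂4 / sβ̂4 = 15/3 = 5

With n - (k + 1) = 30 - 5 = 25 degrees of freedom, the rejection region at α = .05 is t > t.05 = 1.708. Since 5 > 1.708, reject H0. There is sufficient evidence that the interaction term is a useful addition to the model.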
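The usefulness of a single term is usually judged with a t-test on its coefficient rather than from R² alone. Here is a minimal sketch of that test for β4, assuming a one-tailed alternative (β4 > 0) and a critical value read from a standard t table rather than computed.

```python
beta4_hat = 15  # estimated interaction coefficient (from the exercise)
se_beta4 = 3    # its standard error (from the exercise)
n, k = 30, 4    # 30 days of data; 4 terms besides the intercept

t_stat = beta4_hat / se_beta4
df = n - (k + 1)

# Upper-tail critical value t_.05 for 25 df, taken from a standard t table.
# Hard-coded here as an assumption of this sketch.
T_CRIT_05_DF25 = 1.708

reject_h0 = t_stat > T_CRIT_05_DF25
print(f"t = {t_stat}, df = {df}, reject H0: {reject_h0}")
```

A two-tailed test would use t.025 = 2.060 with 25 df and leads to the same decision here.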

04

Prediction value

c.

The regression equation is ŷ = 250 - 700x1 + 100x2 + 5x3 + 15x1x3. For a sunny weekday with a predicted high of 95°F, x1 = 0, x2 = 1 and x3 = 95.

ŷ = 250 - 700(0) + 100(1) + 5(95) + 15(0)(95) = 250 + 100 + 475 = 825

Therefore, the predicted attendance on a sunny weekday with a predicted high temperature of 95°F is 825.
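The substitution can be reproduced with a small helper. This is a sketch only: the function name is hypothetical, and it assumes a weekday is coded x1 = 0 and a sunny day x2 = 1.

```python
def predict_attendance(weekend, sunny, temp_f):
    """Evaluate y_hat = 250 - 700*x1 + 100*x2 + 5*x3 + 15*x1*x3."""
    return (250 - 700 * weekend + 100 * sunny
            + 5 * temp_f + 15 * weekend * temp_f)


# Sunny weekday with a predicted high of 95 F: x1 = 0, x2 = 1, x3 = 95.
sunny_weekday_95 = predict_attendance(weekend=0, sunny=1, temp_f=95)
print(sunny_weekday_95)
```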

05

Prediction interval 

d.

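The 90% prediction interval (800, 850) means we are 90% confident that actual attendance on a sunny weekday with a predicted high of 95°F will fall between 800 and 850. The interval is centered on the part c prediction of 825 and has a width of only 50. A narrower prediction interval indicates more precise predictions, so if the interval from the model without interaction (Exercise 12.165, part e) is wider, the comparison supports the conclusion in part b that the interaction term is useful.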

06

Implication of coefficient of x1

e.

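β̂1 = -700 is not the difference between mean weekend and weekday attendance, because x1 also appears in the interaction term x1x3. Holding x2 and x3 fixed, the predicted weekend-minus-weekday difference is

-700 + 15x3

so -700 would be the difference only at x3 = 0°F, far below any temperature at which a water park would be open. The difference is positive whenever x3 > 700/15 ≈ 46.7°F. At 95°F, for example, it is -700 + 15(95) = 725, so predicted attendance is higher on weekends, not lower.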
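Because x1 appears both alone and in the x1x3 term, its effect depends on temperature. The sketch below computes the predicted weekend-minus-weekday difference, assuming x1 = 1 marks a weekend day; the function name and the sample temperatures are illustrative.

```python
def weekend_effect(temp_f):
    """Predicted weekend-minus-weekday attendance at a given temperature.

    Setting x1 = 1 instead of 0 changes y_hat by -700 + 15*x3.
    """
    return -700 + 15 * temp_f


# Temperature above which the weekend effect is positive.
break_even_temp = 700 / 15

# Illustrative temperatures only (assumed; not taken from the exercise).
for temp in (70, 95):
    print(f"{temp} F: weekend effect {weekend_effect(temp)}")
print(f"break-even at about {break_even_temp:.1f} F")
```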

Most popular questions from this chapter

Question: Adverse effects of hot-water runoff. The Environmental Protection Agency (EPA) wants to determine whether the hot-water runoff from a particular power plant located near a large gulf is having an adverse effect on the marine life in the area. The goal is to acquire a prediction equation for the number of marine animals located at certain designated areas, or stations, in the gulf. Based on past experience, the EPA considered the following environmental factors as predictors for the number of animals at a particular station:

x1 = Temperature of water (TEMP)

x2 = Salinity of water (SAL)

x3 = Dissolved oxygen content of water (DO)

x4 = Turbidity index, a measure of the turbidity of the water (TI)

x5 = Depth of the water at the station (ST_DEPTH)

x6 = Total weight of sea grasses in sampled area (TGRSWT)

As a preliminary step in the construction of this model, the EPA used a stepwise regression procedure to identify the most important of these six variables. A total of 716 samples were taken at different stations in the gulf, producing the SPSS printout shown below. (The response measured was y, the logarithm of the number of marine animals found in the sampled area.)

a. According to the SPSS printout, which of the six independent variables should be used in the model? (Use α = .10.)

b. Are we able to assume that the EPA has identified all the important independent variables for the prediction of y? Why?

c. Using the variables identified in part a, write the first-order model with interaction that may be used to predict y.

d. How would the EPA determine whether the model specified in part c is better than the first-order model?

e. Note the small value of R². What action might the EPA take to improve the model?

Consider relating E(y) to two quantitative independent variables x1 and x2.

  1. Write a first-order model for E(y).

  2. Write a complete second-order model for E(y).

Question: Suppose the mean value E(y) of a response y is related to the quantitative independent variables x1 and x2 by

E(y) = 2 + x1 - 3x2 - x1x2

a. Identify and interpret the slope for x2.

b. Plot the linear relationship between E(y) and x2 for x1 = 0, 1, 2.

c. How would you interpret the estimated slopes?

d. Use the lines you plotted in part b to determine the changes in E(y) for each x1 = 0, 1, 2.

e. Use your graph from part b to determine how much E(y) changes when 3 ≤ x1 ≤ 5 and 1 ≤ x2 ≤ 3.

Question: Tipping behaviour in restaurants. Can food servers increase their tips by complimenting the customers they are waiting on? To answer this question, researchers collected data on the customer tipping behaviour for a sample of 348 dining parties and reported their findings in the Journal of Applied Social Psychology (Vol. 40, 2010). Tip size (y, measured as a percentage of the total food bill) was modelled as a function of size of the dining party (x1) and whether or not the server complimented the customers’ choice of menu items (x2). One theory states that the effect of the size of the dining party on tip size is independent of whether or not the server compliments the customers’ menu choices. A second theory hypothesizes that the effect of size of the dining party on tip size is greater when the server compliments the customers’ menu choices as opposed to when the server refrains from complimenting menu choices.

a. Write a model for E(y) as a function of x1 and x2 that corresponds to Theory 1.

b. Write a model for E(y) as a function of x1 and x2 that corresponds to Theory 2.

c. The researchers summarized the results of their analysis with the following graph. Based on the graph, which of the two models would you expect to fit the data better? Explain.

Question: Bordeaux wine sold at auction. The uncertainty of the weather during the growing season, the phenomenon that wine tastes better with age, and the fact that some vineyards produce better wines than others encourage speculation concerning the value of a case of wine produced by a certain vineyard during a certain year (or vintage). The publishers of a newsletter titled Liquid Assets: The International Guide to Fine Wine discussed a multiple regression approach to predicting the London auction price of red Bordeaux wine. The natural logarithm of the price y (in dollars) of a case containing a dozen bottles of red wine was modelled as a function of weather during growing season and age of vintage. Consider the multiple regression results for hypothetical data collected for 30 vintages (years) shown below.

  1. Conduct a t-test (at α = 0.05) for each of the β parameters in the model. Interpret the results.
  2. When the natural log of y is used as a dependent variable, the antilogarithm of a b coefficient minus 1 (that is, e^(bi) - 1) represents the percentage change in y for every 1-unit increase in the associated x-value. Use this information to interpret each of the b estimates.
  3. Interpret the values of R² and s. Do you recommend using the model for predicting Bordeaux wine prices? Explain.
