Question: Bus Rapid Transit study. Bus Rapid Transit (BRT) is a rapidly growing trend in the provision of public transportation in America. The Center for Urban Transportation Research (CUTR) at the University of South Florida conducted a survey of BRT customers in Miami (Transportation Research Board Annual Meeting, January 2003). Data on the following variables (all measured on a 5-point scale, where 1 = very unsatisfied and 5 = very satisfied) were collected for a sample of over 500 bus riders: overall satisfaction with BRT (y), safety on bus (x1), seat availability (x2), dependability (x3), travel time (x4), cost (x5), information/maps (x6), convenience of routes (x7), traffic signals (x8), safety at bus stops (x9), hours of service (x10), and frequency of service (x11). CUTR analysts used stepwise regression to model overall satisfaction (y).

a. How many models are fit at step 1 of the stepwise regression?

b. How many models are fit at step 2 of the stepwise regression?

c. How many models are fit at step 11 of the stepwise regression?

d. The stepwise regression selected the following eight variables to include in the model (in order of selection): x11, x4, x2, x7, x10, x1, x9, and x3. Write the equation for E(y) that results from stepwise regression.

e. The model, part d, resulted in R2 = 0.677. Interpret this value.

f. Explain why the CUTR analysts should be cautious in concluding that the best model for E(y) has been found.

Short Answer

Expert verified

Answer

a. 11 1-variable models are fitted.

b. 10 2-variable models are fitted.

c. 1 11-variable model is fitted.

d. Through the stepwise regression, 8 variables are selected. The equation for E(y) can be written as E(y)=β0+β1x1+β2x2+β3x3+β4x4+β5x5+β7x7+β8x8.

e. The value of R2 is 0.677 indicating that approximately 67% of the variation in the regression is explained by the model. 67% is a high value indicating that the model is a good fit for the data.

f. Precautions while using stepwise model - First, an extremely large number of t-tests have been conducted, leading to a high probability of making one or more Type I or Type II errors. Second, the stepwise model does not include any higher-order or interaction terms.

Step by step solution

01

1-variable model

Since there are 11 independent variables, k no of models are 1-variable models are fitted in step 1 of stepwise regression.

So, 11 1-variable models are fitted.

02

2-variable model

Since there are 11 independent variables, (k-1) no of models are 2-variable models are fitted in step 2 of stepwise regression.

Thus, 10 2-variable models are fitted.

03

11-variable model

Since there are 11 independent variables, (k-10) no of models are 3-variable models are fitted in step 3 of stepwise regression.

Hence, 1 11-variable model is fitted.

04

Equation of the stepwise model

Through the stepwise regression, 8 variables are selected. The equation for E(y) can be written as E(y)=β0+β1x1+β2x2+β3x3+β4x4+β5x5+β7x7+β8x8.

05

Interpretation of R2

The value of R2 is 0.677 indicating that approximately 67% of the variation in the regression is explained by the model. Higher the value of R2, the better fit the model is to the data. 67% is a high value indicating that the model is a good fit for the data.

06

Precautions while using stepwise model  

Precautions while using stepwise model -

First, an extremely large number of t-tests have been conducted, leading to a high probability of making one or more Type I or Type II errors. Second, the stepwise model does not include any higher-order or interaction terms.

Unlock Step-by-Step Solutions & Ace Your Exams!

  • Full Textbook Solutions

    Get detailed explanations and key concepts

  • Unlimited Al creation

    Al flashcards, explanations, exams and more...

  • Ads-free access

    To over 500 millions flashcards

  • Money-back guarantee

    We refund you if you fail your exam.

Over 30 million students worldwide already upgrade their learning with Vaia!

One App. One Place for Learning.

All the tools & learning materials you need for study success - in one app.

Get started for free

Most popular questions from this chapter

To model the relationship between y, a dependent variable, and x, an independent variable, a researcher has taken one measurement on y at each of three different x-values. Drawing on his mathematical expertise, the researcher realizes that he can fit the second-order model Ey=β0+β1x+β2x2 and it will pass exactly through all three points, yielding SSE = 0. The researcher, delighted with the excellent fit of the model, eagerly sets out to use it to make inferences. What problems will he encounter in attempting to make inferences?

Question: Workplace bullying and intention to leave. Workplace bullying has been shown to have a negative psychological effect on victims, often leading the victim to quit or resign. In Human Resource Management Journal (October 2008), researchers employed multiple regression to examine whether perceived organizational support (POS) would moderate the relationship between workplace bullying and victims’ intention to leave the firm. The dependent variable in the analysis, intention to leave (y), was measured on a quantitative scale. The two key independent variables in the study were bullying (, measured on a quantitative scale) and perceived organizational support (measured qualitatively as “low,” “neutral,” or “high”).

  1. Set up the dummy variables required to represent POS in the regression model.
  2. Write a model for E(y) as a function of bullying and POS that hypothesizes three parallel straight lines, one for each level of POS.
  3. Write a model for E(y) as a function of bullying and POS that hypothesizes three non-parallel straight lines, one for each level of POS.
  4. The researchers discovered that the effect of bullying on intention to leave was greater at the low level of POS than at the high level of POS. Which of the two models, parts b and c, support these findings?

Question: Predicting elements in aluminum alloys. Aluminum scraps that are recycled into alloys are classified into three categories: soft-drink cans, pots and pans, and automobile crank chambers. A study of how these three materials affect the metal elements present in aluminum alloys was published in Advances in Applied Physics (Vol. 1, 2013). Data on 126 production runs at an aluminum plant were used to model the percentage (y) of various elements (e.g., silver, boron, iron) that make up the aluminum alloy. Three independent variables were used in the model: x1 = proportion of aluminum scraps from cans, x2 = proportion of aluminum scraps from pots/pans, and x3 = proportion of aluminum scraps from crank chambers. The first-order model, , was fit to the data for several elements. The estimates of the model parameters (p-values in parentheses) for silver and iron are shown in the accompanying table.

(A) Is the overall model statistically useful (at α = .05) for predicting the percentage of silver in the alloy? If so, give a practical interpretation of R2.

(b)Is the overall model statistically useful (at a = .05) for predicting the percentage of iron in the alloy? If so, give a practical interpretation of R2.

(c)Based on the parameter estimates, sketch the relationship between percentage of silver (y) and proportion of aluminum scraps from cans (x1). Conduct a test to determine if this relationship is statistically significant at α = .05.

(d)Based on the parameter estimates, sketch the relationship between percentage of iron (y) and proportion of aluminum scraps from cans (x1). Conduct a test to determine if this relationship is statistically significant at α = .05.

Question: Company donations to charity. The amount a company donates to a charitable organization is often restricted by financial inflexibility at the firm. One measure of financial inflexibility is the ratio of restricted assets to total firm assets. A study published in the Journal of Management Accounting Research (Vol. 27, 2015) investigated the link between donation amount and this ratio. Data were collected on donations to 115,333 charities over a recent 10-year period, resulting in a sample of 419,225 firm-years. The researchers fit the quadratic model,E(y)=β0+β1x+β2x2, where y = natural logarithm of total donations to charity by a firm in a year and x = ratio of restricted assets to the firm’s total assets in the previous year. [Note: This model is a simplified version of the actual model fit by the researchers.]

  1. The researchers’ theory is that as a firm’s restricted assets increase, donations will initially increase. However, there is a point at which donations will not only diminish, but also decline as restricted assets increase. How should the researchers use the model to test this theory?
  2. The results of the multiple regression are shown in the table below. Use this information to test the researchers’ theory at. What do you conclude?

Suppose the mean value E(y) of a response y is related to the quantitative independent variables x1and x2

E(y)=2+x1-3x2-x1x2

a) Identify and interpret the slope forx2

b) Plot the linear relationship between E(y) andx2for role="math" localid="1649796003444" x1=0,1,2, whererole="math" localid="1649796025582" 1x23

c) How would you interpret the estimated slopes?

d) Use the lines you plotted in part b to determine the changes in E(y) for eachrole="math" localid="1649796051071" x1=0,1,2.

e) Use your graph from part b to determine how much E(y) changes whenrole="math" localid="1649796075921" 3x15androle="math" localid="1649796084395" 1x23.

See all solutions

Recommended explanations on Math Textbooks

View all explanations

What do you think about this solution?

We value your feedback to improve our textbook solutions.

Study anywhere. Anytime. Across all devices.

Sign-up for free