Question: Yield strength of steel alloy. Industrial engineers at the University of Florida used regression modelling as a tool to reduce the time and cost associated with developing new metallic alloys (Modelling and Simulation in Materials Science and Engineering, Vol. 13, 2005). To illustrate, the engineers built a regression model for the tensile yield strength (y) of a new steel alloy. The potentially important predictors of yield strength are listed in the accompanying table.

a. The engineers discovered that the variable Nickel (x4) was highly correlated with the other potential independent variables. Consequently, Nickel was dropped from the model. Do you agree with this decision? Explain.

b. The engineers used stepwise regression on the remaining 10 potential independent variables in order to search for a parsimonious set of predictor variables. Do you agree with this decision? Explain.

c. The stepwise regression selected the following independent variables: x1 = Carbon, x2 = Manganese, x3 = Chromium, x5 = Molybdenum, x6 = Copper, x8 = Vanadium, x9 = Plate thickness, x10 = Solution treating, and x11 = Aging temperature. All these variables were statistically significant in the step-wise model, with R2 = .94. Consequently, the engineers used the estimated stepwise model to predict yield strength. Do you agree with this decision? Explain.

Short Answer

Answer

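a. Yes. Because Nickel (x4) was highly correlated with the other potential independent variables, keeping it in the model would introduce multicollinearity. Multicollinearity does not bias the least-squares estimates, but it inflates their standard errors, which makes the individual coefficients unstable and their t-tests unreliable. Dropping Nickel is therefore a reasonable decision.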

b. Yes. With 10 candidate independent variables, stepwise regression is a reasonable way to screen for a smaller set of predictors that contribute to explaining yield strength. It is best treated as a screening step rather than as the final model-building procedure.

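c. The stepwise regression selected the following independent variables: x1 = Carbon, x2 = Manganese, x3 = Chromium, x5 = Molybdenum, x6 = Copper, x8 = Vanadium, x9 = Plate thickness, x10 = Solution treating, and x11 = Aging temperature. An R2 of 0.94 means that 94% of the sample variation in yield strength is explained by the model, which indicates a good fit to the sample data. A high R2 alone does not make the stepwise model ready for prediction, however. Stepwise regression performs many t-tests, which raises the chance of including or excluding variables in error, and it considers no interaction or higher-order terms. The stepwise result is best treated as a screened list of predictors, to be refined and checked before the model is used to predict yield strength.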

Step by step solution

01

Given information

The variable Nickel (x4) was highly correlated with the other potential independent variables, so it was dropped from the model. Stepwise regression was then applied to the remaining 10 potential independent variables.

02

Multicollinearity

a.

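Yes. Because Nickel (x4) was highly correlated with the other potential independent variables, keeping it in the model would introduce multicollinearity. Multicollinearity does not bias the least-squares estimates, but it inflates their standard errors, which makes the individual coefficients unstable and their t-tests unreliable. Dropping Nickel is therefore a reasonable decision.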
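One common way to quantify how strongly a predictor is tied to the others is the variance inflation factor, VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing predictor j on the remaining predictors. The sketch below is only an illustration on synthetic data. The variables named carbon, manganese, and nickel are made-up stand-ins, not the alloy data from the study, and the helper names are hypothetical.

```python
import numpy as np

def r_squared(X, y):
    """R^2 from an ordinary least-squares fit of y on X (intercept added here)."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot

def vif(X):
    """Variance inflation factor of each column: 1 / (1 - R_j^2)."""
    out = []
    for j in range(X.shape[1]):
        others = np.delete(X, j, axis=1)  # all predictors except column j
        out.append(1.0 / (1.0 - r_squared(others, X[:, j])))
    return out

# Synthetic illustration only: NOT the alloy data from the study.
rng = np.random.default_rng(0)
n = 200
carbon = rng.normal(size=n)
manganese = rng.normal(size=n)
# Hypothetical "nickel" built to be nearly a linear combination of the others.
nickel = 0.7 * carbon + 0.7 * manganese + 0.1 * rng.normal(size=n)

X_all = np.column_stack([carbon, manganese, nickel])
vif_with = vif(X_all)            # all three predictors
vif_without = vif(X_all[:, :2])  # nickel dropped

print("VIF with nickel:   ", np.round(vif_with, 1))
print("VIF without nickel:", np.round(vif_without, 1))
```

In this constructed example the VIFs are large while the near-redundant predictor is present and fall close to 1 once it is removed, which is the pattern that motivates dropping such a variable.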

03

Significance of stepwise regression

b.

Yes. With 10 candidate independent variables, stepwise regression is a reasonable way to screen for a smaller set of predictors that contribute to explaining yield strength. It is best treated as a screening step rather than as the final model-building procedure.
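To make the screening idea concrete, here is a minimal sketch of forward selection, a simplified relative of stepwise regression. It is an assumption-laden illustration: it only adds variables (full stepwise also re-tests and can remove variables already entered), it uses a fixed F-to-enter threshold of 4.0 chosen for illustration, and it runs on synthetic data rather than the alloy data. The function names are hypothetical.

```python
import numpy as np

def sse(X, cols, y):
    """Residual sum of squares from an OLS fit of y on the chosen columns plus an intercept."""
    parts = [np.ones((len(y), 1))]
    if cols:
        parts.append(X[:, np.array(cols, dtype=int)])
    A = np.hstack(parts)
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return float(resid @ resid)

def forward_select(X, y, f_to_enter=4.0):
    """At each step, add the predictor with the largest partial F statistic above the threshold."""
    n, p = X.shape
    selected = []
    while len(selected) < p:
        sse_current = sse(X, selected, y)
        best_j, best_f = None, f_to_enter
        for j in range(p):
            if j in selected:
                continue
            trial = selected + [j]
            sse_trial = sse(X, trial, y)
            df_resid = n - len(trial) - 1
            # Partial F for adding one variable to the current model.
            f_stat = (sse_current - sse_trial) / (sse_trial / df_resid)
            if f_stat > best_f:
                best_j, best_f = j, f_stat
        if best_j is None:  # no remaining candidate clears the threshold
            break
        selected.append(best_j)
    return selected

# Synthetic illustration only: y depends on columns 0 and 2; the rest are noise.
rng = np.random.default_rng(42)
n = 150
X = rng.normal(size=(n, 6))
y = 3.0 * X[:, 0] - 2.0 * X[:, 2] + rng.normal(size=n)

selected = forward_select(X, y)
print("Columns selected, in order of entry:", selected)
```

Because many tests are run along the way, a noise column can occasionally clear the threshold, which is one reason the output of such a procedure is treated as a screened list rather than a final model.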

04

Stepwise regression 

c.

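The stepwise regression selected the following independent variables: x1 = Carbon, x2 = Manganese, x3 = Chromium, x5 = Molybdenum, x6 = Copper, x8 = Vanadium, x9 = Plate thickness, x10 = Solution treating, and x11 = Aging temperature.

An R2 of 0.94 means that 94% of the sample variation in yield strength is explained by the model, which indicates a good fit to the sample data. A high R2 alone does not make the stepwise model ready for prediction, however. Stepwise regression performs many t-tests, which raises the chance of including or excluding variables in error, and it considers no interaction or higher-order terms. The stepwise result is best treated as a screened list of predictors, to be refined and checked before the model is used to predict yield strength.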
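An R2 computed on the same data used to select the variables can be optimistic. The sketch below is a deliberately extreme synthetic case, not a claim about the alloy data: the response is pure noise, 9 of 50 unrelated candidate predictors are kept by a simple correlation screen (a stand-in for stepwise selection), and R2 is then compared on the fitting sample and on fresh data. All names and sample sizes are assumptions chosen for illustration.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares coefficients (intercept first) for y on X."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def r2(X, y, coef):
    """R^2 of previously fitted coefficients evaluated on (X, y)."""
    A = np.column_stack([np.ones(len(y)), X])
    resid = y - A @ coef
    return 1.0 - float(resid @ resid) / float(((y - y.mean()) ** 2).sum())

# Synthetic illustration only: y is unrelated to every predictor.
rng = np.random.default_rng(1)
n_train, n_test, p, k = 30, 1000, 50, 9
X_train = rng.normal(size=(n_train, p))
y_train = rng.normal(size=n_train)
X_test = rng.normal(size=(n_test, p))
y_test = rng.normal(size=n_test)

# Screen: keep the k predictors most correlated with y in the training data.
corr = np.array([abs(np.corrcoef(X_train[:, j], y_train)[0, 1]) for j in range(p)])
keep = np.argsort(corr)[-k:]

coef = fit_ols(X_train[:, keep], y_train)
r2_train = r2(X_train[:, keep], y_train, coef)  # same data used for selection
r2_test = r2(X_test[:, keep], y_test, coef)     # fresh data

print("R^2 on the fitting sample:", round(r2_train, 2))
print("R^2 on fresh data:        ", round(r2_test, 2))
```

The gap between the two values is why a selected model is usually checked on data that played no part in the selection before it is relied on for prediction.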

Most popular questions from this chapter

Question: Revenues of popular movies. The Internet Movie Database (www.imdb.com) monitors the gross revenues for all major motion pictures. The table on the next page gives both the domestic (United States and Canada) and international gross revenues for a sample of 25 popular movies.

  a. Write a first-order model for foreign gross revenues (y) as a function of domestic gross revenues (x).
  b. Write a second-order model for international gross revenues y as a function of domestic gross revenues x.
  c. Construct a scatterplot for these data. Which of the models from parts a and b appears to be the better choice for explaining the variation in foreign gross revenues?
  d. Fit the model of part b to the data and investigate its usefulness. Is there evidence of a curvilinear relationship between international and domestic gross revenues? Try using α = 0.05.
  e. Based on your analysis in part d, which of the models from parts a and b better explains the variation in international gross revenues? Compare your answer with your preliminary conclusion from part c.

Question: Job performance under time pressure. Time pressure is common at firms that must meet hard and fast deadlines. How do employees working in teams perform when they perceive time pressure? And, can this performance improve with a strong team leader? These were the research questions of interest in a study published in the Academy of Management Journal (October, 2015). Data were collected on n = 139 project teams working for a software company in India. Among the many variables recorded were team performance (y, measured on a 7-point scale), perceived time pressure (x1, measured on a 7-point scale), and whether or not the team had a strong and effective team leader (x2 = 1 if yes, 0 if no). The researchers hypothesized a curvilinear relationship between team performance (y) and perceived time pressure (x1), with different-shaped curves depending on whether or not the team had an effective leader (x2). A model for E(y) that supports this theory is the complete second-order model:

$$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_1^2 + \beta_3 x_2 + \beta_4 x_1 x_2 + \beta_5 x_1^2 x_2$$

a. Write the equation for E(y) as a function of x1 when the team leader is not effective (x2 = 0).

b. Write the equation for E(y) as a function of x1 when the team leader is effective (x2 = 1).

c. The researchers reported the following β-estimates: $\hat{\beta}_0 = 4.5$, $\hat{\beta}_1 = 0.13$, $\hat{\beta}_3 = 0.15$, $\hat{\beta}_4 = 0.15$, and $\hat{\beta}_5 = 0.29$. Use these estimates to sketch the two equations, parts a and b. What is the nature of the curvilinear relationship when the team leader is not effective? Effective?

Explain why stepwise regression is used. What is its value in the model-building process?

Question: Orange juice demand study. A chilled orange juice warehousing operation in New York City was experiencing too many out-of-stock situations with its 96-ounce containers. To better understand current and future demand for this product, the company examined the last 40 days of sales, which are shown in the table below. One of the company’s objectives is to model demand, y, as a function of sale day, x (where x = 1, 2, 3, …, 40).

  a. Construct a scatterplot for these data.
  b. Does it appear that a second-order model might better explain the variation in demand than a first-order model? Explain.
  c. Fit a first-order model to these data.
  d. Fit a second-order model to these data.
  e. Compare the results in parts c and d and decide which model better explains variation in demand. Justify your choice.

Question: Service workers and customer relations. A study in Industrial Marketing Management (February 2016) investigated the impact of service workers’ (e.g., waiters and waitresses) personal resources on the quality of the firm’s relationship with customers. The study focused on four types of personal resources: flexibility in dealing with customers (x1), service worker reputation (x2), empathy for the customer (x3), and service worker’s task alignment (x4). A multiple regression model was used to relate these four independent variables to relationship quality (y). Data were collected for n = 220 customers who had recent dealings with a service worker. (All variables were measured on a quantitative scale, based on responses to a questionnaire.)

a) Write a first-order model for E(y) as a function of the four independent variables.

b) Refer to part a. Which β coefficient measures the effect of flexibility (x1) on relationship quality (y), independently of the other independent variables in the model?

c) Repeat part b for reputation (x2), empathy (x3), and task alignment (x4).

d) The researchers theorize that task alignment (x4) “moderates” the effect of each of the other x’s on relationship quality (y); that is, the impact of each x (x1, x2, or x3) on y depends on x4. Write an interaction model for E(y) that matches the researchers’ theory.

e) Refer to part d. What null hypothesis would you test to determine if the effect of flexibility (x1) on relationship quality (y) depends on task alignment (x4)?

f) Repeat part e for the effect of reputation (x2) and the effect of empathy (x3).

g) None of the t-tests for interaction were found to be “statistically significant”. Given these results, the researchers concluded that their theory was not supported. Do you agree?
