Question: Improving Math SAT scores. Refer to the Chance (Winter 2001) study of students who paid a private tutor (or coach) to help them improve their Scholastic Assessment Test (SAT) scores, Exercise 2.88 (p. 113). Multiple regression was used to estimate the effect of coaching on SAT-Mathematics scores. Data on 3,492 students (573 of whom were coached) were used to fit the model E(y) = β0 + β1x1 + β2x2, where y = SAT-Math score, x1 = score on the PSAT, and x2 = {1 if the student was coached, 0 if not}.

  a. The fitted model had an adjusted R² value of .76. Interpret this result.
  b. The estimate of β2 in the model was 19, with a standard error of 3. Use this information to form a 95% confidence interval for β2. Interpret the interval.
  c. Based on the interval, part b, what can you say about the effect of coaching on SAT-Math scores?
  d. As an alternative model, the researcher added several “control” variables, including dummy variables for student ethnicity (x3, x4, and x5), a socioeconomic status index variable (x6), two variables that measured high school performance (x7 and x8), the number of math courses taken in high school (x9), and the overall GPA for the math courses (x10). Write the hypothesized equation for E(y) for the alternative model.
  e. Give the null hypothesis for a nested model F-test comparing the initial and alternative models.
  f. The nested model F-test, part e, was statistically significant at α = .05. Practically interpret this result.
  g. The alternative model, part d, resulted in R²a = 0.79, β̂2 = 14, and s_β̂2 = 3. Interpret the value of R²a.
  h. Refer to part g. Find and interpret a 95% confidence interval for β2.
  i. The researcher concluded that “the estimated effect of SAT coaching decreases from the baseline model when control variables are added to the model.” Do you agree? Justify your answer.
  j. As a modification to the model of part d, the researcher added all possible interactions between the coaching variable (x2) and the other independent variables in the model. Write the equation for E(y) for this modified model.
  k. Give the null hypothesis for comparing the models, parts d and j. How would you perform this test?

Short Answer

Expert verified

Answer

  a. The adjusted R² is 0.76: after adjusting for the sample size and the number of β parameters, about 76% of the sample variation in SAT-Math scores is explained by the model, which indicates a reasonably good fit.
  b. The 95% confidence interval for β2 is 19 ± 1.96(3) = (13.12, 24.88). We are 95% confident that, for a fixed PSAT score, coached students score on average between 13.12 and 24.88 points higher on the SAT-Math than uncoached students.
  c. Since the entire confidence interval lies above 0, it indicates that coached students score higher on the SAT-Math, holding PSAT score fixed.
  d. The equation for E(y) for the alternative model can be written as E(y) = β0 + β1x1 + β2x2 + β3x3 + β4x4 + β5x5 + β6x6 + β7x7 + β8x8 + β9x9 + β10x10.
  e. The null hypothesis can be written as H0: β3 = β4 = β5 = β6 = β7 = β8 = β9 = β10 = 0, against Ha: at least one of the parameters under test is non-zero.
  f. The nested model F-test is statistically significant at α = 0.05, so H0 is rejected: at least one of the control variables contributes to predicting SAT-Math score, and the alternative model is a better fit for the data.
  g. The alternative model has an adjusted R² of 0.79: after adjusting for the sample size and the number of β parameters, about 79% of the sample variation in SAT-Math scores is explained by the model, which indicates a good fit.
  h. The 95% confidence interval for β2 is 14 ± 1.96(3) = (8.12, 19.88). We are 95% confident that, with the control variables and PSAT score held fixed, coached students score on average between 8.12 and 19.88 points higher on the SAT-Math.
  i. The point estimate of β2 does fall from 19 to 14 when the control variables are added, but the two 95% confidence intervals overlap, so there is insufficient evidence that the true coaching effect differs between the models. The researcher’s conclusion is therefore not supported as a statement about the true effect of coaching.
  j. The equation for E(y) for the model with interactions can be written as E(y) = β0 + β1x1 + β2x2 + β3x3 + β4x4 + β5x5 + β6x6 + β7x7 + β8x8 + β9x9 + β10x10 + β11x2x1 + β12x2x3 + β13x2x4 + β14x2x5 + β15x2x6 + β16x2x7 + β17x2x8 + β18x2x9 + β19x2x10.
  k. To compare the two models, a nested model F-test is conducted, where the null and alternative hypotheses are H0: β11 = β12 = β13 = β14 = β15 = β16 = β17 = β18 = β19 = 0 and Ha: at least one of the parameters under test is non-zero.

Step by step solution

01

(a) Interpretation of adjusted R²

The value of R² represents the fraction of the sample variation of the y-values (measured by SSyy) that is explained by the least squares prediction equation.

R² and adjusted R² (R²a) have similar interpretations. However, unlike R², R²a takes into account (“adjusts” for) both the sample size n and the number of β parameters in the model.

The adjusted R² in this question is 0.76, meaning that about 76% of the sample variation in SAT-Math scores is explained by the model after adjusting for the sample size and the number of parameters, which indicates a reasonably good fit.
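For reference, the standard textbook definition of adjusted R² for a model with k independent variables fitted to n observations is written out below. The symbols SSE and SS_yy follow the usual least squares notation; the values n = 3,492 and k = 2 come from the problem statement.

```latex
R_a^2 \;=\; 1 - \frac{n-1}{n-(k+1)}\cdot\frac{SSE}{SS_{yy}}
      \;=\; 1 - \frac{n-1}{n-(k+1)}\,\bigl(1 - R^2\bigr),
\qquad
\frac{n-1}{n-(k+1)} = \frac{3491}{3489} \approx 1.0006 .
```

Because the adjustment factor is so close to 1 for a sample this large, R² and adjusted R² are nearly identical here.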

02

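(b) Confidence interval for β2

The confidence interval for β2 can be written as β̂2 ± t(0.025, df) × s_β̂2, where df = n − (k + 1) = 3492 − 3 = 3489. With this many degrees of freedom, the t critical value is approximately 1.96.

Therefore, the confidence interval for β2 is 19 ± 1.96 × 3 = 19 ± 5.88.

The 95% confidence interval for β2 is (13.12, 24.88).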
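The arithmetic can be checked with a short Python sketch. The helper name `normal_ci` is hypothetical, and the critical value 1.96 is the large-sample (normal) approximation to the t value, which is reasonable with several thousand degrees of freedom.

```python
def normal_ci(estimate, std_error, z=1.96):
    """Large-sample confidence interval: estimate +/- z * std_error.

    z = 1.96 is the normal approximation to the t critical value
    (an assumption that is safe when df is in the thousands).
    """
    margin = z * std_error
    return (estimate - margin, estimate + margin)


# Estimate and standard error of beta_2 from part b of the question.
lower, upper = normal_ci(19, 3)
print(f"95% CI for beta_2: ({lower:.2f}, {upper:.2f})")
```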

03

(c) Effect of coaching on SAT-Math score

Since the entire confidence interval lies above 0, it indicates that coached students score higher on the SAT-Math than uncoached students with the same PSAT score.

04

(d) Model equation for E(y)

The equation for E(y) for the alternative model can be written as

E(y) = β0 + β1x1 + β2x2 + β3x3 + β4x4 + β5x5 + β6x6 + β7x7 + β8x8 + β9x9 + β10x10

05

(e) Null and alternate hypothesis

H0: β3 = β4 = β5 = β6 = β7 = β8 = β9 = β10 = 0

Ha: At least one of the parameters under test is non-zero.

06

(f) Evaluation of nested model F-test

It is given in the question that the nested model F-test is statistically significant at α = 0.05. H0 is therefore rejected: at least one of the control variables contributes to predicting SAT-Math score, so the alternative model is a better fit for the data.

07

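(g) Analysis of adjusted R²

The alternative model has an adjusted R² of 0.79, indicating that about 79% of the sample variation in SAT-Math scores is explained by the model after adjusting for the sample size and the number of parameters, which makes the model a good fit.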

08

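(h) Confidence interval for β2

The confidence interval for β2 can be written as β̂2 ± t(0.025, df) × s_β̂2, where df = n − (k + 1) = 3492 − 11 = 3481, so the t critical value is again approximately 1.96.

Therefore, the confidence interval for β2 is 14 ± 1.96 × 3 = 14 ± 5.88.

The 95% confidence interval for β2 is (8.12, 19.88).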

09

(i) Significance of nested models

The point estimate of β2 does fall from 19 to 14 when the control variables are added to the model. However, the two 95% confidence intervals for β2 overlap, so there is insufficient evidence that the true coaching effect differs between the baseline and alternative models. The researcher’s conclusion is therefore not supported as a statement about the true effect of coaching.
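A minimal sketch of the comparison, assuming the large-sample critical value 1.96 for both models; the helper names `normal_ci` and `intervals_overlap` are hypothetical. Note that overlapping intervals are an informal check, not a formal test of equal coefficients.

```python
def normal_ci(estimate, std_error, z=1.96):
    """Large-sample confidence interval: estimate +/- z * std_error."""
    margin = z * std_error
    return (estimate - margin, estimate + margin)


def intervals_overlap(a, b):
    """Return True when closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Coaching-effect estimates and standard errors given in the question.
baseline = normal_ci(19, 3)     # initial model (PSAT + coaching)
controlled = normal_ci(14, 3)   # alternative model with control variables

print("baseline:  ", tuple(round(v, 2) for v in baseline))
print("controlled:", tuple(round(v, 2) for v in controlled))
print("overlap:   ", intervals_overlap(baseline, controlled))
```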

10

(j) Model equation with interaction terms

The equation for E(y) for the model with interactions can be written as

E(y) = β0 + β1x1 + β2x2 + β3x3 + β4x4 + β5x5 + β6x6 + β7x7 + β8x8 + β9x9 + β10x10 + β11x2x1 + β12x2x3 + β13x2x4 + β14x2x5 + β15x2x6 + β16x2x7 + β17x2x8 + β18x2x9 + β19x2x10
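The term list can be generated mechanically, which helps confirm that the interaction model has 19 β parameters besides the intercept. This is only an illustrative sketch: the variable names and the `b1*x1` string format are assumptions, not notation from the textbook.

```python
# Ten main effects x1..x10; x2 is the coaching dummy variable.
main_effects = [f"x{i}" for i in range(1, 11)]
coach = "x2"

# Interact the coaching variable with every other independent variable.
interactions = [f"{coach}*{x}" for x in main_effects if x != coach]

terms = main_effects + interactions
equation = "E(y) = b0 + " + " + ".join(
    f"b{j}*{term}" for j, term in enumerate(terms, start=1)
)

print(len(interactions), "interaction terms,", len(terms), "terms in total")
print(equation)
```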

11

(k) Comparison of the nested model

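To compare the two models, a nested model F-test is conducted, where the null and alternative hypotheses are

H0: β11 = β12 = β13 = β14 = β15 = β16 = β17 = β18 = β19 = 0

Ha: At least one of the parameters under test is non-zero.

The test statistic is F = [(SSE_R − SSE_C)/(k − g)] / [SSE_C/(n − (k + 1))], where SSE_R and SSE_C are the sums of squared errors for the reduced model (part d) and the complete model (part j), k − g = 9 is the number of β parameters tested, and n − (k + 1) = 3492 − 20 = 3472. H0 is rejected if F exceeds the critical value of the F distribution with 9 numerator and 3472 denominator degrees of freedom.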
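The nested model comparison can be sketched numerically with the standard partial F statistic, F = [(SSE_R − SSE_C)/(number of βs tested)] / [SSE_C/(n − (k + 1))]. The function name `nested_f_statistic` is hypothetical, and the two SSE values below are made up purely for illustration, since the study's actual SSE values are not given in the question.

```python
def nested_f_statistic(sse_reduced, sse_complete, n, k_complete, num_tested):
    """Partial F statistic for comparing nested regression models.

    sse_reduced  -- SSE of the reduced model (fewer terms)
    sse_complete -- SSE of the complete model
    n            -- number of observations
    k_complete   -- number of beta parameters in the complete model,
                    excluding the intercept
    num_tested   -- number of beta parameters set to 0 under H0
    """
    df_num = num_tested
    df_den = n - (k_complete + 1)
    f_stat = ((sse_reduced - sse_complete) / df_num) / (sse_complete / df_den)
    return f_stat, df_num, df_den


# HYPOTHETICAL SSE values, for illustration only (not from the study).
f_stat, df_num, df_den = nested_f_statistic(
    sse_reduced=1_000_000.0,
    sse_complete=990_000.0,
    n=3492,          # sample size from the question
    k_complete=19,   # interaction model of part j
    num_tested=9,    # beta_11 through beta_19
)
print(f"F = {f_stat:.3f} with ({df_num}, {df_den}) degrees of freedom")
```

The computed F would then be compared with the upper-α critical value of the F distribution with the printed degrees of freedom.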


Most popular questions from this chapter

Question: Shopping on Black Friday. Refer to the International Journal of Retail and Distribution Management (Vol. 39, 2011) study of shopping on Black Friday (the day after Thanksgiving), Exercise 6.16 (p. 340). Recall that researchers conducted interviews with a sample of 38 women shopping on Black Friday to gauge their shopping habits. Two of the variables measured for each shopper were age (x) and number of years shopping on Black Friday (y). Data on these two variables for the 38 shoppers are listed in the accompanying table.

  1. Fit the quadratic model, E(y) = β0 + β1x + β2x², to the data using statistical software. Give the prediction equation.
  2. Conduct a test of the overall adequacy of the model. Use α=0.01.
  3. Conduct a test to determine if the relationship between age (x) and number of years shopping on Black Friday (y) is best represented by a linear or quadratic function. Use α=0.01.

Question: Job performance under time pressure. Time pressure is common at firms that must meet hard and fast deadlines. How do employees working in teams perform when they perceive time pressure? And, can this performance improve with a strong team leader? These were the research questions of interest in a study published in the Academy of Management Journal (October, 2015). Data were collected on n = 139 project teams working for a software company in India. Among the many variables recorded were team performance (y, measured on a 7-point scale), perceived time pressure (x1, measured on a 7-point scale), and whether or not the team had a strong and effective team leader (x2 = 1 if yes, 0 if no). The researchers hypothesized a curvilinear relationship between team performance (y) and perceived time pressure (x1), with different-shaped curves depending on whether or not the team had an effective leader (x2). A model for E(y) that supports this theory is the complete second-order model: E(y) = β0 + β1x1 + β2x1² + β3x2 + β4x1x2 + β5x1²x2

a. Write the equation for E(y) as a function of x1 when the team leader is not effective (x2= 0).

b. Write the equation for E(y) as a function of x1 when the team leader is effective (x2 = 1).

c. The researchers reported the following β-estimates: β̂0 = 4.5, β̂1 = 0.13, β̂3 = 0.15, β̂4 = 0.15, and β̂5 = 0.29. Use these estimates to sketch the two equations, parts a and b. What is the nature of the curvilinear relationship when the team leader is not effective? Effective?

Question: There are six independent variables, x1, x2, x3, x4, x5, and x6, that might be useful in predicting a response y. A total of n = 50 observations is available, and it is decided to employ stepwise regression to help in selecting the independent variables that appear to be useful. The software fits all possible one-variable models of the form

E(Y)=β0+β1xi

where xi is the ith independent variable, i = 1, 2, …, 6. The information in the table is provided from the computer printout.

a. Which independent variable is declared the best one variable predictor of y? Explain.

b. Would this variable be included in the model at this stage? Explain.

c. Describe the next phase that a stepwise procedure would execute.

Consider relating E(y) to two quantitative independent variables x1 and x2.

  1. Write a first-order model for E(y).

  2. Write a complete second-order model for E(y).

Suppose you fit the quadratic model E(y) = β0 + β1x + β2x² to a set of n = 20 data points and found R² = 0.91, SSyy = 29.94, and SSE = 2.63.

a. Is there sufficient evidence to indicate that the model contributes information for predicting y? Test using α = .05.

b. What null and alternative hypotheses would you test to determine whether upward curvature exists?

c. What null and alternative hypotheses would you test to determine whether downward curvature exists?
