Software millionaires and birthdays. Refer to Exercise 11.23 (p. 655) and the study of software millionaires and their birthdays. The data are reproduced on p. 663.

a. Find SSE, \(s^2\), and s for the simple linear regression model relating the number (y) of software millionaire birthdays in a decade to the total number (x) of U.S. births.

b. Find SSE, \(s^2\), and s for the simple linear regression model relating the number (y) of software millionaire birthdays in a decade to the number (x) of CEO birthdays.

c. Which of the two models' fit will have smaller errors of prediction? Why?

| Decade | Total U.S. Births (millions) | Number of Software Millionaire Birthdays | Number of CEO Birthdays (in a random sample of 70 companies from the Fortune 500 list) |
| --- | --- | --- | --- |
| 1920 | 28.582 | 3 | 2 |
| 1930 | 24.374 | 1 | 2 |
| 1940 | 31.666 | 10 | 23 |
| 1950 | 40.530 | 14 | 38 |
| 1960 | 38.808 | 7 | 9 |
| 1970 | 33.309 | 4 | 0 |

Short Answer

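  1. SSE is approximately 44.158, \(s^2\) is approximately 11.039, and s is approximately 3.323.
  2. SSE is approximately 9.695, \(s^2\) is approximately 2.424, and s is approximately 1.557.
  3. Model 2 (CEO birthdays as the predictor), because its estimated standard deviation s is smaller.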

Step by step solution

01

Introduction

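In simple linear regression, SSE is the sum of squared errors: the sum of the squared deviations of the observed y values from the least squares line. Dividing SSE by its degrees of freedom gives \(s^2 = SSE/(n - 2)\), an estimate of the variance of the random error term, and \(s = \sqrt{s^2}\) is the estimated standard deviation of that error. The smaller s is, the closer the observations lie to the fitted line and the smaller the errors of prediction tend to be.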

02

Calculate SSE, \(s^2\), and s for the model relating the number (y) of software millionaire birthdays in a decade to the total number (x) of U.S. births

Here n = 6, \(\sum x_i = 197.269\), \(\sum y_i = 39\), \(\sum x_i y_i = 1399.092\), \(\sum x_i^2 = 6671.989401\), and \(\sum y_i^2 = 371\).

\(\begin{aligned}SS_{xy} &= \sum x_i y_i - \frac{\sum x_i \sum y_i}{n}\\ &= 1399.092 - \frac{197.269 \times 39}{6}\\ &= 1399.092 - \frac{7693.491}{6}\\ &= 1399.092 - 1282.2485\\ &= 116.8435\end{aligned}\)

\(\begin{aligned}SS_{xx} &= \sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n}\\ &= 6671.989401 - \frac{(197.269)^2}{6}\\ &= 6671.989401 - \frac{38915.058361}{6}\\ &= \frac{40031.936406 - 38915.058361}{6}\\ &= \frac{1116.878045}{6}\\ &= 186.1463\end{aligned}\)

\(\begin{aligned}\widehat{\beta_1} &= \frac{SS_{xy}}{SS_{xx}}\\ &= \frac{116.8435}{186.1463}\\ &= 0.627697\end{aligned}\)

\(\begin{aligned}SS_{yy} &= \sum y_i^2 - \frac{\left(\sum y_i\right)^2}{n}\\ &= 371 - \frac{(39)^2}{6}\\ &= 371 - \frac{1521}{6}\\ &= \frac{2226 - 1521}{6}\\ &= \frac{705}{6}\\ &= 117.5\end{aligned}\)

\(\begin{aligned}SSE &= SS_{yy} - \widehat{\beta_1} SS_{xy}\\ &= 117.5 - (0.627697)(116.8435)\\ &= 117.5 - 73.342\\ &= 44.158\end{aligned}\)

\(\begin{aligned}s^2 &= \frac{SSE}{n - 2}\\ &= \frac{44.158}{6 - 2}\\ &= \frac{44.158}{4}\\ &= 11.039\end{aligned}\)

\(\begin{aligned}s &= \sqrt{s^2}\\ &= \sqrt{11.039}\\ &= 3.323\end{aligned}\)

Therefore, SSE is approximately 44.158, \(s^2\) is approximately 11.039, and s is approximately 3.323.
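The shortcut formulas used in this step are easy to check numerically. The following is a minimal Python sketch, not part of the textbook solution: the function name `shortcut_fit` and the variable names are illustrative assumptions, and the data are the six decades from the table.

```python
import math

# Data from the table: total U.S. births (millions) and number of
# software millionaire birthdays for the decades 1920 through 1970.
births = [28.582, 24.374, 31.666, 40.530, 38.808, 33.309]
millionaires = [3, 1, 10, 14, 7, 4]


def shortcut_fit(x, y):
    """Return (SSE, s^2, s) for a simple linear regression of y on x,
    using the SS_xy, SS_xx, SS_yy shortcut formulas."""
    n = len(x)
    ss_xy = sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n
    ss_xx = sum(a * a for a in x) - sum(x) ** 2 / n
    ss_yy = sum(b * b for b in y) - sum(y) ** 2 / n
    beta1 = ss_xy / ss_xx          # least squares slope
    sse = ss_yy - beta1 * ss_xy    # sum of squared errors
    s2 = sse / (n - 2)             # n - 2 degrees of freedom
    return sse, s2, math.sqrt(s2)


sse, s2, s = shortcut_fit(births, millionaires)
print(f"SSE = {sse:.3f}, s^2 = {s2:.3f}, s = {s:.3f}")
```

With the table values, the script should report roughly SSE = 44.16, s^2 = 11.04, and s = 3.32. Note that SSE can never be negative, so a negative or very large value is a sign of an arithmetic slip in one of the sums.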

03

Calculate SSE, \(s^2\), and s for the model relating the number (y) of software millionaire birthdays in a decade to the number (x) of CEO birthdays

Here n = 6, \(\sum x_i = 74\), \(\sum y_i = 39\), \(\sum x_i y_i = 833\), \(\sum x_i^2 = 2062\), and \(\sum y_i^2 = 371\).

\(\begin{aligned}SS_{xy} &= \sum x_i y_i - \frac{\sum x_i \sum y_i}{n}\\ &= 833 - \frac{74 \times 39}{6}\\ &= 833 - \frac{2886}{6}\\ &= 833 - 481\\ &= 352\end{aligned}\)

\(\begin{aligned}SS_{xx} &= \sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n}\\ &= 2062 - \frac{(74)^2}{6}\\ &= 2062 - \frac{5476}{6}\\ &= \frac{12372 - 5476}{6}\\ &= \frac{6896}{6}\\ &= 1149.3333\end{aligned}\)

\(\begin{aligned}\widehat{\beta_1} &= \frac{SS_{xy}}{SS_{xx}}\\ &= \frac{352}{1149.3333}\\ &= 0.306265\end{aligned}\)

\(\begin{aligned}SS_{yy} &= \sum y_i^2 - \frac{\left(\sum y_i\right)^2}{n}\\ &= 371 - \frac{(39)^2}{6}\\ &= 371 - \frac{1521}{6}\\ &= \frac{2226 - 1521}{6}\\ &= \frac{705}{6}\\ &= 117.5\end{aligned}\)

\(\begin{aligned}SSE &= SS_{yy} - \widehat{\beta_1} SS_{xy}\\ &= 117.5 - (0.306265)(352)\\ &= 117.5 - 107.805\\ &= 9.695\end{aligned}\)

\(\begin{aligned}s^2 &= \frac{SSE}{n - 2}\\ &= \frac{9.695}{6 - 2}\\ &= \frac{9.695}{4}\\ &= 2.424\end{aligned}\)

\(\begin{aligned}s &= \sqrt{s^2}\\ &= \sqrt{2.424}\\ &= 1.557\end{aligned}\)

Therefore, SSE is approximately 9.695, \(s^2\) is approximately 2.424, and s is approximately 1.557.
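SSE can also be obtained directly from its definition, by fitting the line and summing the squared residuals, which gives an independent check on the shortcut formula. The sketch below does this for the CEO-birthday model; the variable names are illustrative assumptions, and the data come from the table.

```python
import math

# Data from the table: CEO birthdays (x) and software millionaire
# birthdays (y) for the decades 1920 through 1970.
ceo = [2, 2, 23, 38, 9, 0]
millionaires = [3, 1, 10, 14, 7, 4]

n = len(ceo)
x_bar = sum(ceo) / n
y_bar = sum(millionaires) / n

# Least squares slope and intercept from deviations about the means.
ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(ceo, millionaires))
ss_xx = sum((x - x_bar) ** 2 for x in ceo)
beta1 = ss_xy / ss_xx
beta0 = y_bar - beta1 * x_bar

# Residuals: observed y minus the value predicted by the fitted line.
residuals = [y - (beta0 + beta1 * x) for x, y in zip(ceo, millionaires)]

sse = sum(r * r for r in residuals)   # sum of squared errors
s2 = sse / (n - 2)                    # estimated error variance
s = math.sqrt(s2)                     # estimated error standard deviation

print(f"SSE = {sse:.3f}, s^2 = {s2:.3f}, s = {s:.3f}")
```

With the table values, the result should be roughly SSE = 9.69, s^2 = 2.42, and s = 1.56. The residual route costs more arithmetic by hand than the shortcut formula, but it makes clear what SSE measures.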

04

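Determine which model has the smaller errors of prediction

From steps 2 and 3, Model 2 (CEO birthdays as the predictor) has the smaller estimated standard deviation s. A smaller s means the observed values lie closer to the least squares line, so Model 2 will have the smaller errors of prediction.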
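The comparison between the two models can be sketched in Python by computing s for each predictor and picking the smaller one. The helper name `regression_s` and the dictionary labels are illustrative assumptions; the data are taken from the table.

```python
import math

# Response and the two candidate predictors, from the table.
millionaires = [3, 1, 10, 14, 7, 4]
predictors = {
    "Model 1 (U.S. births)": [28.582, 24.374, 31.666, 40.530, 38.808, 33.309],
    "Model 2 (CEO birthdays)": [2, 2, 23, 38, 9, 0],
}


def regression_s(x, y):
    """Estimated standard deviation s of the error term for y regressed on x."""
    n = len(x)
    ss_xy = sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n
    ss_xx = sum(a * a for a in x) - sum(x) ** 2 / n
    ss_yy = sum(b * b for b in y) - sum(y) ** 2 / n
    sse = ss_yy - ss_xy ** 2 / ss_xx   # same as SS_yy - beta1 * SS_xy
    return math.sqrt(sse / (n - 2))


s_values = {name: regression_s(x, millionaires) for name, x in predictors.items()}
best = min(s_values, key=s_values.get)  # model with the smallest s

for name, s in s_values.items():
    print(f"{name}: s = {s:.3f}")
print("Smaller errors of prediction:", best)
```

Both models use the same response and the same n, so comparing s here is equivalent to comparing SSE.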


