Exercises 19 and 20 involve a design matrix \(X\) with two or more columns and a least-squares solution \(\hat \beta \) of \({\bf{y}} = X\beta \). Consider the following numbers.

(i) \({\left\| {X\hat \beta } \right\|^2}\), the sum of the squares of the “regression term.” Denote this number by \(SS\left( R \right)\).

(ii) \({\left\| {{\bf{y}} - X\hat \beta } \right\|^2}\), the sum of the squares for the error term. Denote this number by \(SS\left( E \right)\).

(iii) \({\left\| {\bf{y}} \right\|^2}\)—the “total” sum of the squares of the \(y\)-values. Denote this number by \(SS\left( T \right)\).

Every statistics text that discusses regression and the linear model \({\bf{y}} = X\beta + \epsilon \) introduces these numbers, though terminology and notation vary somewhat. To simplify matters, assume that the mean of the \(y\)-values is zero. In this case, \(SS\left( T \right)\) is proportional to what is called the variance of the set of \(y\)-values.

19. Justify the equation \(SS\left( T \right) = SS\left( R \right) + SS\left( E \right)\). (Hint: Use a theorem, and explain why the hypotheses of the theorem are satisfied.) This equation is extremely important in statistics, both in regression theory and in the analysis of variance.

Short Answer

The equation \(SS\left( T \right) = SS\left( R \right) + SS\left( E \right)\) follows from the Pythagorean theorem, because \(X\hat \beta \) and \({\bf{y}} - X\hat \beta \) are orthogonal.

Step by step solution

01

Find \(SS\left( T \right)\)

The residual vector \(\epsilon = {\bf{y}} - X\hat \beta \) is orthogonal to \({\rm{Col}}X\), while \({\bf{\hat y}} = X\hat \beta \) is in \({\rm{Col}}X\).
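
One way to check the orthogonality claim, sketched here under the standard fact that a least-squares solution satisfies the normal equations \({X^T}X\hat \beta = {X^T}{\bf{y}}\): take any vector \(X{\bf{c}}\) in \({\rm{Col}}X\), where \({\bf{c}}\) is an arbitrary coefficient vector introduced only for this illustration.

\(\begin{aligned}\left( {X{\bf{c}}} \right) \cdot \left( {{\bf{y}} - X\hat \beta } \right) &= {{\bf{c}}^T}{X^T}\left( {{\bf{y}} - X\hat \beta } \right)\\ &= {{\bf{c}}^T}\left( {{X^T}{\bf{y}} - {X^T}X\hat \beta } \right)\\ &= 0\end{aligned}\)

So \({\bf{y}} - X\hat \beta \) is orthogonal to every vector in \({\rm{Col}}X\), and in particular to \(X\hat \beta \).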

Since \(\epsilon = {\bf{y}} - X\hat \beta \) and \({\bf{\hat y}} = X\hat \beta \) are orthogonal, apply the Pythagorean theorem to find \(SS\left( T \right)\).

\(\begin{aligned}SS\left( T \right) &= {\left\| {\bf{y}} \right\|^2}\\ &= {\left\| {{\bf{\hat y}} + \epsilon } \right\|^2}\\ &= {\left\| {{\bf{\hat y}}} \right\|^2} + {\left\| \epsilon \right\|^2}\end{aligned}\)
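
The last equality is the Pythagorean theorem. As a brief illustration, expanding the squared norm with the inner product shows where orthogonality is used: the cross term \({\bf{\hat y}} \cdot \left( {{\bf{y}} - {\bf{\hat y}}} \right)\) is zero.

\(\begin{aligned}{\left\| {{\bf{\hat y}} + \left( {{\bf{y}} - {\bf{\hat y}}} \right)} \right\|^2} &= {\left\| {{\bf{\hat y}}} \right\|^2} + 2\,{\bf{\hat y}} \cdot \left( {{\bf{y}} - {\bf{\hat y}}} \right) + {\left\| {{\bf{y}} - {\bf{\hat y}}} \right\|^2}\\ &= {\left\| {{\bf{\hat y}}} \right\|^2} + {\left\| {{\bf{y}} - {\bf{\hat y}}} \right\|^2}\end{aligned}\)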

Substitute \({\bf{\hat y}} = X\hat \beta \) and \(\epsilon = {\bf{y}} - X\hat \beta \) into the obtained expression.

\(SS\left( T \right) = {\left\| {X\hat \beta } \right\|^2} + {\left\| {{\bf{y}} - X\hat \beta } \right\|^2} \qquad \left( 1 \right)\)

02

Find \(SS\left( R \right) + SS\left( E \right)\)

Use the definitions of \(SS\left( R \right)\) and \(SS\left( E \right)\) from (i) and (ii).

\(SS\left( R \right) + SS\left( E \right) = {\left\| {X\hat \beta } \right\|^2} + {\left\| {{\bf{y}} - X\hat \beta } \right\|^2} \qquad \left( 2 \right)\)

From equations (1) and (2),

\(SS\left( T \right) = SS\left( R \right) + SS\left( E \right)\)

Hence, the equation \(SS\left( T \right) = SS\left( R \right) + SS\left( E \right)\) is justified.
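
The identity can also be checked numerically. The following is a minimal sketch using NumPy. The function name `sums_of_squares` and the random data are assumptions made for illustration only and do not come from the exercise.

```python
import numpy as np


def sums_of_squares(X, y):
    """Return (SS(R), SS(E), SS(T)) for a least-squares fit of y = X beta."""
    # np.linalg.lstsq returns a least-squares solution beta_hat.
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    y_hat = X @ beta_hat      # regression term, lies in Col X
    residual = y - y_hat      # error term, orthogonal to Col X
    ss_r = float(y_hat @ y_hat)
    ss_e = float(residual @ residual)
    ss_t = float(y @ y)
    return ss_r, ss_e, ss_t


# Hypothetical data: an intercept column plus one random predictor.
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(8), rng.normal(size=8)])
y = rng.normal(size=8)
y = y - y.mean()  # match the exercise's assumption that the y-values have mean zero

ss_r, ss_e, ss_t = sums_of_squares(X, y)
# Compare SS(T) with SS(R) + SS(E) up to floating-point tolerance.
print(np.isclose(ss_t, ss_r + ss_e))
```

The comparison uses `np.isclose` rather than `==` because the two sides agree only up to rounding error in floating-point arithmetic.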

Most popular questions from this chapter

In exercises 1-6, determine which sets of vectors are orthogonal.

  1. \(\left[ {\begin{array}{*{20}{c}}{ - 1}\\4\\{ - 3}\end{array}} \right]\), \(\left[ {\begin{array}{*{20}{c}}5\\2\\1\end{array}} \right]\), \(\left[ {\begin{array}{*{20}{c}}3\\{ - 4}\\{ - 7}\end{array}} \right]\)

Suppose \(A = QR\), where \(R\) is an invertible matrix. Show that \(A\) and \(Q\) have the same column space.

A healthy child’s systolic blood pressure (in millimetres of mercury) and weight (in pounds) are approximately related by the equation

\({\beta _0} + {\beta _1}\ln w = p\)

Use the following experimental data to estimate the systolic blood pressure of a healthy child weighing 100 pounds.

\(\begin{array}{c|ccccc} w & 44 & 61 & 81 & 113 & 131 \\ \hline \ln w & 3.78 & 4.11 & 4.39 & 4.73 & 4.88 \\ \hline p & 91 & 98 & 103 & 110 & 112 \end{array}\)

Let \(X\) be the design matrix used to find the least-squares line to fit data \(\left( {{x_1},{y_1}} \right), \ldots ,\left( {{x_n},{y_n}} \right)\). Use a theorem in Section 6.5 to show that the normal equations have a unique solution if and only if the data include at least two data points with different \(x\)-coordinates.

In Exercises 17 and 18, all vectors and subspaces are in \({\mathbb{R}^n}\). Mark each statement True or False. Justify each answer.

a. If \(W = {\rm{span}}\left\{ {{x_1},{x_2},{x_3}} \right\}\) with \({x_1},{x_2},{x_3}\) linearly independent, and if \(\left\{ {{v_1},{v_2},{v_3}} \right\}\) is an orthogonal set in \(W\), then \(\left\{ {{v_1},{v_2},{v_3}} \right\}\) is a basis for \(W\).

b. If \(x\) is not in a subspace \(W\) , then \(x - {\rm{pro}}{{\rm{j}}_W}x\) is not zero.

c. In a \(QR\) factorization, say \(A = QR\) (when \(A\) has linearly independent columns), the columns of \(Q\) form an orthonormal basis for the column space of \(A\).
