F-Test for Linear Constraints

Theorem

Consider the linear model \(y = X\beta + \varepsilon\) with design matrix \(X \in \mathbb{R}^{n \times p}\) and constraint matrix \(R \in \mathbb{R}^{q \times p}\). If \(\text{rank}(X) = p\), \(\text{rank}(R) = q\), and \(\varepsilon \sim N(0, \sigma^2 I_n)\), then under \(H_0: R\beta = 0\):

\[F = \frac{n-p}{q} \cdot \frac{SSR_c - SSR}{SSR} \sim F(q, n-p)\]

where:

  • \(SSR\): sum of squares of residuals in the unconstrained model
  • \(SSR_c\): sum of squares of residuals in the constrained model satisfying \(R\beta = 0\)
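
As a concrete illustration of the statistic (a minimal numerical sketch, not part of the theorem; the sample size, design matrix, and constraint below are arbitrary choices, and numpy/scipy are assumed), the constrained fit can be computed by reparametrizing \(\beta = N\gamma\) with \(N\) a basis of \(\text{Ker}(R)\):

```python
# Numerical sketch of the F-test for H0: R beta = 0 (illustrative values only).
import numpy as np
from scipy.linalg import null_space
from scipy.stats import f

rng = np.random.default_rng(0)
n, p, q = 100, 5, 2
X = rng.normal(size=(n, p))                       # design matrix, full column rank
R = np.hstack([np.zeros((q, p - q)), np.eye(q)])  # constraint: last q coefficients vanish
beta = np.array([1.0, -0.5, 2.0, 0.0, 0.0])       # satisfies R beta = 0, so H0 is true
y = X @ beta + rng.normal(size=n)

# Unconstrained fit: ordinary least squares on X.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
SSR = np.sum((y - X @ beta_hat) ** 2)

# Constrained fit: write beta = N @ gamma with N an (orthonormal) basis of Ker(R).
N = null_space(R)
gamma_hat, *_ = np.linalg.lstsq(X @ N, y, rcond=None)
SSR_c = np.sum((y - X @ N @ gamma_hat) ** 2)

F = (n - p) / q * (SSR_c - SSR) / SSR
print(F, f.sf(F, q, n - p))                       # statistic and its p-value under F(q, n-p)
```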

Proof

1. Setup and Key Spaces

Define the following spaces:

  • \(V = [X] \subseteq \mathbb{R}^n\) (column space of \(X\))
  • \(V_0 = \{X\beta : R\beta = 0\} = X(\text{Ker}(R))\) (constrained subspace)

Since \(\text{rank}(X) = p\) and \(\text{rank}(R) = q\):

  • By the rank-nullity theorem: \(\dim(\text{Ker}(R)) = p - q\)
  • Since \(X\) has full column rank, the map \(\beta \mapsto X\beta\) is injective
  • Therefore: \(\dim(V_0) = \dim(X(\text{Ker}(R))) = \dim(\text{Ker}(R)) = p - q\)
  • Also: \(\dim(V) = p\)
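
These dimension counts can be sanity-checked numerically; a small sketch under the assumption that random Gaussian matrices of the stated sizes have full rank (numpy/scipy assumed):

```python
# Sanity check of the dimension counts with random full-rank X and R.
import numpy as np
from scipy.linalg import null_space

rng = np.random.default_rng(1)
n, p, q = 50, 6, 2
X = rng.normal(size=(n, p))                    # rank p (almost surely)
R = rng.normal(size=(q, p))                    # rank q (almost surely)

N = null_space(R)                              # orthonormal basis of Ker(R)
assert N.shape[1] == p - q                     # rank-nullity: dim Ker(R) = p - q
assert np.linalg.matrix_rank(X @ N) == p - q   # X injective, so dim V0 = p - q
assert np.linalg.matrix_rank(X) == p           # dim V = p
```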

2. Orthogonal Decomposition

Since \(V_0 \subseteq V\), we can decompose \(V\) as: \[V = V_0 \oplus V_0^{\perp_V}\]

where \(V_0^{\perp_V}\) is the orthogonal complement of \(V_0\) within \(V\), with: \[\dim(V_0^{\perp_V}) = \dim(V) - \dim(V_0) = p - (p-q) = q\]

3. Projection Operators

  • \(\hat{y} = P_V y\) is the projection of \(y\) onto \(V\) (unconstrained fit)
  • \(\hat{y}_c = P_{V_0} y\) is the projection of \(y\) onto \(V_0\) (constrained fit)
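
In matrix form, \(P_V = X(X^\top X)^{-1}X^\top\) and \(P_{V_0} = X_0(X_0^\top X_0)^{-1}X_0^\top\), where \(X_0 = XN\) for any basis \(N\) of \(\text{Ker}(R)\). A minimal sketch of these two projections (arbitrary illustrative data; the helper `proj` is a hypothetical convenience, not from the text):

```python
# Explicit projection (hat) matrices onto V = [X] and V0 = X(Ker(R)).
import numpy as np
from scipy.linalg import null_space

def proj(A):
    # Orthogonal projection onto the column space of A (full column rank assumed).
    return A @ np.linalg.solve(A.T @ A, A.T)

rng = np.random.default_rng(2)
n, p, q = 40, 5, 2
X = rng.normal(size=(n, p))
R = rng.normal(size=(q, p))
X0 = X @ null_space(R)                  # columns of X0 span V0

P_V, P_V0 = proj(X), proj(X0)
y = rng.normal(size=n)
y_hat, y_hat_c = P_V @ y, P_V0 @ y      # unconstrained and constrained fitted values
```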

4. Decomposition of SSR Difference

Since \(V_0 \subseteq V\), we have \(\hat{y}_c \in V\), so \(\hat{y} - \hat{y}_c \in V\), while \(y - \hat{y} \in V^{\perp}\). These two vectors are orthogonal, so the Pythagorean theorem gives: \[\|y - \hat{y}_c\|^2 = \|y - \hat{y}\|^2 + \|\hat{y} - \hat{y}_c\|^2\]

This gives us: \[SSR_c = SSR + \|\hat{y} - \hat{y}_c\|^2\]

Since \(P_V = P_{V_0} + P_{V_0^{\perp_V}}\) (by the orthogonal decomposition \(V = V_0 \oplus V_0^{\perp_V}\) from Step 2), we have \(\hat{y} - \hat{y}_c = P_V y - P_{V_0} y = P_{V_0^{\perp_V}} y\), and therefore: \[SSR_c - SSR = \|P_{V_0^{\perp_V}} y\|^2\]
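
Both identities are easy to verify on simulated data; a short sketch (arbitrary sizes, with the fits computed by least squares rather than explicit projection matrices):

```python
# Check the Pythagorean identity SSR_c = SSR + ||y_hat - y_hat_c||^2 on simulated data.
import numpy as np
from scipy.linalg import null_space

rng = np.random.default_rng(6)
n, p, q = 40, 5, 2
X = rng.normal(size=(n, p))
R = rng.normal(size=(q, p))
X0 = X @ null_space(R)                                   # columns span V0
y = rng.normal(size=n)

y_hat = X @ np.linalg.lstsq(X, y, rcond=None)[0]         # P_V y
y_hat_c = X0 @ np.linalg.lstsq(X0, y, rcond=None)[0]     # P_V0 y
SSR, SSR_c = np.sum((y - y_hat) ** 2), np.sum((y - y_hat_c) ** 2)

print(np.isclose(SSR_c, SSR + np.sum((y_hat - y_hat_c) ** 2)))   # expect True
```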

5. Distribution Under \(H_0\)

Under \(H_0: R\beta = 0\), we have \(\mathbb{E}[y] = X\beta \in V_0\), which implies:

  • \(P_{V_0} \mathbb{E}[y] = X\beta\)
  • \(P_{V_0^{\perp_V}} \mathbb{E}[y] = 0\)

Since \(y = X\beta + \varepsilon\) with \(\varepsilon \sim N(0, \sigma^2 I_n)\):

For the numerator: \[P_{V_0^{\perp_V}} y = P_{V_0^{\perp_V}} \varepsilon \sim N(0, \sigma^2 P_{V_0^{\perp_V}})\]

Since \(P_{V_0^{\perp_V}}\) is a projection onto a \(q\)-dimensional space: \[\frac{1}{\sigma^2}\|P_{V_0^{\perp_V}} y\|^2 = \frac{SSR_c - SSR}{\sigma^2} \sim \chi^2_q\]

For the denominator: \[P_{V^{\perp}} y = P_{V^{\perp}} \varepsilon \sim N(0, \sigma^2 P_{V^{\perp}})\]

Since \(SSR = \|y - \hat{y}\|^2 = \|P_{V^{\perp}} y\|^2\) and \(P_{V^{\perp}}\) is a projection onto the \((n-p)\)-dimensional space \(V^{\perp}\): \[\frac{1}{\sigma^2}\|P_{V^{\perp}} y\|^2 = \frac{SSR}{\sigma^2} \sim \chi^2_{n-p}\]
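
Both distributional claims can be checked by simulation under \(H_0\); a Monte Carlo sketch (arbitrary design, \(\sigma\) chosen arbitrarily, numpy/scipy assumed):

```python
# Monte Carlo check: (SSR_c - SSR)/sigma^2 ~ chi2_q and SSR/sigma^2 ~ chi2_{n-p} under H0.
import numpy as np
from scipy.linalg import null_space
from scipy.stats import chi2, kstest

rng = np.random.default_rng(3)
n, p, q, sigma = 30, 4, 2, 1.5
X = rng.normal(size=(n, p))
R = rng.normal(size=(q, p))
N = null_space(R)
beta = N @ rng.normal(size=p - q)             # any beta with R beta = 0, so H0 holds

num, den = [], []
for _ in range(5000):
    y = X @ beta + rng.normal(scale=sigma, size=n)
    SSR = np.sum((y - X @ np.linalg.lstsq(X, y, rcond=None)[0]) ** 2)
    SSR_c = np.sum((y - X @ N @ np.linalg.lstsq(X @ N, y, rcond=None)[0]) ** 2)
    num.append((SSR_c - SSR) / sigma**2)
    den.append(SSR / sigma**2)

print(kstest(num, chi2(df=q).cdf))            # large p-value expected: consistent with chi2_q
print(kstest(den, chi2(df=n - p).cdf))        # large p-value expected: consistent with chi2_{n-p}
```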

6. Independence

The projections \(P_{V_0^{\perp_V}}\) and \(P_{V^{\perp}}\) are orthogonal because:

  • \(V_0^{\perp_V} \subseteq V\)
  • \(V \perp V^{\perp}\)
  • Therefore \(V_0^{\perp_V} \perp V^{\perp}\)

Since \(\varepsilon\) is Gaussian, \(P_{V_0^{\perp_V}} \varepsilon\) and \(P_{V^{\perp}} \varepsilon\) are jointly Gaussian with cross-covariance \(\sigma^2 P_{V_0^{\perp_V}} P_{V^{\perp}} = 0\), so they are independent; consequently \(SSR_c - SSR\) and \(SSR\) are independent.
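
The zero product \(P_{V_0^{\perp_V}} P_{V^{\perp}} = 0\) behind this independence can be confirmed numerically; a minimal sketch with an arbitrary design (the `proj` helper is again a hypothetical convenience):

```python
# The ranges of P_V - P_V0 (projection onto V0-perp within V) and I - P_V (projection
# onto V-perp) are orthogonal, so their product is the zero matrix.
import numpy as np
from scipy.linalg import null_space

def proj(A):
    # Orthogonal projection onto the column space of A (full column rank assumed).
    return A @ np.linalg.solve(A.T @ A, A.T)

rng = np.random.default_rng(4)
n, p, q = 25, 5, 2
X = rng.normal(size=(n, p))
R = rng.normal(size=(q, p))
P_V, P_V0 = proj(X), proj(X @ null_space(R))

print(np.allclose((P_V - P_V0) @ (np.eye(n) - P_V), 0.0, atol=1e-9))   # expect True
```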

7. Final Result

The F-statistic can be written as: \[F = \frac{n-p}{q} \cdot \frac{SSR_c - SSR}{SSR} = \frac{(SSR_c - SSR)/(q\sigma^2)}{SSR/((n-p)\sigma^2)}\]

By Step 5, the numerator and denominator are a \(\chi^2_q\) variable divided by \(q\) and a \(\chi^2_{n-p}\) variable divided by \(n-p\), respectively; by Step 6 they are independent. A ratio of two independent chi-squared random variables, each divided by its degrees of freedom, has an F distribution, so: \[F \sim F(q, n-p)\]
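
In practice, the theorem is applied by comparing the observed statistic with an \(F(q, n-p)\) quantile; a minimal sketch (all numbers below, including the observed statistic, are hypothetical):

```python
# Decision rule: reject H0 at level alpha when F exceeds the upper-alpha quantile of F(q, n-p).
from scipy.stats import f

n, p, q, alpha = 100, 5, 2, 0.05        # illustrative values
F_obs = 4.1                             # hypothetical observed statistic
critical = f.ppf(1 - alpha, q, n - p)   # critical value
p_value = f.sf(F_obs, q, n - p)         # tail probability beyond F_obs
print(critical, p_value, F_obs > critical)
```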

Geometric Interpretation

The F-statistic measures the relative magnitude of:

  • The projection onto \(V_0^{\perp_V}\) (the constraint violation space within the model)
  • The projection onto \(V^{\perp}\) (the residual space)

Under \(H_0\), both projections capture only noise, leading to the F-distribution. Large values of \(F\) suggest the constraint \(R\beta = 0\) is violated.