F-Test for Linear Constraints

Theorem

Consider the linear model \(Y = X\beta + \varepsilon\) with \(X \in \mathbb R^{n \times p}\). If \(\text{rank}(X) = p\), \(\text{rank}(R) = q\), and \(\varepsilon \sim N(0, \sigma^2 I_n)\), then under \(H_0: R\beta = 0\):

\[F = \frac{n-p}{q} \cdot \frac{SSR_c - SSR}{SSR} \sim F(q, n-p)\]

where:

  • \(SSR\): sum of squares of residuals in the unconstrained model
  • \(SSR_c\): sum of squares of residuals in the constrained model satisfying \(R\beta = 0\)
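As an illustration, the statistic can be computed numerically. The following is a minimal sketch with NumPy on simulated data; the helper name `f_statistic`, the design matrix, and the choice of \(R\) are assumptions made for the example. The constrained fit is obtained by writing \(\beta = N\gamma\), where the columns of \(N\) span \(\text{Ker}(R)\).

```python
import numpy as np

def f_statistic(X, Y, R):
    """F statistic for H0: R beta = 0 in the model Y = X beta + eps."""
    n, p = X.shape
    q = np.linalg.matrix_rank(R)

    # Unconstrained fit: residual sum of squares SSR.
    beta_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
    ssr = np.sum((Y - X @ beta_hat) ** 2)

    # Constrained fit: beta = N gamma, where the columns of N span Ker(R)
    # (the last p - q right singular vectors of R).
    _, _, vt = np.linalg.svd(R)
    N = vt[q:].T
    gamma_hat, *_ = np.linalg.lstsq(X @ N, Y, rcond=None)
    ssr_c = np.sum((Y - X @ N @ gamma_hat) ** 2)

    return (n - p) / q * (ssr_c - ssr) / ssr, ssr, ssr_c

# Simulated example (all values below are illustrative assumptions).
rng = np.random.default_rng(0)
n, p = 50, 4
X = rng.normal(size=(n, p))
R = np.array([[0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, 1.0]])    # H0: beta_3 = beta_4 = 0
beta = np.array([1.0, -2.0, 0.0, 0.0])  # satisfies H0
Y = X @ beta + rng.normal(size=n)

F, ssr, ssr_c = f_statistic(X, Y, R)
print(F, ssr, ssr_c)
```

Under \(H_0\), the printed value of `F` is one draw from the \(F(q, n-p)\) distribution, here with \(q = 2\) and \(n - p = 46\).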

Proof

Define

  • \([X] = \{X\beta : \beta \in \mathbb R^p\}\) (column space of \(X\), of dimension \(p\))
  • \(V_0 = \{X\beta : R\beta = 0\} = X(\text{Ker}(R))\) (constrained subspace)

Since \(R\) has \(p\) columns and rank \(q\), the rank-nullity theorem gives \(\dim(\text{Ker}(R)) = p - q\).

Since \(X\) has full column rank, the map \(\beta \mapsto X\beta\) is injective. Therefore \(\dim(V_0) = \dim(X(\text{Ker}(R))) = \dim(\text{Ker}(R)) = p - q\).
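The dimension count can be checked numerically. This is a small sketch assuming random Gaussian matrices \(X\) and \(R\) (which have full rank with probability one); the sizes are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, q = 30, 5, 2
X = rng.normal(size=(n, p))   # full column rank with probability one
R = rng.normal(size=(q, p))   # rank q with probability one

# Columns of N form an orthonormal basis of Ker(R):
# the last p - q right singular vectors of R.
_, _, vt = np.linalg.svd(R)
N = vt[q:].T

dim_ker = N.shape[1]                    # dim Ker(R)
dim_V0 = np.linalg.matrix_rank(X @ N)   # dim X(Ker(R))
print(dim_ker, dim_V0, p - q)
```

Both computed dimensions should agree with \(p - q\).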

We have that

  • \(SSR = \|P_{[X]^\perp} Y\|^2\)
  • \(SSR_c = \|P_{V_0^\perp}Y\|^2\)

Let us now decompose

\[ Y = P_{[X]^\perp} Y + (P_{V_0^\perp} - P_{[X]^\perp})Y + P_{V_0}Y \]

The first term corresponds to the residuals of the unconstrained model, and the last term is the fitted value of the constrained model. Let us now analyse the second term. For this we use the following useful lemma.

Useful Lemma

If \(E\) and \(F\) are two subspaces of \(\mathbb R^n\) such that \(E \subset F\), and if \(P_E\), \(P_F\) are the orthogonal projectors onto \(E\) and \(F\), then

\(P_EP_F = P_FP_E = P_E\).
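The lemma is easy to check on a numerical example. In the sketch below, the helper `projector` and the nested subspaces (spans of the first 2 and of all 5 columns of a random matrix) are assumptions chosen for illustration.

```python
import numpy as np

def projector(A):
    """Orthogonal projector onto the column space of A (full column rank)."""
    Q, _ = np.linalg.qr(A)
    return Q @ Q.T

rng = np.random.default_rng(2)
A = rng.normal(size=(8, 5))
P_F = projector(A)          # F = span of all 5 columns
P_E = projector(A[:, :2])   # E = span of the first 2 columns, so E is inside F

print(np.allclose(P_E @ P_F, P_E), np.allclose(P_F @ P_E, P_E))
```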

Hence, projecting the above decomposition on \(V_0^\perp\), we get

\[ P_{V_0^\perp}Y = P_{[X]^\perp} Y + (P_{V_0^\perp} - P_{[X]^\perp})Y \]

Here, \(V_0 \subset [X]\) so that \([X]^\perp \subset V_0^{\perp}\), and we check that

\((P_{V_0^\perp} - P_{[X]^\perp})^2 = (P_{V_0^\perp} - P_{[X]^\perp})= (P_{V_0^\perp} - P_{[X]^\perp})^T\)

so that \(P_{V_0^\perp} - P_{[X]^\perp}\) is an orthogonal projector onto a subspace of dimension \(\operatorname{tr}(P_{V_0^\perp}) - \operatorname{tr}(P_{[X]^\perp}) = n-(p-q)-(n-p) = q\).

By the lemma, \(P_{[X]^\perp}P_{V_0^\perp} = P_{[X]^\perp}\), so \(P_{[X]^\perp}(P_{V_0^\perp} - P_{[X]^\perp}) = P_{[X]^\perp}P_{V_0^\perp}-P_{[X]^\perp}P_{[X]^\perp} = P_{[X]^\perp} - P_{[X]^\perp} = 0\). Hence \(P_{[X]^\perp} Y\) and \((P_{V_0^\perp} - P_{[X]^\perp})Y\) are orthogonal.
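These projector identities can be verified numerically. The sketch below assumes random Gaussian \(X\) and \(R\) and a hypothetical helper `projector`; it checks idempotence, symmetry, the trace, and orthogonality to \(P_{[X]^\perp}\).

```python
import numpy as np

def projector(A):
    """Orthogonal projector onto the column space of A (full column rank)."""
    Q, _ = np.linalg.qr(A)
    return Q @ Q.T

rng = np.random.default_rng(3)
n, p, q = 12, 5, 2
X = rng.normal(size=(n, p))
R = rng.normal(size=(q, p))
_, _, vt = np.linalg.svd(R)
N = vt[q:].T                       # orthonormal basis of Ker(R)

I = np.eye(n)
P_X_perp = I - projector(X)        # projector onto [X]^perp
P_V0_perp = I - projector(X @ N)   # projector onto V_0^perp
D = P_V0_perp - P_X_perp

trace_D = np.trace(D)
print(np.allclose(D @ D, D), np.allclose(D, D.T), trace_D)
print(np.allclose(P_X_perp @ D, 0))
```

The trace of `D` should equal \(q\) up to rounding error.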

Hence, by Cochran's theorem, \(P_{[X]^\perp} Y\) and \((P_{V_0^\perp} - P_{[X]^\perp})Y\) are independent and

  • \(SSR=\|P_{[X]^\perp} Y\|^2 = \|P_{[X]^\perp} \varepsilon\|^2\), so that \(SSR/\sigma^2\) has distribution \(\chi^2(n-p)\)
  • Under \(H_0\), \(X\beta \in V_0\) so that \(SSR_c=\|P_{V_0^\perp}Y\|^2 = \|P_{V_0^\perp}(X\beta + \varepsilon)\|^2 = \|P_{V_0^\perp}\varepsilon\|^2\)
  • Hence, still under \(H_0\), the Pythagorean theorem gives \(SSR_c -SSR = \|(P_{V_0^\perp} - P_{[X]^\perp}) \varepsilon\|^2\), so that \((SSR_c - SSR)/\sigma^2\) has distribution \(\chi^2(q)\)
  • \(SSR\) and \(SSR_c - SSR\) are independent, since they are the squared norms of the projections of \(\varepsilon \sim \mathcal N(0, \sigma^2I_n)\) onto orthogonal subspaces

The factor \(\sigma^2\) cancels in the ratio, so under \(H_0\)

\[F = \frac{(SSR_c - SSR)/q}{SSR/(n-p)} \sim F(q, n-p)\]
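A quick Monte Carlo experiment gives a sanity check of the result. This sketch rests on several assumptions: a random design, a \(\beta\) drawn in \(\text{Ker}(R)\) so that \(H_0\) holds, and an arbitrary noise level \(\sigma = 3\). It compares the empirical mean of the statistic with the known mean \(m/(m-2)\) of an \(F(q, m)\) variable, where \(m = n - p\).

```python
import numpy as np

def projector(A):
    """Orthogonal projector onto the column space of A (full column rank)."""
    Q, _ = np.linalg.qr(A)
    return Q @ Q.T

rng = np.random.default_rng(4)
n, p, q, sigma, reps = 25, 5, 2, 3.0, 20000
X = rng.normal(size=(n, p))
R = rng.normal(size=(q, p))
_, _, vt = np.linalg.svd(R)
N = vt[q:].T                        # basis of Ker(R)
beta = N @ rng.normal(size=p - q)   # a beta satisfying R beta = 0

P_X_perp = np.eye(n) - projector(X)
D = projector(X) - projector(X @ N)  # equals P_{V_0^perp} - P_{[X]^perp}

# Each row of Y is one simulated response vector under H0.
Y = X @ beta + sigma * rng.normal(size=(reps, n))
ssr = np.sum((Y @ P_X_perp) ** 2, axis=1)
diff = np.sum((Y @ D) ** 2, axis=1)  # SSR_c - SSR
F = (n - p) / q * diff / ssr

m = n - p
empirical_mean = F.mean()
theoretical_mean = m / (m - 2)       # mean of an F(q, m) variable, m > 2
print(empirical_mean, theoretical_mean)
```

The two means should be close, and changing `sigma` should not affect the distribution of `F`, in line with the cancellation of \(\sigma^2\) in the ratio.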