F-Test for Linear Constraints
Theorem
If \(\text{rank}(X) = p\), \(\text{rank}(R) = q\), and \(\varepsilon \sim N(0, \sigma^2 I_n)\), then under \(H_0: R\beta = 0\):
\[F = \frac{n-p}{q} \cdot \frac{SSR_c - SSR}{SSR} \sim F(q, n-p)\]
where:
- \(SSR\): sum of squares of residuals in the unconstrained model
- \(SSR_c\): sum of squares of residuals in the constrained model satisfying \(R\beta = 0\)
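As an illustration, the statistic can be computed numerically from the two residual sums of squares. The sketch below is only one possible way to do it, assuming NumPy is available; the function name `f_statistic`, the simulated data and the particular matrix `R` are hypothetical choices made for the example. The constrained fit is obtained by writing \(\beta = N\gamma\), where the columns of \(N\) span \(\text{Ker}(R)\).

```python
import numpy as np

def f_statistic(X, y, R):
    """F statistic for H0: R beta = 0, from the two residual sums of squares."""
    n, p = X.shape
    q = np.linalg.matrix_rank(R)
    # Unconstrained fit: ordinary least squares on the full design matrix.
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    ssr = np.sum((y - X @ beta_hat) ** 2)
    # Constrained fit: beta = N gamma, where the columns of N span Ker(R)
    # (the last p - q right singular vectors of R).
    _, _, vt = np.linalg.svd(R)
    N = vt[q:].T
    gamma_hat, *_ = np.linalg.lstsq(X @ N, y, rcond=None)
    ssr_c = np.sum((y - X @ N @ gamma_hat) ** 2)
    return (n - p) / q * (ssr_c - ssr) / ssr, ssr, ssr_c

# Hypothetical data: intercept plus three regressors, the last two have no effect.
rng = np.random.default_rng(0)
n, p = 50, 4
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
beta = np.array([1.0, 2.0, 0.0, 0.0])
y = X @ beta + rng.normal(size=n)
R = np.array([[0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, 1.0]])   # H0: beta_3 = beta_4 = 0, so q = 2

F, ssr, ssr_c = f_statistic(X, y, R)
print(F, ssr, ssr_c)
```

For this choice of \(R\), the constrained model is simply the regression on the first two columns of \(X\), which gives an independent way to check the value of \(SSR_c\).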
Proof
Write \([X] = \{X\beta : \beta \in \mathbb{R}^p\}\) for the column space of \(X\), and define
- \(V_0 = \{X\beta : R\beta = 0\} = X(\text{Ker}(R))\) (constrained subspace)
Since \(R\) acts on \(\beta \in \mathbb{R}^p\) and \(\text{rank}(R) = q\), the rank-nullity theorem gives \(\dim(\text{Ker}(R)) = p - q\).
Since \(X\) has full column rank, the map \(\beta \mapsto X\beta\) is injective. Therefore: \(\dim(V_0) = \dim(X(\text{Ker}(R))) = \dim(\text{Ker}(R)) = p - q\).
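The dimension count can be checked numerically. The following is a minimal sketch, assuming NumPy and randomly drawn matrices `X` and `R` (hypothetical data, of full rank with probability 1); it builds a basis `N` of \(\text{Ker}(R)\) from the singular value decomposition and computes the rank of `X @ N`, whose columns span \(V_0\).

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, q = 30, 5, 2
X = rng.normal(size=(n, p))   # full column rank with probability 1
R = rng.normal(size=(q, p))   # rank q with probability 1

# Orthonormal basis of Ker(R): the last p - q right singular vectors of R.
_, _, vt = np.linalg.svd(R)
N = vt[q:].T
dim_ker = N.shape[1]

# V_0 = X(Ker(R)) is spanned by the columns of X @ N.
dim_V0 = np.linalg.matrix_rank(X @ N)
print(dim_ker, dim_V0)
```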
We have that
- \(SSR = \|P_{[X]^\perp} Y\|^2\)
- \(SSR_c = \|P_{V_0^\perp}Y\|^2\)
Let us now decompose the constrained residual vector as
\[ P_{V_0^\perp}Y = P_{[X]^\perp} Y + (P_{V_0^\perp} - P_{[X]^\perp})Y \]
The first term is the vector of residuals of the unconstrained model. Let us now analyse the second term.
Here, \(V_0 \subset [X]\) so that \([X]^\perp \subset V_0^{\perp}\), and we check that
\((P_{V_0^\perp} - P_{[X]^\perp})^2 = (P_{V_0^\perp} - P_{[X]^\perp})= (P_{V_0^\perp} - P_{[X]^\perp})^T\)
so that \(P_{V_0^\perp} - P_{[X]^\perp}\) is an orthogonal projector onto a subspace of dimension \(\text{tr}(P_{V_0^\perp}) - \text{tr}(P_{[X]^\perp}) = n-(p-q)-(n-p) = q\).
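Since \([X]^\perp \subset V_0^\perp\), we have \(P_{[X]^\perp}P_{V_0^\perp} = P_{[X]^\perp}\), hence \(P_{[X]^\perp}(P_{V_0^\perp} - P_{[X]^\perp}) = P_{[X]^\perp}P_{V_0^\perp}-P_{[X]^\perp}P_{[X]^\perp} = P_{[X]^\perp} - P_{[X]^\perp} = 0\). It follows that \(P_{[X]^\perp} Y\) and \((P_{V_0^\perp} - P_{[X]^\perp})Y\) are orthogonal.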
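These projector identities are easy to verify on a numerical example. The sketch below assumes NumPy and random full-rank matrices `X` and `R` (hypothetical data); the helper `proj` is an illustrative name, and it builds the orthogonal projector onto a column space from a reduced QR decomposition.

```python
import numpy as np

def proj(A):
    """Orthogonal projector onto the column space of A (A of full column rank)."""
    Q, _ = np.linalg.qr(A)   # reduced QR: columns of Q are an orthonormal basis
    return Q @ Q.T

rng = np.random.default_rng(2)
n, p, q = 30, 5, 2
X = rng.normal(size=(n, p))
R = rng.normal(size=(q, p))
_, _, vt = np.linalg.svd(R)
N = vt[q:].T                         # basis of Ker(R)

I = np.eye(n)
P_X_perp = I - proj(X)               # projector onto [X]^perp
P_V0_perp = I - proj(X @ N)          # projector onto V_0^perp
D = P_V0_perp - P_X_perp             # the difference studied above

print(np.allclose(D @ D, D))         # idempotent
print(np.allclose(D, D.T))           # symmetric
print(np.trace(D))                   # trace, numerically close to q
print(np.allclose(P_X_perp @ D, 0))  # orthogonal to the residual projector
```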
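Hence, by Cochran's theorem, \(P_{[X]^\perp} Y\) and \((P_{V_0^\perp} - P_{[X]^\perp})Y\) are independent and
- \(SSR=\|P_{[X]^\perp} Y\|^2 = \|P_{[X]^\perp} \varepsilon\|^2\), since \(P_{[X]^\perp}X\beta = 0\), so that \(SSR/\sigma^2\) has distribution \(\chi^2(n-p)\)
- Under \(H_0\), \(X\beta \in V_0\) so that \(SSR_c=\|P_{V_0^\perp}Y\|^2 = \|P_{V_0^\perp}(X\beta + \varepsilon)\|^2 = \|P_{V_0^\perp}\varepsilon\|^2\)
- Hence, still under \(H_0\), \(SSR_c -SSR = \|(P_{V_0^\perp} - P_{[X]^\perp}) \varepsilon\|^2\), so that \((SSR_c - SSR)/\sigma^2\) has distribution \(\chi^2(q)\)
- \(SSR\) and \(SSR_c - SSR\) are independent, since they are the squared norms of the projections of \(\varepsilon \sim \mathcal N(0, \sigma^2I_n)\) onto orthogonal subspaces
Therefore, under \(H_0\), the unknown \(\sigma^2\) cancels in the ratio and
\[F = \frac{(SSR_c - SSR)/(\sigma^2 q)}{SSR/(\sigma^2 (n-p))} = \frac{n-p}{q} \cdot \frac{SSR_c - SSR}{SSR} \sim F(q, n-p)\]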
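The distributional claim can be probed by simulation. The following Monte Carlo sketch assumes NumPy; the design, the value `sigma = 3.0`, the number of replications and the helper `proj` are arbitrary illustrative choices. It draws many samples under \(H_0\) and compares the empirical mean of \(F\) with the mean of an \(F(q, n-p)\) distribution, which equals \((n-p)/(n-p-2)\) when \(n - p > 2\). This is only a rough sanity check of the first moment, not a full goodness-of-fit test.

```python
import numpy as np

def proj(A):
    """Orthogonal projector onto the column space of A (A of full column rank)."""
    Q, _ = np.linalg.qr(A)
    return Q @ Q.T

rng = np.random.default_rng(3)
n, p, q = 30, 5, 2
sigma = 3.0                                # arbitrary: F should not depend on it
X = rng.normal(size=(n, p))
R = rng.normal(size=(q, p))
_, _, vt = np.linalg.svd(R)
N = vt[q:].T                               # basis of Ker(R)
beta = N @ rng.normal(size=p - q)          # a beta satisfying H0: R beta = 0

I = np.eye(n)
P_X_perp = I - proj(X)
P_V0_perp = I - proj(X @ N)

reps = 20000
# Each row of Y is one simulated response vector X beta + epsilon.
Y = X @ beta + sigma * rng.normal(size=(reps, n))
# The projectors are symmetric, so the rows of Y @ P are the projected vectors.
ssr = np.sum((Y @ P_X_perp) ** 2, axis=1)
ssr_c = np.sum((Y @ P_V0_perp) ** 2, axis=1)
F = (n - p) / q * (ssr_c - ssr) / ssr

print(F.mean(), (n - p) / (n - p - 2))     # empirical vs theoretical mean
```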