Variational Characterization of Eigenvalues

For a real symmetric matrix {M\in\mathbb{R}^{n\times n}}, the spectral decomposition theorem states that there exist eigenvalues {\lambda_{1},\ldots,\lambda_{n}\in\mathbb{R}} with corresponding eigenvectors {v_{1},\ldots,v_{n}\in\mathbb{R}^{n}} such that {\left\{ v_{1},\ldots,v_{n}\right\} } is an orthonormal basis for {\mathbb{R}^{n}.} In this post, we will prove the spectral theorem for real symmetric matrices and then develop a variational characterization of {\lambda_{1},\ldots,\lambda_{n}.} In our exposition, we will freely use the isomorphism between linear operators on {\mathbb{R}^{n}} and matrices in {\mathbb{R}^{n\times n},} depending on which representation is more convenient.

Lemma 1. Let {M\in\mathbb{R}^{n\times n}} be an {n\times n} matrix. The eigenvalues of {M} are then precisely the roots {\lambda} of the characteristic polynomial {\det(M-\lambda I).}

Proof. By definition, {\lambda} is an eigenvalue if and only if there exists a non-zero vector {v\in\mathbb{R}^{n}} such that {Mv=\lambda v,} i.e., {(M-\lambda I)v=0.} A nonzero {v} satisfying this relation exists if and only if {M-\lambda I} is singular, which holds if and only if {\det(M-\lambda I)=0.} Thus, the eigenvalues of {M} precisely coincide with the roots of the characteristic polynomial {\det(M-\lambda I).} For an {n\times n} matrix, the characteristic polynomial has degree {n}, so {M} has exactly {n} eigenvalues (not necessarily distinct) over the complex numbers.
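We can check Lemma 1 numerically. The sketch below (a NumPy illustration, not part of the proof) builds the characteristic polynomial of a random matrix with np.poly, finds its roots, and compares them against a library eigenvalue solver:

import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4))     # Lemma 1 needs no symmetry

coeffs = np.poly(M)                 # coefficients of det(lambda I - M)
roots = np.roots(coeffs)            # roots of the characteristic polynomial
eigs = np.linalg.eigvals(M)         # eigenvalues from the library solver

# The two multisets agree up to ordering and floating-point round-off.
print(np.allclose(np.sort_complex(roots), np.sort_complex(eigs)))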

Lemma 2. Let {M\in\mathbb{R}^{n\times n}} be a symmetric {n\times n} matrix. Then, all eigenvalues of {M} are real.

Proof. Let {\lambda} be an eigenvalue of {M} with associated eigenvector {v.} A priori, {\lambda} and {v} may be complex, so we work with the complex inner product {\left\langle u,w\right\rangle =\sum_{i}u_{i}\bar{w}_{i}.} Then, we have

\displaystyle \left\langle v,Mv\right\rangle =\left\langle v,\lambda v\right\rangle =\bar{\lambda}\left\langle v,v\right\rangle =\bar{\lambda}\left\Vert v\right\Vert _{2}^{2},

where {\bar{\lambda}} denotes the complex conjugate of {\lambda.} Next, since {M} is real and symmetric, and thus Hermitian, we have

\displaystyle \left\langle v,Mv\right\rangle =\left\langle Mv,v\right\rangle =\left\langle \lambda v,v\right\rangle =\lambda\left\langle v,v\right\rangle =\lambda\left\Vert v\right\Vert _{2}^{2}.

Thus, {\bar{\lambda}\left\Vert v\right\Vert _{2}^{2}=\lambda\left\Vert v\right\Vert _{2}^{2}} and since {v\ne0,} it follows that {\lambda=\bar{\lambda},} so {\lambda} is real.
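Numerically, a general (non-symmetric) eigenvalue solver returns complex values; Lemma 2 says that on a symmetric input the imaginary parts must vanish. A quick NumPy check (the symmetrization {(A+A^{T})/2} is just a convenient way to produce a random symmetric matrix):

import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 5))
M = (A + A.T) / 2                  # a random real symmetric matrix

eigs = np.linalg.eigvals(M)        # general solver, no symmetry assumed

# By Lemma 2, any imaginary parts are pure floating-point noise.
print(np.max(np.abs(np.imag(eigs))))   # 0.0, or on the order of 1e-16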

Lemma 3. Let {M\in\mathbb{R}^{n\times n}} be symmetric and let {\lambda} be an eigenvalue of {M} with associated eigenvector {v.} Then, for all {u\perp v,} {Mu\perp v.}

Proof. Using the symmetry of {M}, we have

\displaystyle \left\langle Mu,v\right\rangle =\left\langle u,Mv\right\rangle =\lambda\left\langle u,v\right\rangle =0.
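In words: a symmetric {M} maps the orthogonal complement of an eigenvector back into that complement. The following NumPy sketch projects a random vector against an eigenvector and verifies the conclusion:

import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 5))
M = (A + A.T) / 2

w, V = np.linalg.eigh(M)     # symmetric eigensolver
v = V[:, 0]                  # a normalized eigenvector of M

u = rng.standard_normal(5)
u -= (u @ v) * v             # remove the component along v, so u is orthogonal to v

# Lemma 3: Mu remains orthogonal to v.
print(np.isclose((M @ u) @ v, 0.0, atol=1e-12))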

Theorem 4 (Spectral Theorem). Let {M\in\mathbb{R}^{n\times n}} be a symmetric {n\times n} matrix. Then, there exist eigenvalues {\lambda_{1},\ldots,\lambda_{n}} of {M} with corresponding eigenvectors {\{v_{1},\ldots,v_{n}\}} such that {\{v_{1},\ldots,v_{n}\}} constitutes an orthonormal basis for {\mathbb{R}^{n}.}

Proof. It will be easier to work with linear transformations. Let {T:\mathbb{R}^{n}\rightarrow\mathbb{R}^{n}} be the linear transformation that corresponds to {M.} Take {\lambda_{1}} to be an eigenvalue of {T} and let {v_{1}\in\mathbb{R}^{n}} be the associated (normalized) eigenvector; such a pair exists because the characteristic polynomial has a root over {\mathbb{C}} by the fundamental theorem of algebra (Lemma 1), that root is real (Lemma 2), and a real eigenvector can then be found by solving {(M-\lambda_{1}I)v=0} over {\mathbb{R}.} Let {W} be the orthogonal complement of {\mathrm{span}\{v_{1}\}} in {\mathbb{R}^{n}.} By Lemma 3, {T} maps {W} into {W,} so the restriction {\left.T\right|_{W}} is a linear operator on {W;} moreover, it is again symmetric, so the same argument applies to it. Since {\left.T\right|_{W}} is a restriction of {T,} eigenvalues of {\left.T\right|_{W}} are also eigenvalues of {T.} Thus, let {\lambda_{2}} be an eigenvalue of {\left.T\right|_{W}} with associated (normalized) eigenvector {v_{2}.} By construction {v_{2}\in W,} so {v_{2}\perp v_{1}.} Now, we can repeat this process by defining {W'} to be the orthogonal complement of {\mathrm{span}\left\{ v_{1},v_{2}\right\} } and considering {\left.T\right|_{W'}.} Repeating this process {n} times yields eigenvalues {\lambda_{1},\ldots,\lambda_{n}} with an orthonormal set of eigenvectors {\left\{ v_{1},\ldots,v_{n}\right\} ,} which must necessarily constitute a basis for {\mathbb{R}^{n}.}
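The proof is constructive: find one eigenpair, restrict to the orthogonal complement, and repeat. The sketch below mimics it in NumPy (the helper name spectral_basis is my own; in practice a single call to np.linalg.eigh does all of this at once):

import numpy as np

def spectral_basis(M):
    # Peel off one eigenpair at a time, shrinking the working subspace W
    # exactly as in the proof of Theorem 4.
    n = M.shape[0]
    Q = np.eye(n)                   # orthonormal basis for the current W
    eigvals, eigvecs = [], []
    for _ in range(n):
        B = Q.T @ M @ Q             # the restriction T|_W in the basis Q
        w, U = np.linalg.eigh(B)
        lam, u = w[-1], U[:, -1]    # an eigenpair of T|_W
        v = Q @ u                   # the corresponding eigenvector of M
        eigvals.append(lam)
        eigvecs.append(v)
        Q = Q @ U[:, :-1]           # new W: orthogonal complement of v in W
    return np.array(eigvals), np.column_stack(eigvecs)

rng = np.random.default_rng(3)
A = rng.standard_normal((6, 6))
M = (A + A.T) / 2
w, V = spectral_basis(M)
print(np.allclose(V.T @ V, np.eye(6)))     # {v_1, ..., v_n} is orthonormal
print(np.allclose(M @ V, V @ np.diag(w)))  # each column is an eigenvector of M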

Theorem 5 (Variational Characterization). Let {M\in\mathbb{R}^{n\times n}} be a symmetric {n\times n} matrix with eigenvalues {\lambda_{1}\ge\lambda_{2}\ge\cdots\ge\lambda_{n}.} Then, for all {1\le k\le n,}

\displaystyle \lambda_{k}=\max_{\substack{U\subseteq\mathbb{R}^{n}\\ \dim(U)=k}}\min_{\substack{x\in U\\ \left\Vert x\right\Vert _{2}=1}}x^{T}Mx.

Proof. By the spectral theorem, there exists an orthonormal set of eigenvectors {v_{1},\ldots,v_{n}} corresponding to {\lambda_{1},\ldots,\lambda_{n}.} Given any unit vector {x,} write {x} as a linear combination of {v_{1},\ldots,v_{n}}: {x=\sum_{i}\alpha_{i}v_{i}.} Then,

\displaystyle x^{T}Mx=\left\langle x,Mx\right\rangle =\left\langle \sum_{i=1}^{n}\alpha_{i}v_{i},\sum_{j=1}^{n}\alpha_{j}\lambda_{j}v_{j}\right\rangle =\sum_{i=1}^{n}\alpha_{i}^{2}\lambda_{i},

since {\left\{ v_{1},\ldots,v_{n}\right\} } is orthonormal. We first show that

\displaystyle \lambda_{k}\le\max_{\substack{U\subseteq\mathbb{R}^{n}\\ \dim(U)=k}}\min_{\substack{x\in U\\ \left\Vert x\right\Vert _{2}=1}}x^{T}Mx=\max_{\substack{U\subseteq\mathbb{R}^{n}\\ \dim(U)=k}}\min_{\substack{x\in U\\ \left\Vert x\right\Vert _{2}=1}}\sum_{i=1}^{n}\alpha_{i}^{2}\lambda_{i},

with {x=\sum_{i}\alpha_{i}v_{i}.} To see this, take {U=\mathrm{span}\left\{ v_{1},\ldots,v_{k}\right\} .} By construction, for all {x\in U,} {\alpha_{k+1}=\cdots=\alpha_{n}=0.} Since {\lambda_{1}\ge\cdots\ge\lambda_{k},} to minimize {\sum_{i}\alpha_{i}^{2}\lambda_{i}} subject to the unit-norm constraint {\sum_{i}\alpha_{i}^{2}=1,} we would simply choose {\alpha_{1}=\cdots=\alpha_{k-1}=0} and {\alpha_{k}=1.} In this case, the objective value is precisely {\lambda_{k}.} Since we are maximizing over all subspaces {U} of dimension {k,} the objective is at least {\lambda_{k}.} Next, we show that the objective is at most {\lambda_{k}.} To see this, take any subspace {U} of dimension {k.} Since {\left\{ v_{1},\ldots,v_{k-1}\right\} } spans a subspace of dimension {k-1,} its orthogonal complement has dimension {n-k+1;} because {\dim(U)+(n-k+1)=n+1>n,} the two subspaces intersect nontrivially. Hence there is a unit vector {x\in U} with {x\perp v_{i}} for all {i<k,} i.e., {\alpha_{1}=\cdots=\alpha_{k-1}=0.} Thus, when we minimize over {x\in U,} there is some {x} such that

\displaystyle \sum_{i=1}^{n}\alpha_{i}^{2}\lambda_{i}=\sum_{i=k}^{n}\alpha_{i}^{2}\lambda_{i}\le\lambda_{k}\sum_{i=k}^{n}\alpha_{i}^{2}=\lambda_{k},

and so the inner minimum is at most {\lambda_{k}} for every {k}-dimensional {U;} hence the objective is upper-bounded by {\lambda_{k}.} Combining the two bounds, we conclude that

\displaystyle \lambda_{k}=\max_{\substack{U\subseteq\mathbb{R}^{n}\\ \dim(U)=k}}\min_{\substack{x\in U\\ \left\Vert x\right\Vert _{2}=1}}x^{T}Mx.
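Both halves of the argument can be seen numerically: the subspace {U=\mathrm{span}\left\{ v_{1},\ldots,v_{k}\right\} } attains {\lambda_{k},} while any other {k}-dimensional subspace can only make the inner minimum smaller. In the NumPy sketch below, a subspace is encoded by a matrix with orthonormal columns, so the inner minimum over unit vectors is the least eigenvalue of {U^{T}MU}:

import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((6, 6))
M = (A + A.T) / 2

w, V = np.linalg.eigh(M)       # eigh returns eigenvalues in ascending order
w, V = w[::-1], V[:, ::-1]     # reorder so that lambda_1 >= ... >= lambda_n

# The expansion x^T M x = sum_i alpha_i^2 lambda_i for a unit vector x.
x = rng.standard_normal(6)
x /= np.linalg.norm(x)
alpha = V.T @ x
print(np.isclose(x @ M @ x, (alpha**2) @ w))

k = 3
# Lower bound: U = span{v_1, ..., v_k} attains lambda_k, at x = v_k.
print(np.isclose(V[:, k - 1] @ M @ V[:, k - 1], w[k - 1]))

# Upper bound: for any U with orthonormal columns, the inner minimum over
# unit x in range(U) is the least eigenvalue of U^T M U, which is <= lambda_k.
U = np.linalg.qr(rng.standard_normal((6, k)))[0]
print(np.linalg.eigvalsh(U.T @ M @ U)[0] <= w[k - 1] + 1e-12)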
