# Variational Characterization of Eigenvalues

For a real symmetric matrix in ${\mathbb{R}^{n\times n}}$, the spectral theorem states that there exist eigenvalues ${\lambda_{1},\ldots,\lambda_{n}\in\mathbb{R}}$ with corresponding eigenvectors ${v_{1},\ldots,v_{n}\in\mathbb{R}^{n}}$ such that ${\left\{ v_{1},\ldots,v_{n}\right\} }$ is an orthonormal basis for ${\mathbb{R}^{n}.}$ In this post, we will prove the spectral theorem for real symmetric matrices and then develop a variational characterization of ${\lambda_{1},\ldots,\lambda_{n}.}$ In our exposition, we will freely pass between linear maps on ${\mathbb{R}^{n}}$ and matrices in ${\mathbb{R}^{n\times n},}$ depending on which representation is more convenient.

Lemma 1. Let ${M\in\mathbb{R}^{n\times n}}$ be an ${n\times n}$ matrix. The eigenvalues of ${M}$ are then precisely the roots ${\lambda}$ of the characteristic polynomial ${\det(M-\lambda I).}$

Proof. By definition, ${\lambda}$ is an eigenvalue if and only if there exists a non-zero vector ${v\in\mathbb{R}^{n}}$ such that ${Mv=\lambda v.}$ Thus, ${(M-\lambda I)v=0.}$ There is a nonzero ${v}$ that satisfies this relation if and only if ${\det(M-\lambda I)=0.}$ Thus, the eigenvalues of ${M}$ precisely coincide with the roots of the characteristic polynomial ${\det(M-\lambda I).}$ For an ${n\times n}$ matrix, the characteristic polynomial has degree ${n}$, so ${M}$ has exactly ${n}$ eigenvalues (not necessarily distinct) over the complex numbers.
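Lemma 1 is easy to check numerically. The sketch below (assuming NumPy is available; the example matrix is arbitrary) compares the roots of the characteristic polynomial against the eigenvalues computed directly:

```python
import numpy as np

# A small symmetric example matrix (any square matrix works for Lemma 1).
M = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# np.poly(M) returns the coefficients of det(M - lambda*I); its roots
# should coincide with the eigenvalues of M.
char_roots = np.sort(np.roots(np.poly(M)))
eigs = np.sort(np.linalg.eigvals(M))

assert np.allclose(char_roots, eigs)
```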

Lemma 2. Let ${M\in\mathbb{R}^{n\times n}}$ be a symmetric ${n\times n}$ matrix. Then, all eigenvalues of ${M}$ are real.

Proof. Let ${\lambda}$ be an eigenvalue of ${M}$ with associated eigenvector ${v.}$ A priori, ${\lambda\in\mathbb{C}}$ and ${v\in\mathbb{C}^{n},}$ so we work with the complex inner product ${\left\langle x,y\right\rangle =\sum_{i}x_{i}\bar{y}_{i}.}$ Then, we have

$\displaystyle \left\langle v,Mv\right\rangle =\left\langle v,\lambda v\right\rangle =\bar{\lambda}\left\langle v,v\right\rangle =\bar{\lambda}\left\Vert v\right\Vert _{2}^{2},$

where ${\bar{\lambda}}$ denotes the complex conjugate of ${\lambda.}$ Next, since ${M}$ is real and symmetric, and thus Hermitian, we have

$\displaystyle \left\langle v,Mv\right\rangle =\left\langle Mv,v\right\rangle =\left\langle \lambda v,v\right\rangle =\lambda\left\langle v,v\right\rangle =\lambda\left\Vert v\right\Vert _{2}^{2}.$

Thus, ${\bar{\lambda}\left\Vert v\right\Vert _{2}^{2}=\lambda\left\Vert v\right\Vert _{2}^{2}}$ and since ${v\ne0,}$ it follows that ${\lambda=\bar{\lambda},}$ so ${\lambda}$ is real.
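As a quick numerical sanity check of Lemma 2 (again assuming NumPy; the random seed and matrix size are arbitrary), the eigenvalues of a symmetrized random matrix should have vanishing imaginary parts:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
M = (A + A.T) / 2  # symmetrize: M is real and symmetric

# eigvals makes no symmetry assumption and may in general return
# complex eigenvalues; for symmetric M the imaginary parts vanish
# (up to floating-point noise).
eigs = np.linalg.eigvals(M)
assert np.allclose(eigs.imag, 0.0)
```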

Lemma 3. Let ${M\in\mathbb{R}^{n\times n}}$ be symmetric and let ${\lambda}$ be an eigenvalue of ${M}$ with associated (normalized) eigenvector ${v.}$ Then, for all ${u\perp v,}$ ${Mu\perp v.}$

Proof. We have

$\displaystyle \left\langle Mu,v\right\rangle =\left\langle u,Mv\right\rangle =\lambda\left\langle u,v\right\rangle =0,$

where the first equality uses the symmetry of ${M.}$
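Concretely (a minimal sketch assuming NumPy, with a hand-picked ${2\times2}$ example): for a symmetric matrix with eigenvector ${v,}$ any ${u\perp v}$ stays orthogonal to ${v}$ after applying ${M:}$

```python
import numpy as np

M = np.array([[2.0, 1.0],
              [1.0, 2.0]])
v = np.array([1.0, 1.0]) / np.sqrt(2)   # eigenvector: M v = 3 v
u = np.array([1.0, -1.0]) / np.sqrt(2)  # unit vector with u ⊥ v

# Lemma 3: M u remains orthogonal to v.
assert np.isclose(np.dot(M @ u, v), 0.0)
```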

Theorem 4 (Spectral Theorem). Let ${M\in\mathbb{R}^{n\times n}}$ be a symmetric ${n\times n}$ matrix. Then, there exist eigenvalues ${\lambda_{1},\ldots,\lambda_{n}}$ of ${M}$ with corresponding eigenvectors ${\{v_{1},\ldots,v_{n}\}}$ such that ${\{v_{1},\ldots,v_{n}\}}$ constitutes an orthonormal basis for ${\mathbb{R}^{n}.}$

Proof. It will be easier to work with linear transformations. Let ${T:\mathbb{R}^{n}\rightarrow\mathbb{R}^{n}}$ be the linear transformation that corresponds to ${M.}$ By Lemma 1 and the fundamental theorem of algebra, ${T}$ has an eigenvalue over ${\mathbb{C},}$ which by Lemma 2 is real; take ${\lambda_{1}}$ to be such an eigenvalue and let ${v_{1}\in\mathbb{R}^{n}}$ be an associated (normalized) eigenvector. Let ${W}$ be the orthogonal complement of ${\mathrm{span}\{v_{1}\}}$ in ${\mathbb{R}^{n}.}$ By Lemma 3, ${T}$ maps ${W}$ into ${W,}$ so the restriction ${\left.T\right|_{W}}$ is a linear operator on ${W,}$ and it remains symmetric. Since ${\left.T\right|_{W}}$ agrees with ${T}$ on ${W,}$ eigenvalues of ${\left.T\right|_{W}}$ are also eigenvalues of ${T.}$ Thus, let ${\lambda_{2}}$ be an eigenvalue of ${\left.T\right|_{W}}$ with associated (normalized) eigenvector ${v_{2}.}$ By construction, ${v_{2}\in W,}$ so ${v_{2}\perp v_{1}.}$ Now, we can repeat this process by defining ${W'}$ to be the orthogonal complement of ${\mathrm{span}\left\{ v_{1},v_{2}\right\} }$ and considering ${\left.T\right|_{W'}.}$ Repeating this process ${n}$ times yields eigenvalues ${\lambda_{1},\ldots,\lambda_{n}}$ with an orthonormal set of eigenvectors ${\left\{ v_{1},\ldots,v_{n}\right\} ,}$ which, being ${n}$ orthonormal vectors in ${\mathbb{R}^{n},}$ necessarily constitute a basis for ${\mathbb{R}^{n}.}$
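In NumPy (an assumption, along with the arbitrary seed and matrix size), the spectral theorem is exposed through `np.linalg.eigh`, which for a symmetric matrix returns exactly the data the theorem promises: real eigenvalues and an orthonormal eigenbasis.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))
M = (A + A.T) / 2  # real symmetric matrix

# eigh is specialized to symmetric (Hermitian) matrices: lam holds real
# eigenvalues in ascending order, and the columns of V are an
# orthonormal eigenbasis.
lam, V = np.linalg.eigh(M)

assert np.allclose(V.T @ V, np.eye(4))       # orthonormal basis
assert np.allclose(M @ V, V @ np.diag(lam))  # M v_i = lam_i v_i
```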

Theorem 5 (Variational Characterization). Let ${M\in\mathbb{R}^{n\times n}}$ be a symmetric ${n\times n}$ matrix with eigenvalues ${\lambda_{1}\ge\lambda_{2}\ge\cdots\ge\lambda_{n}.}$ Then, for all ${1\le k\le n,}$

$\displaystyle \lambda_{k}=\max_{\substack{U\subseteq\mathbb{R}^{n}\\ \dim(U)=k } }\min_{\substack{x\in U\\ \left\Vert x\right\Vert _{2}=1 } }x^{T}Mx.$

Proof. By the spectral theorem, there exists an orthonormal basis of eigenvectors ${v_{1},\ldots,v_{n}}$ corresponding to ${\lambda_{1},\ldots,\lambda_{n}.}$ Any ${x\in\mathbb{R}^{n}}$ can then be written as a linear combination of ${v_{1},\ldots,v_{n}}$: ${x=\sum_{i}\alpha_{i}v_{i}.}$ Then,

$\displaystyle x^{T}Mx=\left\langle x,Mx\right\rangle =\left\langle \sum_{i=1}^{n}\alpha_{i}v_{i},\sum_{i=1}^{n}\alpha_{i}\lambda_{i}v_{i}\right\rangle =\sum_{i=1}^{n}\alpha_{i}^{2}\lambda_{i},$

since the ${v_{i}}$ are orthonormal; in particular, ${\left\Vert x\right\Vert _{2}^{2}=\sum_{i}\alpha_{i}^{2},}$ so the constraint ${\left\Vert x\right\Vert _{2}=1}$ becomes ${\sum_{i}\alpha_{i}^{2}=1.}$ We first show that

$\displaystyle \lambda_{k}\le\max_{\substack{U\subseteq\mathbb{R}^{n}\\ \dim(U)=k } }\min_{\substack{x\in U\\ \left\Vert x\right\Vert _{2}=1 } }x^{T}Mx=\max_{\substack{U\subseteq\mathbb{R}^{n}\\ \dim(U)=k } }\min_{\substack{x\in U\\ \left\Vert x\right\Vert _{2}=1 } }\sum_{i=1}^{n}\alpha_{i}^{2}\lambda_{i},$

with ${x=\sum_{i}\alpha_{i}v_{i}.}$ To see this, take ${U=\mathrm{span}\left\{ v_{1},\ldots,v_{k}\right\}.}$ By construction, for all ${x\in U,}$ ${\alpha_{k+1},\ldots,\alpha_{n}=0.}$ Since ${\lambda_{1}\ge\cdots\ge\lambda_{k},}$ to minimize ${\sum_{i}\alpha_{i}^{2}\lambda_{i}}$ subject to ${\sum_{i}\alpha_{i}^{2}=1,}$ we would simply choose ${\alpha_{1},\ldots,\alpha_{k-1}=0}$ and ${\alpha_{k}=1.}$ In this case, the objective value is precisely ${\lambda_{k}.}$ Since we are maximizing over all subspaces ${U}$ of dimension ${k,}$ this particular choice of ${U}$ shows that ${\lambda_{k}}$ is upper-bounded by the objective. Next, we show that ${\lambda_{k}}$ is lower-bounded by the objective. To see this, take any subspace ${U}$ of dimension ${k.}$ Since ${\mathrm{span}\left\{ v_{1},\ldots,v_{k-1}\right\} }$ has dimension ${k-1,}$ its orthogonal complement has dimension ${n-k+1,}$ and since ${k+(n-k+1)>n,}$ it must intersect ${U}$ nontrivially: there is some unit vector ${x\in U}$ such that ${x\perp v_{1},\ldots,v_{k-1}.}$ Thus, when we minimize over ${x\in U,}$ there is some ${x}$ such that

$\displaystyle \sum_{i=1}^{n}\alpha_{i}^{2}\lambda_{i}=\sum_{i=k}^{n}\alpha_{i}^{2}\lambda_{i}\le\lambda_{k}\sum_{i=k}^{n}\alpha_{i}^{2}=\lambda_{k},$

since ${\lambda_{i}\le\lambda_{k}}$ for ${i\ge k}$ and ${\sum_{i=k}^{n}\alpha_{i}^{2}=\left\Vert x\right\Vert _{2}^{2}=1.}$ Hence the minimum over ${x\in U}$ is at most ${\lambda_{k}}$ for every such ${U,}$ and so the objective is upper-bounded by ${\lambda_{k}.}$ Thus, we conclude that

$\displaystyle \lambda_{k}=\max_{\substack{U\subseteq\mathbb{R}^{n}\\ \dim(U)=k } }\min_{\substack{x\in U\\ \left\Vert x\right\Vert _{2}=1 } }x^{T}Mx.$
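The min-max characterization can be sketched numerically (assuming NumPy; the seed and size are arbitrary). For each ${k,}$ the proof's optimal subspace is ${\mathrm{span}\left\{ v_{1},\ldots,v_{k}\right\},}$ and the minimum of ${x^{T}Mx}$ over unit ${x}$ in a subspace with orthonormal basis ${V_{k}}$ is the smallest eigenvalue of the compression ${V_{k}^{T}MV_{k}:}$

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 5))
M = (A + A.T) / 2  # real symmetric matrix

lam, V = np.linalg.eigh(M)       # eigh returns ascending order
lam, V = lam[::-1], V[:, ::-1]   # reorder so lam[0] >= lam[1] >= ...

for k in range(1, 6):
    Vk = V[:, :k]  # orthonormal basis for span{v_1, ..., v_k}
    # min of x^T M x over unit x in span(Vk) equals the smallest
    # eigenvalue of the k x k compression Vk^T M Vk.
    inner_min = np.linalg.eigvalsh(Vk.T @ M @ Vk).min()
    assert np.isclose(inner_min, lam[k - 1])  # equals lambda_k
```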