Substitute Lecture Mike O'Neil
7.3 Constrained Optimization
Recap [Quadratic form] Assume A symmetric.
Examples
x^2+y^2 (PD), x^2-y^2 (ID), x^2 (PSD), -x^2-0.5y^2 (ND)
2x^2+2xy+2y^2 eigenvectors: (1,1)/\sqrt2, (1,-1)/\sqrt2 eigenvalues 3,1
Recap [Change of Variables] to principal axes
Let A be an n\times n symmetric matrix. Then there exists an orthogonal change of variables x=Py that transforms x^TAx into y^TDy with no cross-product terms.
Example What values does x^2-2y^2 take on on the unit circle?
Terminology: Let m=min\{ x^TAx:\|x\|=1\}, M=\max\{ x^TAx:\|x\|=1\}.
Theorem 6 Let A be a symmetric matrix with m and M defined as above.
Then M is the greatest eigenvalue and m is the least eigenvalue of A (as a real number, not in magnitude).
x^TAx=M \Leftarrow x \text{EV corresponding to M}
x^TAx=m \Leftarrow x \text{EV corresponding to m}
ExampleMaximize, minimize 2x^2+2xy+2y^2 eigenvectors: (1,1), (1,-1) eigenvalues 3,1
Proof Diagonalize A=PDP^{-1}, x=Py. P^TP=I. \|x\|=\|y\|. Then x^TAx=y^TDy\le \lambda_1 y^Ty=\lambda_1 x^Tx=\lambda_1. Pick x=Pe_1, achieved. Similarly for m.
Theorem 8 Let A be a symmetric n\times n matrix with D=\mathrm{diag}(\lambda_1,\dots,\lambda_n) in order, with P=(u_1,\dots,u_n) and P^TP=I such that A=PDP^{-1}.
Then the maximum value subject to the constraints x^Tx=1, x^Tu_1=0,\dots,x^Tu_{k-1}=0 is the eigenvalue \lambda_k, attained for x=u_k.
ExampleMaximize, minimize 2x^2+2xy+2y^2 eigenvectors: (1,1), (1,-1) eigenvalues 3,1 subject to x+y=0.
Example [Parks and Rec.] Suppose cost of building x hundred miles of road and improving y acres of parks is
(Contrived--why? Negative road building?)
Utility q(x,y)=xy, indifference curve.
Renormalize to unit circle.
7.4 Singular value decomposition
Seems we can understand symmetric linear maps in terms of their effect on circles along certain axes. What about general (even rectangular) linear maps?
Example Consider
Unit circle to ellipse: (18,6) major (3,-9) minor
What can we say about \|Ax\|? Leads naturally to A^TA, which is symmetric.
Eigenvalues 360, 90, 0.
Eigenvectors v_1=(1/3,2/3,2/3), v_2=(-2/3,-1/3,2/3), v_3=(2/3,-2/3,1/3)
Maximum value of \|Ax\|^2 attained for x=v_1. What value of Ax attains the point? What's the maximum value of \|Ax\|?
Singular Values
Let A be an m\times n matrix.
Definition [Singular value] Square roots of the eigenvalues of A^TA are called singular values of A, written \sigma_i=\sqrt{\lambda_i}. Usually ordered \sigma_1\ge \cdots \ge \sigma_n.
Size A^TA? Why can I take square root? (\|Av_i\| nonnegative)
Example
- Singular values from examples above?
When is the second singular value attained? (orthogonal to v_1)
Notice: Av_1\perp Av_2!
Theorem Suppose \{v_1,\dots,v_n\} is an orthonormal basis of \mathbb R^n consisting of eigenvectors of A^T A, arranged so that the corresponding eigenvalues of A^TA satisfy \lambda_1\ge \cdots\ge \lambda_n. Also suppose that A has r nonzero eigenvalues.
Then:
\{Av_1,\dots,Av_r\} is an OB for \mathrm{Col} A.
\mathrm{rank} A=r.
Proof
Calculate (Av_i)^T(Av_j).
Av_i\ne 0\Leftrightarrow 1\le i\le r.
So Av_1,\dots,Av_n are linearly independent.
They are in \mathrm{Col} A--obviously.
They span \mathrm{Col} A because pick y\in\mathrm{Col} A, expand x in full basis.
Thus \mathrm{rank}A =r.
Remark [Numerics] Rank is hard to get numerically. Row echelon form accumulates roundoff. SVD to the rescue! (Also: effective rank) (But: in general not a simple problem.)
The Singular Value Decomposition
The previous theorem made it possible to capture the column space of A by applying A to the eigenvectors of A^TA. So: natural to project into eigenspace (of A^TA), apply stretches, land in column space.
Theorem [Singular Value Decomposition] Let A be an m\times n matrix with rank r.
Then there exists an m\times n matrix \Sigma which is zero outside an r\times r non-zero diagonal subblock D, where D=\mathrm{diag}(\sigma_1, \dots,\sigma_r).
There also exists an m\times m orthogonal matrix U
There also exists an n\times n orthogonal matrix V
such that
A=U\Sigma V^T
Remark [Uniqueness] Anything that looks like this is called an SVD. \Sigma unique, U, V not.
Definition [Singular vectors]
Columns of U: left singular vectors
Columns of V: right singular vectors
============END OF CLASS=============
Proof
Let u_i=Av_i normalized (i=1,\cdots,r)
- Extend to ONB.
Create orthogonal matrices U and V, sizes match.
Check that U\Sigma=AV. (V invertible!)
Example Create the SVD from the example.
Example 2 Find an SVD of \begin{pmatrix} 1 & 1\\ 0 & 1\\ -1 & 1\end{pmatrix}. A^TA=diag(2,3).
Example 3 Find an SVD of \begin{pmatrix} 1 & -1\\-2 & 2 \\ 2 & -2\end{pmatrix}. A^TA=[9,-9;-9;9]. Eigenvalues are 18,0.
Remark [Numerics] Using A^TA amplifies errors. Not used in real-world matrix code.
Applications
Definition [Condition number] Digit loss through base-10 cancellation. \sigma_1/\sigma_n
Remark [Connections of fundamental spaces] Start with:
Col A is spanned by u_i (Thm 9)
(Col A)^\perp=Nul A^T is spanned by u_i (i>r) (Thm 3, 6.1)
Nul A is spanned by v_i (i>r)
Row A=(Nul A)^\perp is spanned by v_i
Then:
A maps Row A to Col A=Row A^T
- A maps Nul A to 0
Nul A^T is left over
Theorem [Invertible Matrix] Let A be an n\times n matrix. All of the following are equivalent:
(\mathrm{Col} A)^\perp=\{0\}
(\mathrm{Nul} A)^\perp=\mathbb{R}^n
Row A=\mathbb{R}^n.
A has n nonzero singular values.
Definition [Reduced SVD] (can chop zeros)
Question: How to invert the SVD?
Definition [Moore-Penrose Pseudoinverse]
Remark [Least squares solutions] Consider an MP \hat x solution. Then A\hat x=U_rU_r^Tb is the orthogonal projection onto the column space of A, and thus a least-squares solution (Thm 10, Sec. 6.3).
