Session 4: Symmetric Positive Definiteness and Cholesky Factorization

Symmetric Positive Definite Matrices

For real symmetric matrices $A \in \mathbb{R}^{n \times n}$, the following statements are equivalent, and any one of them can serve as the definition of a positive definite matrix.

Showing a Positive Quadratic Form

For a vector $\mathbf{x} \in \mathbb{R}^n$ and a matrix $A \in \mathbb{R}^{n \times n}$, the scalar function defined by

$$ f(\mathbf{x}) = \mathbf{x}^T A \mathbf{x} = \sum_{i=0}^{n-1} \sum_{j=0}^{n-1} a_{ij}x_i x_j $$

is called a quadratic form. A quadratic form is said to be positive definite whenever $A$ is a positive definite matrix. That is, the quadratic form above is positive definite if and only if $f(\mathbf{x}) > 0$ for all $\mathbf{0} \neq \mathbf{x} \in \mathbb{R}^n$.

**Note**: Not all symmetric matrices lead to a nonnegative quadratic form!
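For instance, here is a minimal NumPy check (the matrix and vector below are illustrative): the matrix is symmetric, yet its quadratic form takes a negative value.

```python
import numpy as np

# A symmetric matrix that is NOT positive definite (illustrative example).
A = np.array([[1.0, 2.0],
              [2.0, 1.0]])
x = np.array([1.0, -1.0])

# Evaluate the quadratic form f(x) = x^T A x.
f = x @ A @ x
print(f)  # -2.0 < 0, so A is not positive definite
```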

Showing the $LU$ Factorization for SPD Matrices

The $LDU$ Factorization

There's some asymmetry in an $LU$ factorization because the lower factor has $1$s on its diagonal while the upper factor has a nonunit diagonal. This is easily remedied by factoring the diagonal entries out of the upper factor as shown below:

$$ U = \begin{pmatrix} u_{0,0} & u_{0,1} & \cdots & u_{0,n-1} \\ 0 & u_{1,1} & \cdots & u_{1,n-1} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & u_{n-1,n-1} \end{pmatrix} = \begin{pmatrix} u_{0,0} & 0 & \cdots & 0 \\ 0 & u_{1,1} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & u_{n-1,n-1} \end{pmatrix} \begin{pmatrix} 1 & u_{0,1}/u_{0,0} & \cdots & u_{0,n-1}/u_{0,0} \\ 0 & 1 & \cdots & u_{1,n-1}/u_{1,1} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix}. $$




If you know the $LU$ factorization $A = LU$ of an SPD matrix as above, **what Python statements would you write to get $D$ and the new $U$?**
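One possible answer, sketched with NumPy (assuming `L` and `U` have already been computed, here with SciPy's `lu`; the matrix is illustrative):

```python
import numpy as np
from scipy.linalg import lu

# Illustrative SPD matrix.
A = np.array([[4.0, 2.0],
              [2.0, 3.0]])

P, L, U = lu(A)          # for this SPD matrix, no row swaps occur: P = I

d = np.diag(U)           # the pivots u_{k,k}
D = np.diag(d)           # D as a diagonal matrix
U_unit = U / d[:, None]  # divide row k of U by its pivot -> unit diagonal

print(np.allclose(L @ D @ U_unit, A))  # True: A = LDU
```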

If you know that $A$ is symmetric, **what's the relation between $L$ and the new $U$?**

When $A$ is symmetric, the $LDU$ factorization becomes $A = LDL^T$; that is, the new $U$ is exactly $L^T$.

Showing the Eigenvalues of an SPD Matrix

If $A \in \mathbb{R}^{n \times n}$ is an SPD matrix and $(\lambda, \mathbf{x})$ is an eigenpair (i.e., $A\mathbf{x} = \lambda\mathbf{x}$ for $\mathbf{0} \neq \mathbf{x} \in \mathbb{R}^n$), then

$$ \lambda = \frac{\mathbf{x}^T A\mathbf{x}}{\mathbf{x}^T\mathbf{x}} = \frac{\mathbf{x}^T A\mathbf{x}}{||\mathbf{x}||_2^2} > 0 $$

since the quadratic form $\mathbf{x}^T A\mathbf{x} > 0$ and $||\mathbf{x}||_2^2 > 0$. Thus, all the eigenvalues of $A$ are positive.
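As a quick numerical sanity check (a sketch with an illustrative matrix), NumPy's `eigvalsh` computes the eigenvalues of a symmetric matrix:

```python
import numpy as np

# Illustrative SPD matrix.
A = np.array([[4.0, 2.0],
              [2.0, 3.0]])

# eigvalsh returns the (real) eigenvalues of a symmetric matrix.
eigenvalues = np.linalg.eigvalsh(A)
print(eigenvalues)              # all entries are positive
print(np.all(eigenvalues > 0))  # True
```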

Showing the Leading Principal Minor Determinants

The pivots in the $LU$ factorization of any square matrix are intimately related to the leading principal minor determinants. If $A_k$ is the $k$th leading principal submatrix of $A$, then the $k$th pivot is given by

$$ u_{k,k} = \left\{ \begin{array}{ll} \mathrm{det}(A_0) = a_{0,0} & \mathrm{for~}k = 0, \\ \mathrm{det}(A_k)/\mathrm{det}(A_{k-1}) & \mathrm{for~}k = 1, 2, ..., n-1 \end{array}\right. $$

Since $\det(A_k) = u_{0,0} u_{1,1} \cdots u_{k,k}$ is the product of the first $k+1$ pivots, all pivots are positive exactly when all leading principal minor determinants are positive. Consequently, a symmetric matrix is positive definite if and only if each of its leading principal minors is positive.
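A minimal NumPy sketch of this test (illustrative matrix), checking the sign of each leading principal minor:

```python
import numpy as np

# Illustrative SPD matrix.
A = np.array([[4.0, 2.0],
              [2.0, 3.0]])

n = A.shape[0]
# det(A_k) for the leading principal submatrices A_k = A[:k+1, :k+1].
minors = [np.linalg.det(A[:k + 1, :k + 1]) for k in range(n)]
print(minors)                      # [4.0, 8.0]
print(all(m > 0 for m in minors))  # True: A is positive definite
```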

Cholesky Factorization

We have seen above that a symmetric matrix $A$ possessing an $LU$ factorization in which each pivot is positive is said to be positive definite.

More precisely, $A$ is positive definite if and only if $A$ can be uniquely factored as $A = R^T R$, where $R$ is an upper-triangular matrix with positive diagonal entries. This is known as the Cholesky factorization of $A$, and $R$ is called the Cholesky factor of $A$.

As we showed above, if $A$ is symmetric, it has an $LDU$ factorization $A = LDL^T$, in which $D = \mathrm{diag}( u_{0,0}, u_{1,1}, ..., u_{n-1,n-1} )$ is the diagonal matrix containing the pivots $u_{k,k}$. Since $A$ is SPD, we have that $u_{k,k} > 0$ for all $k$. Thus, setting $R = D^{1/2}L^T$ where $D^{1/2} = \mathrm{diag}( \sqrt{u_{0,0}}, \sqrt{u_{1,1}}, ..., \sqrt{u_{n-1,n-1}} )$ yields the desired factorization because

$$ A = L D^{1/2} D^{1/2} L^T = R^T R, $$

and $R$ is upper triangular with positive diagonal entries.
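As a sketch with NumPy (the matrix and its $L$, $D$ factors below are illustrative), we can build $R = D^{1/2}L^T$ and verify it against NumPy's built-in Cholesky routine:

```python
import numpy as np

# Illustrative SPD matrix and its LDL^T factors.
A = np.array([[4.0, 2.0],
              [2.0, 3.0]])
L = np.array([[1.0, 0.0],
              [0.5, 1.0]])
D = np.diag([4.0, 2.0])  # the pivots of A

# R = D^{1/2} L^T is upper triangular with positive diagonal entries.
R = np.sqrt(D) @ L.T

print(np.allclose(R.T @ R, A))                  # True: A = R^T R
# np.linalg.cholesky returns the lower factor, i.e., R^T.
print(np.allclose(R, np.linalg.cholesky(A).T))  # True
```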

We can find the Cholesky factor $R$ as above, but computing the full $LU$ decomposition of $A$ first is wasteful: it ignores the symmetry of $A$ and costs roughly twice as many floating-point operations as necessary. Instead, let's take advantage of the symmetry of $A$ and the upper-triangular structure in $R$!

You need to write the more efficient factorization function in HW4!

Observations for the Cholesky Factorization

Let's unfold the product $A = R^T R$, for a $4$-by-$4$ matrix $A$:

$$ \underbrace{\begin{pmatrix} a_{00} & a_{01} & a_{02} & a_{03} \\ & a_{11} & a_{12} & a_{13} \\ & & a_{22} & a_{23} \\ & & & a_{33} \\ \end{pmatrix}}_{\large A} = \underbrace{\begin{pmatrix} r_{00} & & & \\ r_{01} & r_{11} & & \\ r_{02} & r_{12} & r_{22} & \\ r_{03} & r_{13} & r_{23} & r_{33} \\ \end{pmatrix}}_{\large R^T} ~~ \underbrace{\begin{pmatrix} r_{00} & r_{01} & r_{02} & r_{03} \\ & r_{11} & r_{12} & r_{13} \\ & & r_{22} & r_{23} \\ & & & r_{33} \\ \end{pmatrix}}_{\large R}, $$

where the empty space below the main diagonal of $A$ mirrors its upper triangular part (since $A$ is symmetric), and the empty spaces in $R^T$ and $R$ are filled with zeros.

Now, let's see how we can compute each distinct entry $a_{ij} \in A$:

$$ \begin{align*} a_{00}&=r_{00}r_{00} & a_{11}&=r_{01}r_{01}+r_{11}r_{11} & a_{22}&=r_{02}r_{02}+r_{12}r_{12}+r_{22}r_{22} & a_{33}&=r_{03}r_{03}+r_{13}r_{13}+r_{23}r_{23}+r_{33}r_{33} \\ a_{01}&=r_{00}r_{01} & a_{12}&=r_{01}r_{02}+r_{11}r_{12} & a_{23}&=r_{02}r_{03}+r_{12}r_{13}+r_{22}r_{23} & ~ &~ \\ a_{02}&=r_{00}r_{02} & a_{13}&=r_{01}r_{03}+r_{11}r_{13} & ~ &~ & ~ &~ \\ a_{03}&=r_{00}r_{03} & ~ &~ & ~ &~ & ~ &~ \\ \end{align*} $$

Note that the sums above touch neither the lower triangular portion of $A$ nor the lower triangular portion of $R$.

We need to solve for $r_{ij}$, where $j \geq i$:

$$ \begin{align*} r_{00}&=\sqrt{a_{00}} & r_{11}&=\sqrt{a_{11}-r_{01}r_{01}} & r_{22}&=\sqrt{a_{22}-r_{02}r_{02}-r_{12}r_{12}} & r_{33}&=\sqrt{a_{33}-r_{03}r_{03}-r_{13}r_{13}-r_{23}r_{23}} \\ r_{01}&=a_{01}/r_{00} & r_{12}&=(a_{12}-r_{01}r_{02})/r_{11} & r_{23}&=(a_{23}-r_{02}r_{03}-r_{12}r_{13})/r_{22} & ~ &~ \\ r_{02}&=a_{02}/r_{00} & r_{13}&=(a_{13}-r_{01}r_{03})/r_{11} & ~ &~ & ~ &~ \\ r_{03}&=a_{03}/r_{00} & ~ &~ & ~ &~ & ~ &~ \\ \end{align*} $$

If we start off with $R$ equal to the upper triangular portion of $A$, can you see how we find the $0$th row of $R$? What happens to the (sub)rows below? Can you express that update with Python's slice notation?
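One way to sketch that update pattern with slices (an illustrative sketch only; your HW4 function may need a different interface and further optimizations):

```python
import numpy as np

def cholesky_upper(A):
    """Return upper-triangular R with A = R^T R, assuming A is SPD.

    A minimal sketch of the slice-based update, not the HW4 solution.
    """
    n = A.shape[0]
    R = np.triu(A).astype(float)  # start from the upper triangular part of A
    for k in range(n):
        R[k, k] = np.sqrt(R[k, k])  # pivot: the updated a_{kk} must be positive
        R[k, k+1:] /= R[k, k]       # finish row k of R with one slice
        for i in range(k + 1, n):
            # subtract row k's contribution from each (sub)row below
            R[i, i:] -= R[k, i] * R[k, i:]
    return R

A = np.array([[4.0, 2.0],
              [2.0, 3.0]])
R = cholesky_upper(A)
print(np.allclose(R.T @ R, A))  # True
```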