# Schur Decomposition of Real Symmetric Matrix Using the SVD of its Off-Diagonal Submatrices

This is one of the problems I think summarizes most of my efforts in my studies of a first course in Matrix Analysis and Computations.  I hope this is useful for someone who needs any hint on a similar problem.

# Problem

Let $$\mathbf{A}$$ be a real (possibly non-square) matrix.  Let

$$\mathbf{B} = \begin{pmatrix} \mathbf{0} & \mathbf{A}^T \\ \mathbf{A} & \mathbf{0} \end{pmatrix}$$

Show that $$\mathbf{B}$$ is a real symmetric matrix. Show that the Schur decomposition of $$\mathbf{B}$$ can be written in terms of the SVD of $$\mathbf{A}$$. Hint: You can find a permutation $$\Pi$$ such that

$$\Pi \begin{pmatrix} \mathbf{0} & \Sigma^T \\ \Sigma & \mathbf{0} \end{pmatrix} \Pi^T$$

is a block diagonal matrix with each block of size $$2 \times 2$$ at most.

# Solution

Suppose $$\mathbf{A} \in \mathbb{R}^{m \times n}$$, with $$m,n \geq 1$$, and possibly $$m \neq n$$.  Note that $$\mathbf{B}^T = \mathbf{B}$$, and so $$\mathbf{B}$$ is an $$(m + n) \times (m + n)$$ real  symmetric matrix.  Thus, we are guaranteed to find a Schur decomposition of the form $$\mathbf{B} = \mathbf{Q} \Lambda \mathbf{Q}^T$$, where $$\Lambda$$ is a diagonal real matrix, and $$\mathbf{Q}\mathbf{Q}^T = \mathbf{Q}^T\mathbf{Q} = \mathbf{I}$$.

Now, let $$\mathbf{A} = \mathbf{U} \Sigma \mathbf{V}^T$$ be the SVD of $$\mathbf{A}$$, with $$\mathbf{U}\mathbf{U}^T = \mathbf{U}^T\mathbf{U} = \mathbf{I}$$, $$\mathbf{V}\mathbf{V}^T = \mathbf{V}^T\mathbf{V} = \mathbf{I}$$, and $$\Sigma$$ containing the singular values of $$\mathbf{A}$$, $$\sigma_i$$, $$1 \leq I \leq \min{(m, n)}$$, along the diagonal.  Then, the factorization of $$\mathbf{B}$$ in terms of $$\mathbf{U}$$, $$\mathbf{V}$$, and $$\Sigma$$ is the following:

$$\mathbf{B} = \begin{pmatrix} \mathbf{0} & \mathbf{A}^T \\ \mathbf{A} & \mathbf{0} \end{pmatrix} = \begin{pmatrix} \mathbf{0} & \mathbf{V}\Sigma^T\mathbf{U}^T \\ \mathbf{U}\Sigma\mathbf{V}^T & \mathbf{0} \end{pmatrix} = \begin{pmatrix} \mathbf{V} & \mathbf{0} \\ \mathbf{0} & \mathbf{U} \end{pmatrix} \begin{pmatrix} \mathbf{0} & \Sigma^T \\ \Sigma & \mathbf{0} \end{pmatrix} \begin{pmatrix} \mathbf{V}^T & \mathbf{0} \\ \mathbf{0} & \mathbf{U}^T \end{pmatrix} = \mathbf{Q}_1 \mathbf{C} \mathbf{Q}_1^T$$

where $$\mathbf{Q}_1 = \begin{pmatrix} \mathbf{V} & \mathbf{0} \\ \mathbf{0} & \mathbf{U} \end{pmatrix}$$ and $$\mathbf{C} = \begin{pmatrix} \mathbf{0} & \Sigma^T \\ \Sigma & \mathbf{0} \end{pmatrix}$$.  We can readily verify that $$\mathbf{Q}_1$$ is orthogonal since $$\mathbf{Q}_1\mathbf{Q}_1^T = \mathbf{Q}_1^T\mathbf{Q}_1 = \mathbf{I}$$.

However, $$\mathbf{C}$$ has not the expected diagonal structure.  Instead, the singular values, $$\sigma_i$$, of $$\mathbf{A}$$ are distributed in the following way:

$$\mathbf{C} = \begin{pmatrix} \mathbf{0} & \Sigma^T \\ \Sigma & \mathbf{0} \\ \end{pmatrix} = \left( \begin{array}{ccc|ccc} & & & \sigma_1 & & \\ & \mathbf{0} & & & \sigma_2 & \\ & & & & & \ddots \\ \hline \sigma_1 & & & & & \\ & \sigma_2 & & & \mathbf{0} & \\ & & \ddots & & & \end{array} \right)$$

That is, $$\mathbf{C}$$ contains each of the singular values of $$\mathbf{A}$$ twice, along two equidistant diagonals off its main diagonal.  We are going to leverage this structure to diagonalize $$\mathbf{C}$$.  For this, define $$\Pi$$ as a permutation matrix such that

$$\Pi \mathbf{C} \Pi = \begin{pmatrix} \mathbf{S}_1 & & & & \\ & \mathbf{S}_2 & & & \\ & & \ddots & & \\ & & & \mathbf{S}_l & \\ & & & &\mathbf{0} \\ \end{pmatrix}$$

is a block diagonal matrix where $$\mathbf{S}_j = \begin{pmatrix} 0 & \sigma_j \\ \sigma_j & 0 \end{pmatrix} \in \mathbb{R}^{2\times2}$$, $$l = \min{(m, n)}$$, and $$1 \leq j \leq l$$. Hence, diagonalizing $$\mathbf{C}$$ is equivalent to diagonalizing each of the $$\mathbf{S}_j$$ blocks. For this purpose, notice that $$\mathbf{S}_j$$ has two eigenvalues (found from its characteristic equation $$(\mathbf{S}_j – \lambda\mathbf{I}) = 0$$): $$\lambda = \pm\sigma_j$$. Furthermore, the corresponding eigenvectors are $$\frac{1}{\sqrt{2}}\begin{bmatrix}1 \\ 1\end{bmatrix}$$ and $$\frac{1}{\sqrt{2}}\begin{bmatrix}1 \\ -1\end{bmatrix}$$ for $$\lambda = \sigma_j$$ and $$\lambda = -\sigma_j$$, respectively.  The diagonalization of $$\mathbf{S}_j$$ is then given by:

$$\mathbf{S}_j = \begin{pmatrix} 0 & \sigma_j \\ \sigma_j & 0 \end{pmatrix} = \left(\frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \right) \begin{pmatrix} \sigma_j & 0 \\ 0 & -\sigma_j \end{pmatrix} \left(\frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \right)^T = \tau \Lambda_j \tau^T$$

where $$\tau^T\tau = \tau\tau^T = \mathbf{I}$$.  This result shows that, regardless of the $$\mathbf{S}_j$$ block we deal with, the constant $$2 \times 2$$ transformation $$\tau$$ allows us to find that $$\Lambda_j = \tau^T\mathbf{S}_j\tau$$.

We can now extend the diagonalization of the individual $$\mathbf{S}_j$$ blocks to the diagonalization of $$\Pi\mathbf{C}\Pi^T$$.  Let $$\mathbf{T}$$ be the transformation matrix with $$l$$ blocks of type $$\tau$$ along its diagonal, then:

$$\mathbf{T} = \begin{pmatrix} \tau & & & \\ & \tau & & \\ & & \ddots & \\ & & & \mathbf{I} \\ \end{pmatrix} \Rightarrow \Pi\mathbf{C}\Pi^T = \mathbf{T}\Lambda\mathbf{T}^T \iff \Lambda = \mathbf{T}^T(\Pi\mathbf{C}\Pi^T)\mathbf{T}$$

where

$$\Lambda = \begin{pmatrix} \Lambda_1 & & & & \\ & \Lambda_2 & & & \\ & & \ddots & & \\ & & & \Lambda_l & \\ & & & & \mathbf{0} \\ \end{pmatrix}, \; \; \mathbf{T}\mathbf{T}^T = \mathbf{T}^T\mathbf{T} = \mathbf{I}$$

We have thus shown that $$\mathbf{C}$$ can be diagonalized after rearranging the singular values within.  Therefore,  integrating the above results into the factorization of $$\mathbf{B}$$, we obtain:

\begin{equation*} \begin{aligned} \mathbf{B} &= \mathbf{Q}_1\mathbf{C}\mathbf{Q}_1^T \\ &= \mathbf{Q}_1\Pi^T(\Pi\mathbf{C}\Pi^T)\Pi\mathbf{Q}_1^T \\ &= \mathbf{Q}_1\Pi^T\mathbf{T}(\mathbf{T}^T\Pi\mathbf{C}\Pi^T\mathbf{T})\mathbf{T}^T\Pi\mathbf{Q}_1^T \\ &= \mathbf{Q}_1\Pi^T\mathbf{T}\Lambda\mathbf{T}^T\Pi\mathbf{Q}_1^T \end{aligned} \end{equation*}

$$\Rightarrow \mathbf{B} = \mathbf{Q}\Lambda\mathbf{Q}^T, \; \; \mathbf{Q} = \mathbf{Q}_1\Pi^T\mathbf{T}, \; \; \Lambda = \mathbf{T}^T\Pi\mathbf{C}\Pi^T\mathbf{T}$$

which concludes the demonstration.