Schur Decomposition of Real Symmetric Matrix Using the SVD of its Off-Diagonal Submatrices

This is one of the problems I think summarizes most of my efforts in my studies of a first course in Matrix Analysis and Computations.  I hope this is useful for someone who needs any hint on a similar problem.

Problem

Let \(\mathbf{A}\) be a real (possibly non-square) matrix.  Let 

$$ \mathbf{B} = \begin{pmatrix} \mathbf{0} & \mathbf{A}^T \\ \mathbf{A} & \mathbf{0} \end{pmatrix}$$

Show that \(\mathbf{B}\) is a real symmetric matrix. Show that the Schur decomposition of \(\mathbf{B}\) can be written in terms of the SVD of \(\mathbf{A}\). Hint: You can find a permutation \(\Pi\) such that

$$ \Pi \begin{pmatrix} \mathbf{0} & \Sigma^T \\ \Sigma & \mathbf{0} \end{pmatrix} \Pi^T $$

is a block diagonal matrix with each block of size \(2 \times 2\) at most.

Solution

Suppose \(\mathbf{A} \in \mathbb{R}^{m \times n}\), with \(m,n \geq 1\), and possibly \(m \neq n\).  Note that \(\mathbf{B}^T = \mathbf{B}\), and so \(\mathbf{B}\) is an \((m + n) \times (m + n)\) real  symmetric matrix.  Thus, we are guaranteed to find a Schur decomposition of the form \(\mathbf{B} = \mathbf{Q} \Lambda \mathbf{Q}^T\), where \(\Lambda\) is a diagonal real matrix, and \(\mathbf{Q}\mathbf{Q}^T = \mathbf{Q}^T\mathbf{Q} = \mathbf{I}\).

Now, let \(\mathbf{A} = \mathbf{U} \Sigma \mathbf{V}^T\) be the SVD of \(\mathbf{A}\), with \(\mathbf{U}\mathbf{U}^T = \mathbf{U}^T\mathbf{U} = \mathbf{I}\), \(\mathbf{V}\mathbf{V}^T = \mathbf{V}^T\mathbf{V} = \mathbf{I}\), and \(\Sigma\) containing the singular values of \(\mathbf{A}\), \(\sigma_i\), \(1 \leq I \leq \min{(m, n)}\), along the diagonal.  Then, the factorization of \(\mathbf{B}\) in terms of \(\mathbf{U}\), \(\mathbf{V}\), and \(\Sigma\) is the following:

$$\mathbf{B} = \begin{pmatrix} \mathbf{0} & \mathbf{A}^T \\ \mathbf{A} & \mathbf{0} \end{pmatrix} = \begin{pmatrix} \mathbf{0} & \mathbf{V}\Sigma^T\mathbf{U}^T \\ \mathbf{U}\Sigma\mathbf{V}^T & \mathbf{0} \end{pmatrix} = \begin{pmatrix} \mathbf{V} & \mathbf{0} \\ \mathbf{0} & \mathbf{U} \end{pmatrix} \begin{pmatrix} \mathbf{0} & \Sigma^T \\ \Sigma & \mathbf{0} \end{pmatrix} \begin{pmatrix} \mathbf{V}^T & \mathbf{0} \\ \mathbf{0} & \mathbf{U}^T \end{pmatrix} = \mathbf{Q}_1 \mathbf{C} \mathbf{Q}_1^T$$

where \(\mathbf{Q}_1 = \begin{pmatrix} \mathbf{V} & \mathbf{0} \\ \mathbf{0} & \mathbf{U} \end{pmatrix}\) and \(\mathbf{C} = \begin{pmatrix} \mathbf{0} & \Sigma^T \\ \Sigma & \mathbf{0} \end{pmatrix}\).  We can readily verify that \(\mathbf{Q}_1\) is orthogonal since \(\mathbf{Q}_1\mathbf{Q}_1^T = \mathbf{Q}_1^T\mathbf{Q}_1 = \mathbf{I}\).

However, \(\mathbf{C}\) has not the expected diagonal structure.  Instead, the singular values, \(\sigma_i\), of \(\mathbf{A}\) are distributed in the following way:

$$\mathbf{C} = \begin{pmatrix} \mathbf{0} & \Sigma^T \\ \Sigma & \mathbf{0} \\ \end{pmatrix} = \left( \begin{array}{ccc|ccc} & & & \sigma_1 & & \\ & \mathbf{0} & & & \sigma_2 & \\ & & & & & \ddots \\ \hline \sigma_1 & & & & & \\ & \sigma_2 & & & \mathbf{0} & \\ & & \ddots & & & \end{array} \right)$$

That is, \(\mathbf{C}\) contains each of the singular values of \(\mathbf{A}\) twice, along two equidistant diagonals off its main diagonal.  We are going to leverage this structure to diagonalize \(\mathbf{C}\).  For this, define \(\Pi\) as a permutation matrix such that

$$\Pi \mathbf{C} \Pi = \begin{pmatrix} \mathbf{S}_1 & & & & \\ & \mathbf{S}_2 & & & \\ & & \ddots & & \\ & & & \mathbf{S}_l & \\ & & & &\mathbf{0} \\ \end{pmatrix}$$

is a block diagonal matrix where \(\mathbf{S}_j = \begin{pmatrix} 0 & \sigma_j \\ \sigma_j & 0 \end{pmatrix} \in \mathbb{R}^{2\times2}\), \(l = \min{(m, n)}\), and \(1 \leq j \leq l\). Hence, diagonalizing \(\mathbf{C}\) is equivalent to diagonalizing each of the \(\mathbf{S}_j\) blocks. For this purpose, notice that \(\mathbf{S}_j\) has two eigenvalues (found from its characteristic equation \((\mathbf{S}_j – \lambda\mathbf{I}) = 0\)): \(\lambda = \pm\sigma_j\). Furthermore, the corresponding eigenvectors are \(\frac{1}{\sqrt{2}}\begin{bmatrix}1 \\ 1\end{bmatrix}\) and \(\frac{1}{\sqrt{2}}\begin{bmatrix}1 \\ -1\end{bmatrix}\) for \(\lambda = \sigma_j\) and \(\lambda = -\sigma_j\), respectively.  The diagonalization of \(\mathbf{S}_j\) is then given by:

$$\mathbf{S}_j = \begin{pmatrix} 0 & \sigma_j \\ \sigma_j & 0 \end{pmatrix} = \left(\frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \right) \begin{pmatrix} \sigma_j & 0 \\ 0 & -\sigma_j \end{pmatrix} \left(\frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \right)^T = \tau \Lambda_j \tau^T$$

where \(\tau^T\tau = \tau\tau^T = \mathbf{I}\).  This result shows that, regardless of the \(\mathbf{S}_j\) block we deal with, the constant \(2 \times 2\) transformation \(\tau\) allows us to find that \(\Lambda_j = \tau^T\mathbf{S}_j\tau\).

We can now extend the diagonalization of the individual \(\mathbf{S}_j\) blocks to the diagonalization of \(\Pi\mathbf{C}\Pi^T\).  Let \(\mathbf{T}\) be the transformation matrix with \(l\) blocks of type \(\tau\) along its diagonal, then:

$$\mathbf{T} = \begin{pmatrix} \tau & & & \\ & \tau & & \\ & & \ddots & \\ & & & \mathbf{I} \\  \end{pmatrix} \Rightarrow \Pi\mathbf{C}\Pi^T = \mathbf{T}\Lambda\mathbf{T}^T \iff \Lambda = \mathbf{T}^T(\Pi\mathbf{C}\Pi^T)\mathbf{T}$$

where

$$\Lambda = \begin{pmatrix} \Lambda_1 & & & & \\ & \Lambda_2 & & & \\ & & \ddots & & \\ & & & \Lambda_l & \\ & & & & \mathbf{0} \\ \end{pmatrix}, \; \; \mathbf{T}\mathbf{T}^T = \mathbf{T}^T\mathbf{T} = \mathbf{I} $$

We have thus shown that \(\mathbf{C}\) can be diagonalized after rearranging the singular values within.  Therefore,  integrating the above results into the factorization of \(\mathbf{B}\), we obtain:

$$ \begin{equation*} \begin{aligned} \mathbf{B} &= \mathbf{Q}_1\mathbf{C}\mathbf{Q}_1^T \\ &= \mathbf{Q}_1\Pi^T(\Pi\mathbf{C}\Pi^T)\Pi\mathbf{Q}_1^T \\ &= \mathbf{Q}_1\Pi^T\mathbf{T}(\mathbf{T}^T\Pi\mathbf{C}\Pi^T\mathbf{T})\mathbf{T}^T\Pi\mathbf{Q}_1^T \\ &= \mathbf{Q}_1\Pi^T\mathbf{T}\Lambda\mathbf{T}^T\Pi\mathbf{Q}_1^T \end{aligned} \end{equation*}$$

$$\Rightarrow \mathbf{B} = \mathbf{Q}\Lambda\mathbf{Q}^T, \; \; \mathbf{Q} = \mathbf{Q}_1\Pi^T\mathbf{T}, \; \; \Lambda = \mathbf{T}^T\Pi\mathbf{C}\Pi^T\mathbf{T}$$

which concludes the demonstration.