This article presents various methods for calculating eigendecompositions: the Power, Inverse, “Pure” QR, QR with Shifts, and Jacobi methods, all implemented in the D programming language. Previous articles have discussed different matrix factorization techniques; these factorizations transform the matrix into a more useful form that allows us to solve various problems. Eigendecomposition is no different: we transform a symmetric matrix \(A \in \mathbb{R}^{n \times n}\) into a diagonal matrix \(\Lambda\) by carrying out transformations on \(A\). The methods we present mostly rely on inducing zeros in the matrix by rotations or reflections. They use Givens rotations and Householder reflections (which we implemented in previous articles), and, as some of the method names suggest, we also use the QR decomposition to carry out some of the transformations (which are themselves often rotations and reflections). These transformations are accumulated as they are applied to the matrix: the accumulated transformations form the eigenvectors in our decomposition, while the diagonal of the transformed matrix holds the eigenvalues. The eigenvalue diagonal matrix \(\Lambda\) is therefore a transformed version of the initial matrix \(A\). This means that if we apply a function to the eigenvalues and carry out a reverse transformation using the eigenvector matrix, we end up with a matrix altered in such a way that it is as if we had applied the function directly to the initial matrix itself. For example, a matrix inverse can be obtained by reverse transforming inverted eigenvalues.
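Concretely, if \(X\) is the matrix of accumulated transformations (the eigenvectors) and \(\Lambda\) the diagonal eigenvalue matrix, then for a function \(f\) applied to the eigenvalues:
$$ f(A) = X f(\Lambda) X^{T} $$
so taking \(f(\lambda) = \lambda^{-1}\) gives the inverse, \(A^{-1} = X \Lambda^{-1} X^{T}\), since \(\left(X \Lambda X^{T}\right)\left(X \Lambda^{-1} X^{T}\right) = X \Lambda \Lambda^{-1} X^{T} = I\).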
The following references were useful as sources of algorithms and theory for this article:
Numerical Linear Algebra by Lloyd N. Trefethen and David Bau III.
Handbook of Linear Algebra, 2nd Edition, edited by Leslie Hogben.
As usual, the code presented here is for study purposes only and should not be used in production systems. In addition, the methods presented here are not suitable for eigendecomposition of large matrices and, as indicated in the introduction, are appropriate only for square symmetric, full rank matrices with real eigenvalues.
Let \(A \in \mathbb{R}^{n \times n}\) be a nonsingular symmetric matrix. A non-zero unit vector \(x \in \mathbb{R}^{n}\) is an eigenvector with corresponding eigenvalue \(\lambda\) if:
$$ x^{T} A x = \lambda $$
Collecting \(x_{i}^{T} A x_{i} = \lambda_{i}\) for \(i = 1, \ldots, n\) into matrix form gives:
$$ X^{T} A X = \Lambda $$
\(X\) is a unitary matrix, \(X^{T} X = X X^{T} = I\), so \(X^{T} = X^{-1}\), and therefore:
$$ A = X \Lambda X^{T} $$
where \(X\) is the eigenvector matrix and \(\Lambda\) is the diagonal eigenvalue matrix. We can also write the relation between an eigenvalue and its eigenvector as \(A x = \lambda x\). The characteristic polynomial of degree \(n\) is given by:
$$ p_{A}(z) = \text{det}(z I - A) $$
The roots of this polynomial, \(\text{det}(\lambda I - A) = 0\), give the eigenvalues.
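For example, for a small symmetric matrix:
$$ M = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}, \qquad \text{det}(\lambda I - M) = (\lambda - 2)^{2} - 1 = (\lambda - 1)(\lambda - 3) = 0 $$
giving eigenvalues \(\lambda = 3\) and \(\lambda = 1\).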
The Power iteration is based on the fact that the sequence:
$$ \frac{x}{\lVert x\rVert}, \frac{Ax}{\lVert Ax\rVert}, \frac{A^{2}x}{\lVert A^{2}x\rVert}, \frac{A^{3}x}{\lVert A^{3}x\rVert}, \ldots $$
(where \(\lVert x\rVert\) is the Euclidean norm of \(x\)) converges to the eigenvector corresponding to the eigenvalue of \(A\) with the largest absolute value. The algorithm for the Power iteration is given by:
\(v^{(0)} = \text{ some vector with } \lVert v^{(0)} \rVert = 1\)
\(\text{for } k = 1, 2, \ldots\)
\(\qquad w = A v^{(k - 1)}\)
\(\qquad v^{(k)} = w/\lVert w \rVert\)
\(\qquad \lambda^{(k)} = (v^{(k)})^{T} A (v^{(k)})\)
The initial \(v\) is chosen at random and then normalized.
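To see why the iteration converges, expand \(v^{(0)} = \sum_{i} a_{i} q_{i}\) in the eigenvector basis \(q_{1}, \ldots, q_{n}\), ordered so that \(|\lambda_{1}| > |\lambda_{2}| \geq \ldots \geq |\lambda_{n}|\):
$$ A^{k} v^{(0)} = \sum_{i = 1}^{n} a_{i} \lambda_{i}^{k} q_{i} = \lambda_{1}^{k} \left(a_{1} q_{1} + \sum_{i = 2}^{n} a_{i} \left(\frac{\lambda_{i}}{\lambda_{1}}\right)^{k} q_{i}\right) $$
Since \(|\lambda_{i}/\lambda_{1}| < 1\) for \(i > 1\), the trailing terms vanish as \(k\) grows, and the normalized iterate converges to \(q_{1}\) at a rate governed by \(|\lambda_{2}/\lambda_{1}|\).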
The inverse iteration is based on the fact that \((A - \mu I)^{-1}\) has the same eigenvectors as \(A\), with eigenvalues \((\lambda_{j} - \mu)^{-1}\), so carrying out a power iteration on \((A - \mu I)^{-1}\) converges to the eigenvalue of \(A\) closest to \(\mu\). The Rayleigh (quotient) iteration assigns the current estimate of the eigenvalue to \(\mu\) at each step, which increases the convergence rate. The algorithm is given by:
\(v^{(0)} = \text{ some vector with } \lVert v^{(0)} \rVert = 1\)
\(\lambda^{(0)} = (v^{(0)})^{T} A (v^{(0)}) \text{ = corresponding Rayleigh quotient}\)
\(\text{for } k = 1, 2, \ldots\)
\(\qquad \text{Solve } (A - \lambda^{(k - 1)} I)w = v^{(k - 1)} \text{ for } w \)
\(\qquad v^{(k)} = w/\lVert w \rVert\)
\(\qquad \lambda^{(k)} = (v^{(k)})^{T} A (v^{(k)})\)
So far we have discussed how to obtain only the largest eigenvalue of a matrix. The remaining methods obtain all the eigenvalues and eigenvectors of a square symmetric matrix. The QR algorithm for eigendecomposition is given by:
\(A^{(0)} = A\)
\(\text{for } k = 1, 2, \ldots\)
\(\qquad Q^{(k)}R^{(k)} = A^{(k - 1)} \qquad \text{QR factorization of } A^{(k - 1)}\)
\(\qquad A^{(k)} = R^{(k)}Q^{(k)}\)
and as \(k\) increases, \(A^{(k)} \rightarrow \Lambda\), and accumulating the \(Q^{(k)}\) matrices over the whole iteration gives the matrix of eigenvectors \(X = Q^{(1)} Q^{(2)} \ldots Q^{(k)} \ldots\). The algorithm is sometimes referred to as the “Pure QR” algorithm or QR without shifts.
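Each iteration is a similarity transformation of \(A^{(k - 1)}\), and therefore preserves the eigenvalues:
$$ A^{(k - 1)} = Q^{(k)} R^{(k)} \implies R^{(k)} = (Q^{(k)})^{T} A^{(k - 1)} \implies A^{(k)} = R^{(k)} Q^{(k)} = (Q^{(k)})^{T} A^{(k - 1)} Q^{(k)} $$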
A tridiagonal matrix is of the form:
$$
T =
\begin{pmatrix}
a_{1} & b_{1} & \quad & \quad & \quad\\
c_{1} & a_{2} & b_{2} & \quad & \quad\\
\quad & c_{2} & \ddots & \ddots & \quad\\
\quad & \quad & \ddots & \ddots & b_{n - 1}\\
\quad & \quad & \quad & c_{n - 1} & a_{n}\\
\end{pmatrix}
$$
A symmetric matrix \(A \in \mathbb{R}^{n \times n}\) can be tridiagonalized as \(A = Q^{T} T Q\), where \(Q\) is a unitary matrix.
Beginning the eigendecomposition process with a tridiagonalized matrix accelerates convergence, even when the cost of the tridiagonalization is taken into account. Tridiagonalization of a matrix can be done using Householder reflections, and the algorithm is given below:
\(\text{for } k = 1 \ldots m - 2\)
\(\qquad x = A_{(k + 1):m, k}\)
\(\qquad v_{k} = \text{sign}(x_{1})\lVert x \rVert_{2} e_{1} + x\)
\(\qquad v_{k} = v_{k}/\lVert v_{k} \rVert_{2}\)
\(\qquad A_{(k + 1):m, k:m} = A_{(k + 1):m, k:m} - 2 v_{k}(v_{k}^{T} A_{(k + 1):m, k:m})\)
\(\qquad A_{1:m, (k + 1):m} = A_{1:m, (k + 1):m} - 2 (A_{1:m, (k + 1):m} v_{k}) v_{k}^{T}\)
where \(x = A_{(k + 1):m, k}\) is the part of the \(k\)th column below the diagonal, \(x_{1}\) is its first element, and \(e_{1} = (1, 0, \ldots, 0)^{T}\) is the unit vector for the first component of \(x\).
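Each of the two updates applies a Householder reflector, which in full matrix form is:
$$ P_{k} = I - 2 v_{k} v_{k}^{T} $$
Since \(v_{k}\) is normalized, \(P_{k}\) is orthogonal and symmetric; embedding it in the identity and applying it from both sides of \(A\) is a similarity transformation, so the eigenvalues are preserved.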
The QR algorithm with shifts brings the same idea as the Rayleigh iteration to the Pure QR algorithm and is given below:
\(A^{(0)} = (Q^{(0)})^{T} A Q^{(0)} \qquad A^{(0)}\text{ is a tridiagonalization of } A\)
\(\text{for } k = 1, 2 \ldots\)
\(\qquad \text{Pick a shift } \mu^{(k)}\)
\(\qquad Q^{(k)} R^{(k)} = A^{(k - 1)} - \mu^{(k)} I \qquad \text{QR factorization of } A^{(k - 1)} - \mu^{(k)}I\)
\(\qquad A^{(k)} = R^{(k)}Q^{(k)} + \mu^{(k)}I\)
\(\qquad \text{If } A_{n, n - 1} \text{ is sufficiently close to zero, set } A_{n - 1, n} = A_{n, n - 1} = 0 \text{ and recurse on } A_{1:(n - 1), 1:(n - 1)}\)
There are different suggestions for how to choose \(\mu^{(k)}\): one is \(\mu^{(k)} = A_{n, n}^{(k - 1)}\) (Trefethen and Bau), another is Wilkinson’s shift:
\(\mu = t_{n, n} - \frac{t_{n, n - 1}^{2}}{\tau + \text{sign}(\tau)\sqrt{\tau^{2} + t_{n, n - 1}^{2}}}, \qquad \tau = \frac{t_{n - 1, n - 1} - t_{n, n}}{2}\)
(p. 55-10, Handbook of Linear Algebra, 2nd Edition, edited by Leslie Hogben). However, the effect of the algorithm is to diagonalize \(A^{(k)}\), so if \(\mu^{(k)}\) evaluates to \(A_{n, n}^{(k - 1)}\), which either of these choices allows, then \(A^{(k - 1)} - \mu^{(k)} I\) can become rank deficient once \(A^{(k - 1)}\) is nearly (but not numerically) diagonal, and so cannot be factorized using a standard QR decomposition. In our implementation we shall therefore use \(\mu^{(k)} = 0.8 A_{n, n}^{(k - 1)}\) as a pragmatic compromise, though this is not being recommended as standard practice. Again, the \(Q^{(k)}\) matrices are accumulated as before and become the eigenvectors at convergence.
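For reference, here is a minimal sketch of Wilkinson’s shift as a standalone function; wilkinsonShift is a hypothetical helper that is not used in the implementation below, and its arguments are the three trailing entries of the tridiagonal matrix:
//Sketch of Wilkinson's shift (hypothetical helper, not used in the code below)
//tn1n1 = T[n - 2, n - 2], tnn = T[n - 1, n - 1], tnn1 = T[n - 1, n - 2]
T wilkinsonShift(T)(T tn1n1, T tnn, T tnn1)
if(isFloatingPoint!(T))
{
    import std.math: sqrt;
    T tau = (tn1n1 - tnn)/2;
    //Take sign(tau) = 1 at tau == 0 so that the denominator cannot vanish
    T s = tau >= 0 ? 1 : -1;
    return tnn - (tnn1*tnn1)/(tau + s*sqrt(tau*tau + tnn1*tnn1));
}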
The Jacobi eigendecomposition method multiplies the matrix \(A\) by appropriate Jacobi (Givens) rotation matrices until \(A^{(k)}\) becomes diagonal, while, as before, accumulating the rotation matrices, which become the eigenvectors upon convergence. The Jacobi rotation matrix is the same as the Givens rotation matrix:
$$
J(p, q, \theta) =
{
\begin{pmatrix}
1 & \ldots & 0 & \ldots & 0 & \ldots & 0 \\
\vdots & \ddots & \vdots & \quad & \vdots & \quad & \vdots \\
0 & \ldots & c & \ldots & s & \ldots & 0 \\
\vdots & \quad & \vdots & \ddots & \vdots & \quad & \vdots \\
0 & \ldots & -s & \ldots & c & \ldots & 0 \\
\vdots & \quad & \vdots & \quad & \vdots & \ddots & \vdots \\
0 & \ldots & 0 & \ldots & 0 & \ldots & 1 \\
\end{pmatrix}
\begin{matrix}
\quad \\
\quad \\
p \\
\quad \\
q \\
\quad \\
\quad \\
\end{matrix}
\atop
\begin{matrix}
& \quad & p & \quad & \quad q & \quad & \quad \\
\end{matrix}
}
$$
The calculation of the cosine \(c\) and sine \(s\) values is given by:
\(\text{function } [c, s] = \text{symSchur2}(A, p, q)\)
\(\quad \text{if } A_{p, q} \neq 0 \)
\(\quad \quad \tau = (A_{q, q} - A_{p, p})/(2 A_{p, q}) \)
\(\quad \quad \text{if } \tau \geq 0 \)
\(\quad \quad \quad t = 1/(\tau + \sqrt{1 + \tau^{2}}) \)
\(\quad \quad \text{else} \)
\(\quad \quad \quad t = 1/(\tau - \sqrt{1 + \tau^{2}}) \)
\(\quad \quad \text{end} \)
\(\quad \quad c = 1/\sqrt{1 + t^{2}},\quad s = t c \)
\(\quad \text{else} \)
\(\quad \quad c = 1,\quad s = 0 \)
\(\quad \text{end} \)
The Frobenius norm of a matrix is given by:
\(
\lVert A \rVert_{F} = \sqrt{\sum_{i = 1}^{n}\sum_{j = 1}^{n}{a_{i, j}^{2}}}
\)
We can also define \(\text{off}(A)\) to be the same quantity but with the diagonal elements excluded:
\(
\text{off}(A) = \sqrt{\sum_{i = 1}^{n}\sum_{\substack{j = 1 \\ j \neq i}}^{n}{a_{i, j}^{2}}}
\)
The Jacobi eigenvalue decomposition is given by:
\(X^{(0)} = I_{n},\ \delta = \text{tol} \cdot \lVert A \rVert_{F}\)
\(\text{while off}(A^{(k)}) > \delta\)
\(\quad \text{Choose } (p, q) \text{ so that } |a_{p, q}| = \max_{i \neq j} |a_{i, j}|\)
\(\quad [c, s] = \text{symSchur2}(A, p, q)\)
\(\quad A^{(k)} = J(p, q, \theta)^{T} A^{(k - 1)} J(p, q, \theta)\)
\(\quad X^{(k)} = X^{(k - 1)}J(p, q, \theta)\)
\(\text{end}\)
\(A^{(k)}\) eventually becomes the eigenvalue matrix and \(X^{(k)}\) the eigenvector matrix. However, it is computationally expensive to search the whole matrix for the largest off-diagonal element on every iteration, so we also have the Cyclic Jacobi algorithm, which simply sweeps through all the off-diagonal positions in order:
\(X^{(0)} = I_{n},\ \delta = \text{tol} \cdot \lVert A \rVert_{F}\)
\(\text{while off}(A^{(k)}) > \delta\)
\(\quad \text{for } p = 1 \ldots n - 1 \)
\(\quad\quad \text{for } q = p + 1 \ldots n \)
\(\quad\quad\quad [c, s] = \text{symSchur2}(A, p, q)\)
\(\quad\quad\quad A^{(k)} = J(p, q, \theta)^{T} A^{(k - 1)} J(p, q, \theta)\)
\(\quad\quad\quad X^{(k)} = X^{(k - 1)}J(p, q, \theta)\)
\(\quad\quad \text{end}\)
\(\quad \text{end}\)
\(\text{end}\)
Firstly, our dispatch enumeration:
enum EigenMethod {Power, Rayleigh, PureQR, QRWithShifts, ClassicJacobi, CyclicJacobi}
alias Power = EigenMethod.Power;
alias Rayleigh = EigenMethod.Rayleigh;
alias PureQR = EigenMethod.PureQR;
alias QRWithShifts = EigenMethod.QRWithShifts;
alias ClassicJacobi = EigenMethod.ClassicJacobi;
alias CyclicJacobi = EigenMethod.CyclicJacobi;
We shall work with two matrices, \(A\):
auto A = matrix([2.326121, 1.711786, 2.502242, 1.715707, 1.417077, 2.319703, 1.851844,
1.711786, 2.219194, 2.163875, 1.932815, 1.467704, 2.551077, 1.899822,
2.502242, 2.163875, 3.473702, 2.084781, 2.135963, 3.192984, 2.467568,
1.715707, 1.932815, 2.084781, 1.998455, 1.626024, 2.465247, 1.820398,
1.417077, 1.467704, 2.135963, 1.626024, 1.753152, 2.243601, 1.686944,
2.319703, 2.551077, 3.192984, 2.465247, 2.243601, 3.717349, 2.735672,
1.851844, 1.899822, 2.467568, 1.820398, 1.686944, 2.735672, 2.392183], 7, 7);
"A:\n".writeln(A);
output:
A:
Matrix(7 x 7)
2.326121 1.711786 2.502242 1.715707 1.417077 2.319703 1.851844
1.711786 2.219194 2.163875 1.932815 1.467704 2.551077 1.899822
2.502242 2.163875 3.473702 2.084781 2.135963 3.192984 2.467568
1.715707 1.932815 2.084781 1.998455 1.626024 2.465247 1.820398
1.417077 1.467704 2.135963 1.626024 1.753152 2.243601 1.686944
2.319703 2.551077 3.192984 2.465247 2.243601 3.717349 2.735672
1.851844 1.899822 2.467568 1.820398 1.686944 2.735672 2.392183
and \(B\):
auto B = matrix([4.501, 0.6122, 2.141, 2.039,
                 0.6122, 2.621, -0.4941, -1.216,
                 2.141, -0.4941, 1.154, -0.159,
                 2.039, -1.216, -0.159, -0.9429], 4, 4);
"B:\n".writeln(B);
output:
B:
Matrix(4 x 4)
4.501 0.6122 2.141 2.039
0.6122 2.621 -0.4941 -1.216
2.141 -0.4941 1.154 -0.159
2.039 -1.216 -0.159 -0.9429
The implementation of the Power method algorithm described above is given by:
import std.algorithm: reduce;
import std.algorithm.comparison: max;
auto eigen(EigenMethod method, T, T eps = max(1E-13, T.epsilon))(Matrix!(T) A)
if(isFloatingPoint!(T) && method == Power)
{
immutable auto n = A.ncol;
assertIsSquare(A);
T[] v = rand!(T)(n);
v = normalize(v);
auto v_old = v.dup;
T ev;
T conv = T.infinity;
long k = 0;
while(conv > eps)
{
auto w = gemv!(Column)(A, v);
v = normalize(w);
/* Convergence is based on eigenvector stability */
conv = map!((a, b) => pow(a - b, 2.0))(v, v_old)
.reduce!((a, b) => a + b)
.sqrt;
v_old = v;
++k;
}
/* We avoid calculating the eigenvalue until the process converges */
ev = mapply!((a, b) => a * b)(v, gemv!(Column)(A, v))
.reduce!((a, b) => a + b);
return ev;
}
When we run matrix \(A\) through this algorithm:
"eigen(A) (Power):\n".writeln(eigen!(Power)(A));
the output is:
eigen(A) (Power):
15.4924
and for \(B\):
"eigen(B) (Power):\n".writeln(eigen!(Power)(B));
the output is:
eigen(B) (Power):
6.00556
The implementation of the Rayleigh method is given by:
auto eigen(EigenMethod method, T, T eps = max(1E-13, T.epsilon))(Matrix!(T) A)
if(isFloatingPoint!(T) && method == Rayleigh)
{
immutable auto n = A.ncol;
assertIsSquare(A);
T[] v = rand!(T)(n);
v = normalize(v);
T conv = T.infinity;
long k = 0;
T ev = mapply!((a, b) => a * b)(v, gemv!(Column)(A, v))
.reduce!((a, b) => a + b);
while(conv > eps)
{
auto ev_old = ev;
auto B = A - ev*eye!(T)(n);
//Here we solve by LU giving us a power iteration (A - muI)^{-1}
auto w = luSolve(B, v);
v = normalize(w);
//Convergence by eigenvalue change
ev = mapply!((a, b) => a * b)(v, gemv!(Column)(A, v))
.reduce!((a, b) => a + b);
conv = abs(ev - ev_old);
++k;
}
return ev;
}
When we run matrix \(A\) through this algorithm:
"eigen(A) (Rayleigh):\n".writeln(eigen!(Rayleigh)(A));
the output is:
eigen(A) (Rayleigh):
15.4924
and for \(B\):
"eigen(B) (Rayleigh):\n".writeln(eigen!(Rayleigh)(B));
the output is:
eigen(B) (Rayleigh):
3.0454
Note that for \(B\) the Rayleigh iteration converges to 3.0454 rather than to the largest eigenvalue, 6.00556: the shifted inverse iteration homes in on the eigenvalue closest to the initial Rayleigh quotient, which depends on the random starting vector, and that is not necessarily the largest eigenvalue.
The implementation of the QR eigendecomposition described above is given by:
auto eigen(EigenMethod method, T, T eps = max(1E-13, T.epsilon))(Matrix!(T) A)
if(isFloatingPoint!(T) && method == PureQR)
{
immutable auto n = A.ncol;
assertIsSquare(A);
auto Ak = A.dup;
auto U = eye!(T)(n);
T conv = T.infinity;
auto v = diag(Ak);
long k = 0;
while(conv > eps)
{
auto v_old = v;
// Carry out a QR decomposition
auto qrObj = Ak.qr!(ModifiedGramSchmidt);
//Update A^{(k)}
Ak = gemm(qrObj.R, qrObj.Q);
//Accumulate the transformation
U = gemm(U, qrObj.Q);
v = diag(Ak);
// Convergence by eigenvalue change norm
conv = map!((a, b) => pow(a - b, 2.0))(v, v_old)
.reduce!((a, b) => a + b)
.sqrt;
++k;
}
auto obj = Eigen!(T)(diag(Ak), U);
auto evals = obj.values;
auto idx = seq!(long)(0, n - 1);
//Sort eigenvectors and values
sort!("a[0] > b[0]")(zip(evals, idx));
auto evects = zeros!(T)(n, n);
for(long i = 0; i < n; ++i)
{
evects[i] = obj.vectors[idx[i]];
}
return Eigen!(T)(evals, evects);
}
The tools for sorting are imported as follows:
import std.range: zip;
import std.algorithm.sorting: sort;
The Eigen struct is given by:
struct Eigen(T)
if(isFloatingPoint!(T))
{
T[] values;
Matrix!(T) vectors;
}
From now on, the eigenvalue and eigenvector outputs will only be presented for this algorithm; the outputs for the remaining methods are effectively the same. Eigenvalues and eigenvectors for \(A\):
"eigen(A) (PureQR):\n".writeln(eigen!(PureQR)(A));
output:
eigen(A) (PureQR):
Eigen!double([15.4924, 0.895303, 0.632493, 0.40729, 0.265876, 0.161864, 0.0249189], Matrix(7 x 7)
0.3414 -0.562 -0.5583 0.06242 0.3344 0.2024 -0.3151
0.3433 0.435 -0.4709 -0.01698 -0.4495 -0.4122 -0.3146
0.4468 -0.5458 0.2371 0.07683 -0.486 -0.2401 0.3828
0.3348 0.3483 -0.2316 0.4309 0.3588 0.02034 0.631
0.3038 0.06922 0.5325 0.5201 0.2517 -0.1997 -0.4957
0.4761 0.2563 0.1986 -0.1966 -0.2241 0.7565 -0.08388
0.3673 0.07289 0.1839 -0.7036 0.4558 -0.3448 0.0649
and the output for B is given by:
"eigen(B) (PureQR):\n".writeln(eigen!(PureQR)(B));
output:
eigen(B) (PureQR):
Eigen!double([6.00556, 3.0454, 0.602398, -2.31966], Matrix(4 x 4)
0.8894 0.1003 0.2496 -0.3697
0.0153 0.9593 -0.02382 0.281
0.3828 -0.1172 -0.8638 0.3059
0.2495 -0.2366 0.437 0.8311
The implementation of the Householder reduction algorithm described earlier is given below:
auto householderReduction(bool inplace = false, T)(ref Matrix!(T) A)
if(isFloatingPoint!(T))
{
immutable auto n = A.ncol;
// Matrices of this size do not require tridiagonalization
if(n < 3)
{
return HouseHolderReduction!(T)(A, eye!(T)(n));
}
enum eps = max(1E-13, T.epsilon);
assertIsSquare(A);
static if(inplace)
{
auto H = A;
}else{
auto H = A.dup;
}
auto Q = eye!(T)(n);
for(long k = 0; k < n - 1; ++k)
{
//Construct the Householder vector
auto vk = H[(k + 1)..$, k];
vk[0, 0] += sign(vk[0, 0]) * norm(vk.array);
vk = matrix(normalize(vk.array), vk.nrow, 1);
//The transformations are accumulated
updateQ(Q, F(vk.array), k + 1);
//Apply the transformation to the matrix
H[(k + 1)..$, k..$] = H[(k + 1)..$, k..$] -
2*outer(vk.array,
gemm!(CblasTrans, CblasNoTrans)(vk, H[(k + 1)..$, k..$]).array);
H[0..$, (k + 1)..$] = H[0..$, (k + 1)..$] -
2*outer(gemm(H[0..$, (k + 1)..$], vk).array,
vk.array);
}
//cleanup
for(long i = 0; i < n; ++i)
{
for(long j = 0; j < n; ++j)
{
if(abs(H[i, j]) < eps)
{
H[i, j] = 0;
}
}
}
return HouseHolderReduction!(T)(H, Q);
}
The HouseHolderReduction struct is given by:
struct HouseHolderReduction(T)
if(isFloatingPoint!(T))
{
//The tridiagonal matrix
Matrix!(T) H;
//The accumulated householder transformations
Matrix!(T) Q;
this(Matrix!(T) H, Matrix!(T) Q)
{
this.H = H;
this.Q = Q;
}
}
Running this algorithm on our \(A\) matrix:
"HouseHolder tridiagonalization:\n".writeln(householderReduction(A));
gives the appropriate tridiagonalized matrix:
HouseHolder tridiagonalization:
HouseHolderReduction!double(Matrix(7 x 7)
2.326 -4.791 0 0 0 0 0
-4.791 13.7 -0.8955 0 0 0 0
0 -0.8955 0.4469 -0.2454 0 0 0
0 0 -0.2454 0.5068 -0.2257 0 0
0 0 0 -0.2257 0.3177 -0.06213 0
0 0 0 0 -0.06213 0.2188 0.08336
0 0 0 0 0 0.08336 0.368
, Matrix(7 x 7)
1 0 0 0 0 0 0
0 -0.3573 0.1401 -0.2435 0.8733 0.1704 -0.04232
0 -0.5223 -0.7678 0.3365 0.006928 0.01998 0.1553
0 -0.3581 -0.03496 -0.7785 -0.377 0.1729 0.3041
0 -0.2958 0.4782 0.4531 -0.1709 0.5891 0.3197
0 -0.4842 0.3927 0.1264 -0.07165 -0.7555 0.1395
0 -0.3865 0.08248 -0.01337 -0.2468 0.1508 -0.8718
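As a quick sanity check, we can reconstruct \(A\) from the reduction. This is a sketch assuming the \(A = Q^{T} T Q\) convention stated earlier; if the transformations are accumulated the other way around, the transpose moves to the right-hand factor:
auto hh = householderReduction(A);
//Reconstruct A as Q^T * H * Q (convention assumed as above)
auto Arec = gemm(gemm!(CblasTrans, CblasNoTrans)(hh.Q, hh.H), hh.Q);
"Reconstructed A:\n".writeln(Arec);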
The implementation of the QR with shifts eigendecomposition described earlier is given by:
//This is an internal workhorse function
private auto _qrwithshifts(T, T eps)(HouseHolderReduction!(T) H)
{
immutable n = H.H.ncol;
if(n == 1)
{
return H;
}else{
auto I = eye!(T)(n);
auto Q = H.Q;
//Tridiagonalization happens here
H = H.H.householderReduction;
H.Q = gemm(H.Q, Q);
//Iterate until the trailing off-diagonal element is negligible (abs guards against negative entries)
while(abs(H.H[$ - 1, $ - 2]) > eps)
{
//Pragmatic shift
T mu = 0.8*H.H[$ - 1, $ - 1];
auto muI = mu*I;
//QR decomposition
auto qrObj = (H.H - muI).qr!(ModifiedGramSchmidt);
H.H = gemm(qrObj.R, qrObj.Q) + muI;
//Accumulate the transformations
H.Q = gemm(H.Q, qrObj.Q);
}
H.H[$ - 2, $ - 1] = 0;
H.H[$ - 1, $ - 2] = 0;
if(n > 2)
{
//Recurse on the rest of the matrix
auto oupper = _qrwithshifts!(T, eps)(
HouseHolderReduction!(T)(H.H[0..($ - 2), 0..($ - 2)], H.Q[0..($ - 2), 0..($ - 2)])
);
H.H[0..($ - 2), 0..($ - 2)] = oupper.H;
H.Q[0..($ - 2), 0..($ - 2)] = oupper.Q;
}
}
return H;
}
auto eigen(EigenMethod method, T, T eps = max(1E-13, T.epsilon))(Matrix!(T) A)
if(isFloatingPoint!(T) && method == QRWithShifts)
{
immutable auto n = A.ncol;
assertIsSquare(A);
auto obj = _qrwithshifts!(T, eps)(HouseHolderReduction!(T)(A, eye!(T)(n)));
auto evals = diag(obj.H);
auto idx = seq!(long)(0, n - 1);
//Sort eigenvectors and values
sort!("a[0] > b[0]")(zip(evals, idx));
auto evects = zeros!(T)(n, n);
for(long i = 0; i < n; ++i)
{
evects[i] = obj.Q[idx[i]];
}
return Eigen!(T)(evals, evects);
}
The implementation of the Frobenius norm is given below; it calculates either the Frobenius norm (frobenius == true) or the \(\text{off}(A)\) function (frobenius == false):
T fnorm(bool frobenius = true, T)(Matrix!(T) A)
{
T result = 0;
for(long i = 0; i < A.ncol; ++i)
{
for(long j = 0; j < A.nrow; ++j)
{
// Frobenius norm or off(A) determined at compile time
static if(frobenius){
result += A[i, j]*A[i, j];
}else{
if(i != j)
result += A[i, j]*A[i, j];
}
}
}
return result.sqrt;
}
// alias for off(A)
auto off(T)(Matrix!(T) A)
{
pragma(inline, true);
return fnorm!(false)(A);
}
The Jacobi rotation matrix object is given by:
struct Jacobi(T)
if(isFloatingPoint!(T))
{
T c;
T s;
size_t p;
size_t q;
this(Matrix!(T) A, size_t p, size_t q)
{
this.p = p;
this.q = q;
if(A[p, q] != 0)
{
T tau = (A[q, q] - A[p, p])/(2*A[p, q]);
T tmp = sqrt(1 + pow(tau, 2.0));
T t;
if(tau >= 0)
{
t = 1/(tau + tmp);
}else{
t = 1/(tau - tmp);
}
this.c = 1/sqrt(1 + pow(t, 2.0));
this.s = t*c;
}else{
this.c = 1;
this.s = 0;
}
}
}
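As a quick numerical check of this logic (a sketch using plain scalars rather than the Matrix type, with an arbitrary symmetric 2 x 2 block chosen for illustration), the computed \(c\) and \(s\) zero the off-diagonal element of the block:
import std.math: sqrt, abs;
import std.stdio: writeln;

void main()
{
    //An arbitrary symmetric 2 x 2 block: [app apq; apq aqq]
    double app = 2, apq = 1, aqq = 3;
    //The same computation as in the Jacobi constructor above
    double tau = (aqq - app)/(2*apq);
    double t = tau >= 0 ? 1/(tau + sqrt(1 + tau*tau)) : 1/(tau - sqrt(1 + tau*tau));
    double c = 1/sqrt(1 + t*t), s = t*c;
    //The off-diagonal entry of J^T A J for the 2 x 2 block
    double offDiag = apq*(c*c - s*s) + (app - aqq)*c*s;
    writeln(abs(offDiag) < 1e-14); //prints true
}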
The left- and right-hand multiplications by the Jacobi matrix are given by:
auto mult(T)(Jacobi!(T) J, ref Matrix!(T) A)
{
immutable auto n = A.ncol;
for(long j = 0; j < n; ++j)
{
T t1 = A[J.p, j];
T t2 = A[J.q, j];
A[J.p, j] = J.c*t1 - J.s*t2;
A[J.q, j] = J.s*t1 + J.c*t2;
}
return A;
}
and:
auto mult(T)(ref Matrix!(T) A, Jacobi!(T) J)
{
immutable auto m = A.nrow;
for(long j = 0; j < m; ++j)
{
T t1 = A[j, J.p];
T t2 = A[j, J.q];
A[j, J.p] = J.c*t1 - J.s*t2;
A[j, J.q] = J.s*t1 + J.c*t2;
}
return A;
}
respectively. We find the maximum off-diagonal element using:
auto offMax(T)(Matrix!(T) A)
{
long p, q = 0; T tmp = -T.infinity;
for(long j = 0; j < A.ncol; ++j)
{
for(long i = 0; i < A.nrow; ++i)
{
if(i != j)
{
auto comp = abs(A[i, j]);
if(tmp < comp)
{
tmp = comp;
p = i; q = j;
}
}
}
}
return [p, q];
}
The implementation of Classic Jacobi eigendecomposition described earlier is given by:
//Workhorse function
private auto _classic_jacobi(T, T eps)(Matrix!(T) A)
{
immutable n = A.ncol;
auto V = eye!(T)(n);
immutable delta = eps*fnorm!(true)(A);
while(off(A) > delta)
{
//Calculate the location of the maximum off-diagonal element
auto coord = offMax(A);
immutable p = coord[0];
immutable q = coord[1];
//Transformation of matrix A
auto J = A.Jacobi!(T)(p, q);
J.mult(A); A.mult(J);
//Accumulation of transformations
V.mult(J);
}
return Eigen!(T)(diag(A), V);
}
auto eigen(EigenMethod method, T, T eps = max(1E-13, T.epsilon))(Matrix!(T) A)
if(isFloatingPoint!(T) && method == ClassicJacobi)
{
// ... as before ...
auto obj = _classic_jacobi!(T, eps)(A.dup);
// ... as before ...
return Eigen!(T)(evals, evects);
}
The implementation of the Cyclic Jacobi eigendecomposition described earlier is given below:
private auto _cyclic_jacobi(T, T eps)(Matrix!(T) A)
{
immutable n = A.ncol;
auto V = eye!(T)(n);
immutable delta = eps*fnorm!(true)(A);
while(off(A) > delta)
{
for(long p = 0; p < n - 1; ++p)
{
for(long q = p + 1; q < n; ++q)
{
//Transformation of matrix A
auto J = A.Jacobi!(T)(p, q);
J.mult(A); A.mult(J);
//Accumulation of transformations
V.mult(J);
}
}
}
return Eigen!(T)(diag(A), V);
}
auto eigen(EigenMethod method, T, T eps = max(1E-13, T.epsilon))(Matrix!(T) A)
if(isFloatingPoint!(T) && method == CyclicJacobi)
{
// ... as before ...
auto obj = _cyclic_jacobi!(T, eps)(A.dup);
// ... as before ...
return Eigen!(T)(evals, evects);
}
This article describes the main classical algorithms for eigenvalue decomposition from the literature and then shows simple implementations of them. The implementations are for learning purposes and are not intended for production or big-data applications. So far the Active Analytics blog has covered many matrix factorizations, and it is hoped that a few more topics in linear algebra will be covered before moving on, since this is a very interesting and useful field. Links to these articles are often posted on various social media websites, so readers can respond to the social media post or contact the author using the email below with any comments or questions.
Thank you.