Eigenvalues and eigenvectors

In linear algebra, an eigenvector of a linear transformation is a non-zero element of the transformation's domain that the transformation sends to a scalar multiple of itself; this scalar is called an eigenvalue of the linear transformation. An eigenspace of the linear transformation is a subspace of the domain all of whose non-zero elements are eigenvectors associated with the same eigenvalue. The motivation for introducing eigenvalues and eigenvectors is that they simplify many problems in linear algebra: they often allow one to replace a matrix or linear transformation (a complicated object) by a scalar (a much simpler object).

In applied mathematics and physics, the eigenvectors of a matrix or a differential operator often have important physical significance. In classical mechanics, the eigenvectors of the governing equations typically correspond to natural modes of vibration in a body, and the eigenvalues to their frequencies. In quantum mechanics, operators correspond to observables, eigenvectors are also called eigenstates, and the eigenvalues of an operator represent those values of the corresponding variable that have non-zero probability of being measured.

The word eigen is German for "own", "peculiar", or "individual": the most likely translation into English mathematical jargon would be "characteristic", and some older references do use the expressions "characteristic value", "characteristic vector" and so forth, but the more distinctive term "eigenvalue" has now become standard. The related "characteristic polynomial", however, retains this name.

Definition

Formally, we define eigenvectors and eigenvalues as follows. Let A be an n-by-n matrix of real or complex numbers (see below for generalizations). We say that a complex number λ is an eigenvalue of A with eigenvector v ∈ Cⁿ if v is not zero and

Av = λv.

The spectrum of A, denoted σ(A), is the set of all eigenvalues.

If A : V → V is a linear operator on some vector space V, v is a non-zero vector in V and λ is a scalar (possibly zero) such that

Av = λv,

then we say that v is an eigenvector of the operator A, and its associated eigenvalue is λ. Note that if v is an eigenvector with eigenvalue λ, then any non-zero multiple of v is also an eigenvector with eigenvalue λ. In fact, all the eigenvectors with associated eigenvalue λ, together with 0, form a subspace of V, the eigenspace for the eigenvalue λ.

Since a vector v satisfies Av = λv, or equivalently (A − λ id_V)v = 0, if and only if it is either zero or an eigenvector for λ, we can also define the eigenspace for λ as the nullspace of the matrix A − λ id_V. Here id_V denotes the identity matrix.

Examples

Some special cases of linear transformations of two-dimensional space R² are illuminating; a short numerical check of these cases follows the list:

  • rotations: no real eigenvectors (complex eigenvalue, eigenvector pairs exist). Example: a rotation by 90 degrees.
  • reflection: eigenvectors are perpendicular and parallel to the line of symmetry, the eigenvalues are −1 and 1, respectively. Example: reflection about the x-axis.
  • scaling:
    • uniform scaling: all non-zero vectors are eigenvectors, and the eigenvalue is the scale factor. Example: scaling every vector by 2.
    • directional scaling: eigenvalues are the scale factor and 1.
    • directionally differential scaling: eigenvalues are the scale factors.
  • projection onto a line: vectors on the line are eigenvectors with eigenvalue 1 and vectors perpendicular to the line are eigenvectors with eigenvalue 0. Example: orthogonal projection onto the x-axis.
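
A minimal numerical check of these cases, using Python with NumPy. The particular matrices below (a 90-degree rotation, a reflection about the x-axis, a uniform scaling by 2, and an orthogonal projection onto the x-axis) are illustrative choices, not the only possibilities:

    import numpy as np

    # Illustrative 2x2 transformations of the plane.
    examples = {
        "rotation by 90 degrees": np.array([[0.0, -1.0], [1.0, 0.0]]),
        "reflection about the x-axis": np.array([[1.0, 0.0], [0.0, -1.0]]),
        "uniform scaling by 2": np.array([[2.0, 0.0], [0.0, 2.0]]),
        "projection onto the x-axis": np.array([[1.0, 0.0], [0.0, 0.0]]),
    }

    for name, M in examples.items():
        eigenvalues, eigenvectors = np.linalg.eig(M)  # columns of `eigenvectors` are eigenvectors
        print(name, "eigenvalues:", np.round(eigenvalues, 6))
        # The rotation yields the complex pair +i and -i; the other cases yield
        # real eigenvalues matching the descriptions in the list above.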

Computing eigenvalues

Suppose that we want to compute the eigenvalues of a given matrix. If the matrix is small, we can compute them symbolically using the characteristic polynomial. However, this is often impossible for larger matrices, in which case we must use a numerical method.

Symbolic computations using the characteristic polynomial

An important tool for describing eigenvalues of square matrices is the characteristic polynomial: saying that λ is an eigenvalue of A is equivalent to stating that the system of linear equations (A − λ id_V) v = 0 (where id_V is the identity matrix) has a non-zero solution v (namely an eigenvector), and so it is equivalent to the determinant det(A − λ id_V) being zero. The function p(λ) = det(A − λ id_V) is a polynomial in λ since determinants are defined as sums of products. This is the characteristic polynomial of A: the eigenvalues of a matrix are the zeros of its characteristic polynomial.

(Sometimes the characteristic polynomial is taken to be det(λ id_V − A) instead, which is the same polynomial if V is even-dimensional but has the opposite sign if V is odd-dimensional. This has the slight advantage that its leading coefficient is always 1 rather than (−1)ⁿ.)

It follows that we can compute all the eigenvalues of a matrix A by solving the equation p(λ) = 0. If A is an n-by-n matrix, then p has degree n and A can therefore have at most n eigenvalues. Conversely, the fundamental theorem of algebra says that this equation has at least one solution in the complex numbers, so every matrix has at least one eigenvalue. However, this eigenvalue may be complex, even if the matrix has real coefficients: an example of a matrix with no real eigenvalues is the 90-degree rotation

\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}

whose characteristic polynomial is λ² + 1, and so its eigenvalues must be the square roots of −1, namely i and −i.
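
For instance, the characteristic polynomial of the rotation matrix above can be computed symbolically; a small sketch using SymPy (the symbol name lam is just an illustrative choice):

    from sympy import Matrix, eye, symbols

    lam = symbols('lambda')
    A = Matrix([[0, -1], [1, 0]])       # 90-degree rotation

    p = (A - lam * eye(2)).det()        # det(A - lambda * I)
    print(p)                            # -> lambda**2 + 1
    print(A.eigenvals())                # -> {I: 1, -I: 1}, i.e. the eigenvalues are +i and -i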

The Cayley-Hamilton theorem states that every square matrix satisfies its own characteristic polynomial, that is, p(A) = 0.
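
A quick numerical illustration for the rotation matrix above, whose characteristic polynomial is λ² + 1, so p(A) = A² + I should be the zero matrix:

    import numpy as np

    A = np.array([[0.0, -1.0], [1.0, 0.0]])   # 90-degree rotation
    p_of_A = A @ A + np.eye(2)                # p(A) = A^2 + I
    print(p_of_A)                             # -> the 2x2 zero matrix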

Eigenvalues of 2×2 matrices

An analytic solution for the eigenvalues of 2×2 matrices can be obtained directly from the quadratic formula: if

A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}

then the characteristic polynomial is

\det\begin{pmatrix} a - \lambda & b \\ c & d - \lambda \end{pmatrix} = \lambda^2 - (a + d)\lambda + (ad - bc)

(notice that the coefficients are, up to sign, the trace a + d and the determinant ad − bc), so the solutions are

\lambda = \frac{a + d}{2} \pm \sqrt{\left(\frac{a - d}{2}\right)^2 + bc}

In principle, a formula for the eigenvalues of a 3×3 or 4×4 matrix could be calculated in the same way, using the formulae for the roots of a cubic or quartic equation.
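
A short sketch of the 2×2 formula in Python, compared against numpy.linalg.eigvals; the helper name eigenvalues_2x2 and the test matrix are illustrative choices:

    import cmath
    import numpy as np

    def eigenvalues_2x2(a, b, c, d):
        """Eigenvalues of [[a, b], [c, d]] via the quadratic formula."""
        mean = (a + d) / 2                           # half the trace
        disc = cmath.sqrt(((a - d) / 2) ** 2 + b * c)
        return mean + disc, mean - disc

    a, b, c, d = 2.0, 1.0, 1.0, 3.0
    print(eigenvalues_2x2(a, b, c, d))               # analytic solution (real for this symmetric example)
    print(np.linalg.eigvals([[a, b], [c, d]]))       # numerical solution, same values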

Numerical computations

Main article: eigenvalue algorithm.

In practice, eigenvalues of large matrices are not computed using the characteristic polynomial. Moreover, the Abel-Ruffini theorem implies that there is no general formula in radicals for the zeros of polynomials of degree five or higher, so the zeros of the characteristic polynomial cannot, in general, be computed exactly. Therefore, general eigenvalue algorithms are iterative. The simplest method is the power method: we choose a random vector v and compute Av, A²v, A³v, ... After normalization, this sequence almost always converges to an eigenvector corresponding to the dominant eigenvalue. This algorithm is simple, but not very useful by itself. However, popular methods such as the QR algorithm are based on it.
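
A minimal sketch of the power method in Python with NumPy; the iteration count, the random seed and the test matrix are arbitrary choices for illustration:

    import numpy as np

    def power_method(A, num_iterations=1000, seed=0):
        """Return an approximate dominant eigenvalue and eigenvector of A."""
        rng = np.random.default_rng(seed)
        v = rng.standard_normal(A.shape[0])
        for _ in range(num_iterations):
            w = A @ v
            v = w / np.linalg.norm(w)       # normalize at each step
        eigenvalue = v @ A @ v              # Rayleigh quotient of the unit vector v
        return eigenvalue, v

    A = np.array([[2.0, 1.0], [1.0, 3.0]])
    value, vector = power_method(A)
    print(value)                            # close to the dominant eigenvalue (5 + sqrt(5)) / 2
    print(np.linalg.eigvals(A))             # for comparison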

Example computation

Let us determine the eigenvalues of the matrix

A = \begin{pmatrix} 0 & 1 & -1 \\ 1 & 1 & 0 \\ -1 & 0 & 1 \end{pmatrix}

which represents a linear operator R³ → R³.

Identifying eigenvalues

We first compute the characteristic polynomial of A:

p(\lambda) = \det(A - \lambda I) = \det\begin{pmatrix} -\lambda & 1 & -1 \\ 1 & 1 - \lambda & 0 \\ -1 & 0 & 1 - \lambda \end{pmatrix} = -\lambda^3 + 2\lambda^2 + \lambda - 2.

This polynomial factors as −(λ − 2)(λ − 1)(λ + 1). Therefore, the eigenvalues of A are 2, 1 and −1.

Identifying eigenvectors

With the eigenvalues in hand, we can solve sets of simultaneous linear equations to determine the corresponding eigenvectors. For example, one can check that

A \begin{pmatrix} 1 \\ 1 \\ -1 \end{pmatrix} = \begin{pmatrix} 2 \\ 2 \\ -2 \end{pmatrix} = 2 \begin{pmatrix} 1 \\ 1 \\ -1 \end{pmatrix}

and therefore 2 is an eigenvalue of A and we have found a corresponding eigenvector.
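
The check above, and the full list of eigenvalues, can also be reproduced numerically, using the example matrix as given:

    import numpy as np

    A = np.array([[ 0.0, 1.0, -1.0],
                  [ 1.0, 1.0,  0.0],
                  [-1.0, 0.0,  1.0]])

    v = np.array([1.0, 1.0, -1.0])
    print(A @ v)                    # -> [ 2.  2. -2.], i.e. 2 * v
    print(np.linalg.eigvals(A))     # -> the eigenvalues 2, 1 and -1 (in some order)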

Note that if A is a real matrix, the characteristic polynomial will have real coefficients, but not all its roots will necessarily be real. The complex eigenvalues come in conjugate pairs; they are all associated with complex eigenvectors.

In general, if v1, ..., vm are eigenvectors with different eigenvalues λ1, ..., λm, then the vectors v1, ..., vm are necessarily linearly independent.

The spectral theorem for symmetric matrices states that, if A is a real symmetric n-by-n matrix, then all its eigenvalues are real, and there exist n linearly independent eigenvectors for A which are mutually orthogonal. Symmetric matrices are commonly encountered in engineering.

Our example matrix from above is symmetric, and three mutually orthogonal eigenvectors of A are

v_1 = \begin{pmatrix} 1 \\ 1 \\ -1 \end{pmatrix}, \qquad v_2 = \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix}, \qquad v_3 = \begin{pmatrix} 2 \\ -1 \\ 1 \end{pmatrix},

with eigenvalues 2, 1 and −1, respectively.

These three vectors form a basis of R³. With respect to this basis, the linear map represented by A takes a particularly simple form: every vector x in R³ can be written uniquely as

x = x_1 v_1 + x_2 v_2 + x_3 v_3

and then we have

A x = 2 x_1 v_1 + x_2 v_2 - x_3 v_3.
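
A numerical sketch of this change of basis, using the eigenvectors listed above as the columns of a matrix V, so the coordinates x₁, x₂, x₃ are obtained by solving Vc = x (the test vector x is an arbitrary choice):

    import numpy as np

    A = np.array([[ 0.0, 1.0, -1.0],
                  [ 1.0, 1.0,  0.0],
                  [-1.0, 0.0,  1.0]])

    # Columns: the orthogonal eigenvectors v1, v2, v3 with eigenvalues 2, 1, -1.
    V = np.array([[ 1.0, 0.0,  2.0],
                  [ 1.0, 1.0, -1.0],
                  [-1.0, 1.0,  1.0]])
    eigenvalues = np.array([2.0, 1.0, -1.0])

    x = np.array([3.0, -1.0, 4.0])        # an arbitrary vector
    coords = np.linalg.solve(V, x)        # its coordinates x1, x2, x3 in the eigenbasis
    # Applying A simply rescales each coordinate by the corresponding eigenvalue.
    print(A @ x)
    print(V @ (eigenvalues * coords))     # the same vector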

Multiplicity

The (algebraic) multiplicity of an eigenvalue λ of A is the order of λ as a zero of the characteristic polynomial of A; in other words, it is the number of factors t − λ in the characteristic polynomial. An n-by-n matrix has n eigenvalues, counted according to their algebraic multiplicity, because its characteristic polynomial has degree n.

An eigenvalue of algebraic multiplicity 1 is called a simple eigenvalue.

Occasionally, in an article on matrix theory, one may read a statement like

"the eigenvalues of a matrix A are 4,4,3,3,3,2,2,1,"

meaning that the algebraic multiplicity of 4 is two, of 3 is three, of 2 is two and of 1 is one. This style is used because algebraic multiplicity is the key to many mathematical proofs in matrix theory.

The geometric multiplicity of an eigenvalue λ is the dimension of the eigenspace associated with λ, which we defined above as the nullspace of the matrix λI − A.

The algebraic multiplicity can also be thought of as a dimension: it is the dimension of the associated generalized eigenspace, which is the nullspace of the matrix (λI − A)k for any sufficiently large k. That is, it is the space of generalized eigenvectors, where a generalized eigenvector is any vector which eventually becomes 0 if λI − A is applied to it enough times successively. Obviously any eigenvector is a generalized eigenvector, and so each eigenspace is contained in the associated generalized eigenspace. This provides an easy proof that the geometric multiplicity is always less than or equal to the algebraic multiplicity. (Do not confuse these with generalized eigenvalue problem, below.)

Consider for example the matrix

A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}

It has only one eigenvalue, namely λ = 1. The characteristic polynomial is (λ − 1)², so this eigenvalue has algebraic multiplicity 2. However, the associated eigenspace is spanned by (1, 0)^T, so the geometric multiplicity is only 1.
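
A quick numerical check of the two multiplicities for this matrix; the geometric multiplicity is the dimension of the nullspace of A − λI, computed here as 2 minus the rank:

    import numpy as np

    A = np.array([[1.0, 1.0], [0.0, 1.0]])
    lam = 1.0

    # Algebraic multiplicity: how often lam occurs among the eigenvalues.
    print(np.linalg.eigvals(A))                               # -> [1. 1.], algebraic multiplicity 2

    # Geometric multiplicity: dimension of the nullspace of A - lam * I.
    geometric = 2 - np.linalg.matrix_rank(A - lam * np.eye(2))
    print(geometric)                                          # -> 1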

Decomposition theorem

An n by n matrix A has n linearly independent eigenvectors if and only if it can be decomposed into the form

A = U \Lambda U^{-1}

where U is an invertible n by n matrix and Λ is a diagonal matrix with all of the eigenvalues on the diagonal. In this case A is said to be diagonalizable. The columns of U will form a basis of eigenvectors and the diagonal entries of Λ are the corresponding eigenvalues: thus, the entries of U can be chosen to be real (as opposed to complex) if and only if there are n linearly independent real eigenvectors. Even with complex coefficients, however, such a matrix U does not always exist; for example the matrix

\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}

considered in the previous section has only one 1-dimensional eigenspace.
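
A sketch of the decomposition with NumPy, using the symmetric matrix from the example computation above; np.linalg.eig returns the eigenvalues together with a matrix whose columns are eigenvectors:

    import numpy as np

    A = np.array([[ 0.0, 1.0, -1.0],
                  [ 1.0, 1.0,  0.0],
                  [-1.0, 0.0,  1.0]])

    eigenvalues, U = np.linalg.eig(A)
    Lambda = np.diag(eigenvalues)
    # A is diagonalizable, so it is reconstructed as U * Lambda * U^{-1}.
    print(np.allclose(A, U @ Lambda @ np.linalg.inv(U)))      # -> True

    # The defective matrix [[1, 1], [0, 1]] has a repeated eigenvalue but only a
    # one-dimensional eigenspace; the computed eigenvectors are (numerically) parallel,
    # so no such decomposition exists.
    values, vectors = np.linalg.eig(np.array([[1.0, 1.0], [0.0, 1.0]]))
    print(np.linalg.matrix_rank(vectors))                     # -> 1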

There are several generalizations of this decomposition which can cope with the non-diagonalizable case, suited for different purposes:

  • the singular value decomposition, A = UΣV*, where Σ is diagonal but U is not necessarily equal to V;
  • the Jordan normal form, where A = UΛU⁻¹ but Λ is not quite diagonal (a small symbolic sketch follows this list);
  • any matrix A can be written uniquely as A = S + N where S is semisimple (i.e. diagonalizable) and N is nilpotent, and S commutes with N. This is easily found from the Jordan form.
  • any invertible matrix A can be written uniquely as A = SJ where S is semisimple and J is unipotent, and S commutes with J. This is found from the previous decomposition by taking J = I + S⁻¹N.
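
A symbolic sketch of the Jordan normal form for a defective matrix, using SymPy; Matrix.jordan_form returns a pair (P, J) with A = P J P⁻¹ (the test matrix is an illustrative choice):

    from sympy import Matrix

    A = Matrix([[3, 1], [-1, 1]])       # repeated eigenvalue 2, only one eigenvector
    P, J = A.jordan_form()              # A = P * J * P**(-1)
    print(J)                            # -> Matrix([[2, 1], [0, 2]]): a single 2x2 Jordan block
    print(A == P * J * P.inv())         # should print True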

Properties

The spectrum is invariant under similarity transformations: the matrices A and P⁻¹AP have the same eigenvalues for any matrix A and any invertible matrix P. The spectrum is also invariant under transposition: the matrices A and Aᵀ have the same eigenvalues.

A matrix is invertible if and only if zero is not an eigenvalue of the matrix.

A matrix is diagonalizable if and only if the algebraic and geometric multiplicities coincide for all its eigenvalues, which is the same as requiring that all generalized eigenvectors are eigenvectors. In particular, an n-by-n matrix which has n different eigenvalues is always diagonalizable.

The vector space on which the matrix acts is always the direct sum of the generalized eigenspaces (i.e. is spanned by them and they are independent). This is true of the ordinary (non-generalized) eigenspaces if and only if they are equal to the generalized eigenspaces, i.e. if and only if the matrix is diagonalizable.

The location of the spectrum is often restricted if the matrix has a special form:

  • All eigenvalues of a Hermitian matrix (in particular, of a real symmetric matrix) are real; all eigenvalues of a skew-Hermitian matrix are purely imaginary.
  • All eigenvalues of a unitary matrix have absolute value 1.
  • The eigenvalues of an upper or lower triangular matrix (in particular, of a diagonal matrix) are simply its diagonal entries.

Generally, the trace of a matrix equals the sum of the eigenvalues, and the determinant equals the product of the eigenvalues (counted according to algebraic multiplicity).

Suppose that A is an m-by-n matrix, with m ≤ n, and that B is an n-by-m matrix. Then BA has the same eigenvalues as AB, plus n − m additional eigenvalues equal to zero.
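
A random-matrix check of this fact; the sizes m = 2 and n = 4 are arbitrary choices:

    import numpy as np

    rng = np.random.default_rng(0)
    m, n = 2, 4
    A = rng.standard_normal((m, n))
    B = rng.standard_normal((n, m))

    print(np.sort_complex(np.linalg.eigvals(A @ B)))   # the m eigenvalues of AB
    print(np.sort_complex(np.linalg.eigvals(B @ A)))   # the same m values plus n - m (near-)zeros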

Extensions and generalizations

Infinite-dimensional spaces: eigenvalues of an operator

The concept of eigenvectors can be extended to linear operators acting on infinite-dimensional Hilbert spaces or Banach spaces.

Suppose we have a linear operator A mapping the vector space V to itself. As in the matrix case, we say that λ ∈ C is an eigenvalue of A if there exists a nonzero v ∈ V such that Av = λv.

Suppose now that A is a bounded linear operator on a Banach space V. We say that λ ∈ C is a spectral value of A if the operator λI − A is not invertible, where I denotes the identity operator. Note that by the closed graph theorem, if a bounded operator has an inverse, the inverse is necessarily bounded. The set of all spectral values is the spectrum of A.

If V is finite-dimensional, then the spectrum of A is the same as the set of eigenvalues of A. This follows from the fact that on finite-dimensional spaces, injectivity of a linear operator is equivalent to its surjectivity. However, an operator on an infinite-dimensional space may have no eigenvalues at all, while it always has spectral values.

There are operators on Banach spaces which have no eigenvectors at all. For example, take the bilateral shift on the Hilbert space ℓ²(Z); it is easy to see that any potential eigenvector cannot be square-summable, so none exist. However, any bounded linear operator on a Banach space V does have a non-empty spectrum. The spectrum σ(A) of the operator A : V → V is defined as

σ(A) = { λ ∈ C : λI − A is not invertible }.

Then σ(A) is a compact set of complex numbers, and it is non-empty. When A is a compact operator, every non-zero element of its spectrum is an eigenvalue; in particular, when A is an operator on a finite-dimensional space as above, the spectrum of A is the same as the set of its eigenvalues.

The spectrum of an operator is an important property in functional analysis.

Eigenvalues of a matrix with entries from a ring

Suppose that A is a square matrix with entries in a ring R. An element λ ∈ R is called a right eigenvalue of A if there exists a nonzero column vector x such that Ax = xλ, or a left eigenvalue if there exists a nonzero row vector y such that yA = yλ.

If R is commutative, the left eigenvalues of A are exactly the right eigenvalues of A and are just called eigenvalues. If R is not commutative, e.g. quaternions, they may be different.

Eigenvalues of a graph

An eigenvalue of a graph is defined as an eigenvalue of the graph's adjacency matrix A, or (increasingly) of the graph's Laplacian matrix I − T^(−1/2) A T^(−1/2), where T is a diagonal matrix holding the degree of each vertex, and in T^(−1/2), 0 is substituted for 0^(−1/2).
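
A small sketch constructing both matrices for a path graph on three vertices; the graph itself is an arbitrary illustrative choice:

    import numpy as np

    # Adjacency matrix of the path graph 1 - 2 - 3.
    A = np.array([[0.0, 1.0, 0.0],
                  [1.0, 0.0, 1.0],
                  [0.0, 1.0, 0.0]])

    degrees = A.sum(axis=1)
    T_inv_sqrt = np.diag(1.0 / np.sqrt(degrees))    # T^(-1/2); every degree is non-zero here
    L = np.eye(3) - T_inv_sqrt @ A @ T_inv_sqrt     # Laplacian I - T^(-1/2) A T^(-1/2)

    print(np.linalg.eigvals(A))    # adjacency eigenvalues: sqrt(2), 0, -sqrt(2)
    print(np.linalg.eigvals(L))    # Laplacian eigenvalues 0, 1 and 2, which lie in [0, 2]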

Generalized eigenvalue problem

A generalized eigenvalue problem is a problem of the form

Av = λBv    (1)

where A and B are matrices (with complex entries). The generalized eigenvalues λ can be obtained by solving the equation

det(A − λB) = 0.

If B is invertible, then problem (1) can obviously be written in the form

B⁻¹Av = λv    (2)

which is a standard eigenvalue problem. However, in most situations it is preferable not to perform the inversion, but rather to solve the generalized eigenvalue problem as stated originally.

If A and B are symmetric matrices with real entries and B is positive definite, then problem (1) has real eigenvalues. This would not have been easy to see from the equivalent formulation (2), because the matrix B⁻¹A is not necessarily symmetric even if A and B are.
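
A sketch with SciPy, which can solve the generalized problem directly: scipy.linalg.eig accepts a second matrix B, and scipy.linalg.eigh handles the symmetric case with B positive definite. The matrices below are arbitrary illustrations:

    import numpy as np
    from scipy.linalg import eig, eigh

    A = np.array([[2.0, 1.0], [1.0, 3.0]])   # symmetric
    B = np.array([[4.0, 1.0], [1.0, 2.0]])   # symmetric positive definite

    # General solver: returns eigenvalues lambda with A v = lambda B v.
    values, vectors = eig(A, B)
    print(values)

    # Symmetric-definite solver: the eigenvalues come out real.
    print(eigh(A, B, eigvals_only=True))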

See also: Invariants of tensors.

External links

  • Videos of MIT Linear Algebra Course, fall 1999 - See Lecture #21: Eigenvalues and Eigenvectors
  • MathWorld: Eigenvector
  • Earliest Known Uses of Some of the Words of Mathematics: E - see eigenvector and related terms
  • "Eigenvalue (of a matrix)". PlanetMath.
