Eigenvalues and eigenvectors
In linear algebra, an eigenvector of a linear transformation is an element in the domain of the linear transformation sent to a scalar multiple of itself; this scalar is called an eigenvalue of the linear transformation. An eigenspace of the linear transformation is a subspace of the domain all of whose elements are eigenvectors associated with the same eigenvalue. The motivation for introducing eigenvalues and eigenvectors is that they simplify many problems in linear algebra: they often allow one to replace a matrix or linear transformation (a complicated object) by a scalar (a much simpler object).
In applied mathematics and physics, the eigenvectors of a matrix or a differential operator often have important physical significance. In classical mechanics, the eigenvectors of the governing equations typically correspond to natural modes of vibration in a body, and the eigenvalues to their frequencies. In quantum mechanics, operators correspond to observables, eigenvectors are also called eigenstates, and the eigenvalues of an operator represent those values of the corresponding variable that have non-zero probability of being measured.
The word eigen is German for "own", "peculiar", or "individual": the most likely translation into English mathematical jargon would be "characteristic", and some older references do use the expressions "characteristic value", "characteristic vector" and so forth, but the more distinctive term "eigenvalue" has now become standard. The related "characteristic polynomial", however, retains this name.
Definition
Formally, we define eigenvectors and eigenvalues as follows. Let A be an n-by-n matrix of real numbers or complex numbers (see below for generalizations). We say that a complex number λ is an eigenvalue of A with eigenvector v ∈ Cn if v is not zero and

Av = λv.
The spectrum of A, denoted σ(A), is the set of all eigenvalues.
If A : V → V is a linear operator on some vector space V, v is a non-zero vector in V and λ is a scalar (possibly zero) such that

Av = λv,
then we say that v is an eigenvector of the operator A, and its associated eigenvalue is λ. Note that if v is an eigenvector with eigenvalue λ, then any non-zero multiple of v is also an eigenvector with eigenvalue λ. In fact, all the eigenvectors with associated eigenvalue λ, together with 0, form a subspace of V, the eigenspace for the eigenvalue λ.
Since a vector v satisfies Av = λv, or equivalently (A − λidV)v = 0, if and only if it is either zero or an eigenvector for λ, we can also define the eigenspace for λ as the nullspace of the matrix A − λidV. Here idV denotes the identity matrix.
Examples
Some special cases of linear transformations of two-dimensional space R2 are illuminating:
- rotations: no real eigenvectors (complex eigenvalue, eigenvector pairs exist). Example: rotation by 90 degrees, with matrix rows (0, −1) and (1, 0).
- reflection: eigenvectors are perpendicular and parallel to the line of symmetry, the eigenvalues are −1 and 1, respectively. Example: reflection across the x-axis, with matrix rows (1, 0) and (0, −1).
- scaling:
- uniform scaling: all non-zero vectors are eigenvectors, and the eigenvalue is the scale factor. Example: scaling by 3, with matrix rows (3, 0) and (0, 3).
- directional scaling: eigenvalues are the scale factor and 1
- directionally differential scaling: eigenvalues are the scale factors
- projection onto a line: vectors on the line are eigenvectors with eigenvalue 1 and vectors perpendicular to the line are eigenvectors with eigenvalue 0. Example: projection onto the x-axis, with matrix rows (1, 0) and (0, 0). (A numerical check of these examples appears in the sketch below.)
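As a quick check (a minimal numpy sketch; the specific matrices are standard illustrative choices, not prescribed above), the eigenvalues listed for each transformation can be recovered numerically:

```python
import numpy as np

# Representative 2x2 matrices for the transformations above (standard choices).
examples = {
    "rotation by 90 degrees":    np.array([[0.0, -1.0], [1.0,  0.0]]),
    "reflection across x-axis":  np.array([[1.0,  0.0], [0.0, -1.0]]),
    "uniform scaling by 3":      np.array([[3.0,  0.0], [0.0,  3.0]]),
    "projection onto x-axis":    np.array([[1.0,  0.0], [0.0,  0.0]]),
}

for name, m in examples.items():
    eigenvalues, eigenvectors = np.linalg.eig(m)   # eigenvectors are the columns
    print(name, "->", eigenvalues)
# rotation: 1j and -1j (no real eigenvalues); reflection: 1 and -1;
# uniform scaling: 3 (twice); projection: 1 and 0.
```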
Computing eigenvalues
Suppose that we want to compute the eigenvalues of a given matrix. If the matrix is small, we can compute them symbolically using the characteristic polynomial. However, this is often impossible for larger matrices, in which case we must use a numerical method.
Symbolic computations using the characteristic polynomial
An important tool for describing eigenvalues of square matrices is the characteristic polynomial: saying that λ is an eigenvalue of A is equivalent to stating that the system of linear equations (A - λidV) v = 0 (where idV is the identity matrix) has a non-zero solution v (namely an eigenvector), and so it is equivalent to the determinant det(A - λ idV) being zero. The function p(λ) = det(A - λidV) is a polynomial in λ since determinants are defined as sums of products. This is the characteristic polynomial of A: the eigenvalues of a matrix are the zeros of its characteristic polynomial.
(Sometimes the characteristic polynomial is taken to be det(λidV - A) instead, which is the same polynomial if V is even-dimensional but has the opposite sign if V is odd-dimensional. This has the slight advantage that its leading coefficient is always 1 rather than (−1)ⁿ.)
It follows that we can compute all the eigenvalues of a matrix A by solving the equation det(A − λidV) = 0. If A is an n-by-n matrix, then p(λ) has degree n and A can therefore have at most n eigenvalues. Conversely, the fundamental theorem of algebra says that this equation has at least one solution in the complex numbers, so every matrix has at least one eigenvalue. However, this eigenvalue may be complex, even if the matrix has real coefficients: an example of a matrix with no real eigenvalues is the 90-degree rotation

[ 0  −1 ]
[ 1   0 ],

whose characteristic polynomial is λ² + 1, and so its eigenvalues must be the two square roots of −1, namely i and −i.
The Cayley-Hamilton theorem states that every square matrix satisfies its own characteristic polynomial, that is, p(A) = 0.
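As a minimal illustration (assuming numpy; the matrix is the 90-degree rotation given above), the characteristic polynomial, its roots and the Cayley-Hamilton identity can be checked numerically:

```python
import numpy as np

R = np.array([[0.0, -1.0], [1.0, 0.0]])   # 90-degree rotation

# Coefficients of det(lambda*I - R), highest degree first: lambda^2 + 0*lambda + 1.
coeffs = np.poly(R)
print(coeffs)                              # approximately [1. 0. 1.]
print(np.roots(coeffs))                    # approximately [0.+1.j, 0.-1.j], the square roots of -1

# Cayley-Hamilton for this matrix: p(R) = R^2 + I = 0.
print(np.allclose(R @ R + np.eye(2), 0))   # True
```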
Eigenvalues of 2×2 matrices
An analytic solution for the eigenvalues of 2×2 matrices can be obtained directly from the quadratic formula: if

A = [ a  b ]
    [ c  d ],

then the characteristic polynomial is

det(A − λidV) = λ² − (a + d)λ + (ad − bc)

(notice that the coefficients are, up to sign, the trace a + d and the determinant ad − bc of A), so the solutions are

λ = ( (a + d) ± √( (a + d)² − 4(ad − bc) ) ) / 2.
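The formula translates directly into a small function; the following sketch (the helper name eigenvalues_2x2 is our own) cross-checks it against numpy's eigenvalue routine:

```python
import numpy as np

def eigenvalues_2x2(a, b, c, d):
    """Eigenvalues of [[a, b], [c, d]] via the quadratic formula."""
    trace = a + d
    det = a * d - b * c
    disc = np.lib.scimath.sqrt(trace**2 - 4 * det)   # complex sqrt if the discriminant is negative
    return (trace + disc) / 2, (trace - disc) / 2

print(eigenvalues_2x2(2.0, 1.0, 1.0, 2.0))                     # (3.0, 1.0)
print(np.linalg.eigvals(np.array([[2.0, 1.0], [1.0, 2.0]])))   # 3 and 1 (order may differ)
```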
In principle a formula for the eigenvalues of a 3×3 or 4×4 matrix could be calculated in the same way, using the formulae for the roots of a cubic or quartic equation.
Numerical computations
Main article: eigenvalue algorithm.
In practice, eigenvalues of large matrices are not computed using the characteristic polynomial. Moreover, the Abel-Ruffini theorem implies that, for degree five and higher, the zeros of the characteristic polynomial cannot in general be expressed in radicals, so there is no finite exact formula for the eigenvalues. Therefore, general eigenvalue algorithms are iterative. The simplest method is the power method: we choose a random vector v and compute Av, A²v, A³v, ... This sequence will, after normalization, almost always converge to an eigenvector corresponding to the dominant eigenvalue (the eigenvalue of largest absolute value). This algorithm is easy, but not very useful by itself. However, popular methods such as the QR algorithm are based on it.
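A minimal power-method sketch (illustrative only; production algorithms such as the QR algorithm handle convergence, shifts and deflation far more carefully):

```python
import numpy as np

def power_method(A, num_iters=1000, seed=0):
    """Approximate the dominant eigenvalue/eigenvector of A by repeated multiplication."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(A.shape[0])   # random starting vector
    for _ in range(num_iters):
        v = A @ v
        v = v / np.linalg.norm(v)         # normalize so the iterates stay bounded
    eigenvalue = v @ A @ v                # Rayleigh quotient of the converged vector
    return eigenvalue, v

A = np.array([[2.0, 0.0], [0.0, 1.0]])
print(power_method(A))                    # dominant eigenvalue ~2, eigenvector ~(+-1, 0)
```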
Example computation
Let us determine the eigenvalues of the matrix

A = [  0   1  −1 ]
    [  1   1   0 ]
    [ −1   0   1 ],

which represents a linear operator R3 → R3.
Identifying eigenvalues
We first compute the characteristic polynomial of A:

p(λ) = det(A − λI) = −λ³ + 2λ² + λ − 2.

This polynomial factors as −(λ − 2)(λ − 1)(λ + 1). Therefore, the eigenvalues of A are 2, 1 and −1.
Identifying eigenvectors
With the eigenvalues in hand, we can solve sets of simultaneous linear equations to determine the corresponding eigenvectors. For example, one can check that

A (1, 1, −1)T = (2, 2, −2)T = 2 (1, 1, −1)T,

and therefore 2 is an eigenvalue of A and we have found a corresponding eigenvector.
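In code, finding such an eigenvector amounts to computing the nullspace of A − 2I; a sketch using the example matrix above (numpy only, with the nullspace read off the SVD):

```python
import numpy as np

A = np.array([[ 0.0, 1.0, -1.0],
              [ 1.0, 1.0,  0.0],
              [-1.0, 0.0,  1.0]])

# Null space of (A - 2I): right singular vectors belonging to (near-)zero singular values.
M = A - 2 * np.eye(3)
_, s, vt = np.linalg.svd(M)
null_rows = vt[s < 1e-10]        # each such row spans the eigenspace for eigenvalue 2
v = null_rows[0]
print(v / v[0])                  # approximately [ 1.  1. -1.]
print(np.allclose(A @ v, 2 * v)) # True
```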
Note that if A is a real matrix, the characteristic polynomial will have real coefficients, but not all its roots will necessarily be real. The complex eigenvalues come in conjugate pairs; they are all associated with complex eigenvectors.
In general, if v1, ..., vm are eigenvectors with different eigenvalues λ1, ..., λm, then the vectors v1, ..., vm are necessarily linearly independent.
The spectral theorem for symmetric matrices states that, if A is a real symmetric n-by-n matrix, then all its eigenvalues are real, and there exist n linearly independent eigenvectors for A which are mutually orthogonal. Symmetric matrices are commonly encountered in engineering.
Our example matrix from above is symmetric, and three mutually orthogonal eigenvectors of A are

v1 = (1, 1, −1)T,   v2 = (0, 1, 1)T,   v3 = (2, −1, 1)T,

with eigenvalues 2, 1 and −1, respectively. These three vectors form a basis of R3. With respect to this basis, the linear map represented by A takes a particularly simple form: every vector x in R3 can be written uniquely as

x = x1 v1 + x2 v2 + x3 v3,

and then we have

Ax = 2 x1 v1 + x2 v2 − x3 v3.
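A short numerical check of these statements for the example matrix (numpy's eigh routine is used here because A is symmetric; it returns an orthonormal eigenvector basis):

```python
import numpy as np

A = np.array([[ 0.0, 1.0, -1.0],
              [ 1.0, 1.0,  0.0],
              [-1.0, 0.0,  1.0]])

w, Q = np.linalg.eigh(A)                     # eigenvalues ascending, eigenvectors in columns of Q
print(w)                                     # [-1.  1.  2.]
print(np.allclose(Q.T @ Q, np.eye(3)))       # True: the eigenvector basis is orthonormal

x = np.array([1.0, 2.0, 3.0])
coords = Q.T @ x                             # coordinates of x in the eigenvector basis
print(np.allclose(A @ x, Q @ (w * coords)))  # True: A acts by scaling each coordinate
```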
Multiplicity
The (algebraic) multiplicity of an eigenvalue λ of A is the order of λ as a zero of the characteristic polynomial of A; in other words, it is the number of times the factor (t − λ) appears when the characteristic polynomial is written as a polynomial in t. An n-by-n matrix has n eigenvalues, counted according to their algebraic multiplicity, because its characteristic polynomial has degree n.
An eigenvalue of algebraic multiplicity 1 is called a simple eigenvalue.
Occasionally, in an article on matrix theory, one may read a statement like
- "the eigenvalues of a matrix A are 4,4,3,3,3,2,2,1,"
meaning that the algebraic multiplicity of 4 is two, of 3 is three, of 2 is two and of 1 is one. This style is used because algebraic multiplicity is the key to many mathematical proofs in matrix theory.
The geometric multiplicity of an eigenvalue λ is the dimension of the eigenspace associated with λ, which we defined above as the nullspace of the matrix λI − A.
The algebraic multiplicity can also be thought of as a dimension: it is the dimension of the associated generalized eigenspace, which is the nullspace of the matrix (λI − A)ᵏ for any sufficiently large k. That is, it is the space of generalized eigenvectors, where a generalized eigenvector is any vector which eventually becomes 0 if λI − A is applied to it enough times successively. Obviously any eigenvector is a generalized eigenvector, and so each eigenspace is contained in the associated generalized eigenspace. This provides an easy proof that the geometric multiplicity is always less than or equal to the algebraic multiplicity. (Do not confuse these with the generalized eigenvalue problem, described below.)
Consider for example the matrix

[ 1  1 ]
[ 0  1 ].

It has only one eigenvalue, namely λ = 1. The characteristic polynomial is (λ − 1)², so this eigenvalue has algebraic multiplicity 2. However, the associated eigenspace is spanned by (1, 0)T, so the geometric multiplicity is only 1.
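A quick numerical check of this example (the algebraic multiplicity is read off the characteristic polynomial, the geometric multiplicity from the rank of A − I, and the generalized eigenspace from a power of A − I):

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [0.0, 1.0]])

print(np.poly(A))                           # ~[ 1. -2.  1.], i.e. (lambda - 1)^2: algebraic multiplicity 2
rank = np.linalg.matrix_rank(A - np.eye(2))
print(2 - rank)                             # 1: geometric multiplicity (dimension of the eigenspace)
print(np.linalg.matrix_rank(np.linalg.matrix_power(A - np.eye(2), 2)))
                                            # 0: (A - I)^2 = 0, so the generalized eigenspace is all of R^2
```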
Decomposition theorem
An n-by-n matrix A has n linearly independent eigenvectors if and only if it can be decomposed into the form

A = UΛU⁻¹,

where U is an invertible matrix and Λ is a diagonal matrix with all of the eigenvalues on the diagonal. In this case A is said to be diagonalizable. The columns of U form a basis of eigenvectors and the diagonal entries of Λ are the corresponding eigenvalues: thus, the entries of U can be chosen to be real (as opposed to complex) if and only if there are n linearly independent real eigenvectors. Even with complex coefficients, however, such a matrix U does not always exist; for example

[ 1  1 ]
[ 0  1 ]

has only one 1-dimensional eigenspace, and so is not diagonalizable.
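A sketch of both cases (the diagonalizable matrix below is an arbitrary symmetric example chosen for illustration; the defective one is the matrix above):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])                  # symmetric, hence diagonalizable
w, U = np.linalg.eig(A)
Lam = np.diag(w)
print(np.allclose(A, U @ Lam @ np.linalg.inv(U)))   # True: A = U Lambda U^-1

D = np.array([[1.0, 1.0],
              [0.0, 1.0]])                  # defective: only one independent eigenvector
_, V = np.linalg.eig(D)
print(np.linalg.matrix_rank(V))             # 1 (numerically): the eigenvector columns are essentially
                                            # parallel, so they cannot form a basis and D is not diagonalizable
```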
There are several generalizations of this decomposition which can cope with the non-diagonalizable case, suited for different purposes:
- the singular value decomposition, A = UΣV*, where Σ is diagonal but U is not necessarily equal to V;
- the Jordan normal form, where A = UΛU⁻¹ but Λ is not quite diagonal;
- any matrix A can be written uniquely as A = S + N where S is semisimple (i.e. diagonalizable) and N is nilpotent, and S commutes with N. This is easily found from the Jordan form.
- any invertible matrix A can be written uniquely as A = SJ where S is semisimple and J is unipotent, and S commutes with J. This is found from the previous decomposition by taking J = I + S⁻¹N.
Properties
The spectrum is invariant under similarity transformations: the matrices A and P⁻¹AP have the same eigenvalues for any matrix A and any invertible matrix P. The spectrum is also invariant under transposition: the matrices A and AT have the same eigenvalues.
A matrix is invertible if and only if zero is not an eigenvalue of the matrix.
A matrix is diagonalizable if and only if the algebraic and geometric multiplicities coincide for all its eigenvalues, which is the same as requiring that all generalized eigenvectors are eigenvectors. In particular, an n-by-n matrix which has n different eigenvalues is always diagonalizable.
The vector space on which the matrix acts is always the direct sum of the generalized eigenspaces (i.e. it is spanned by them and they are independent). This is true of the ordinary (non-generalized) eigenspaces if and only if they are equal to the generalized eigenspaces, i.e. if and only if the matrix is diagonalizable.
The location of the spectrum is often restricted if the matrix has a special form:
- All eigenvalues of a Hermitian matrix (A = A*) are real. Furthermore, all eigenvalues of a positive-definite matrix (v*Av > 0 for all non-zero vectors v) are positive.
- All eigenvalues of a skew-Hermitian matrix (A = −A*) are purely imaginary.
- All eigenvalues of a unitary matrix (A⁻¹ = A*) have absolute value one.
- The eigenvalues of a triangular matrix are the entries on the main diagonal. This holds a fortiori for diagonal matrices.
Generally, the trace of a matrix equals the sum of the eigenvalues, and the determinant equals the product of the eigenvalues (counted according to algebraic multiplicity).
Suppose that A is an m-by-n matrix, with m ≤ n, and that B is an n-by-m matrix. Then BA has the same eigenvalues as AB plus n − m additional eigenvalues equal to zero.
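The last two properties can be illustrated numerically with random matrices (a sketch, not a proof):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))

w = np.linalg.eigvals(A)
print(np.isclose(np.trace(A), w.sum()))           # True: trace = sum of eigenvalues
print(np.isclose(np.linalg.det(A), np.prod(w)))   # True: determinant = product of eigenvalues

# AB (2x2) and BA (3x3) share eigenvalues; BA has one extra zero (here m = 2, n = 3).
M = rng.standard_normal((2, 3))
N = rng.standard_normal((3, 2))
print(np.sort_complex(np.linalg.eigvals(M @ N)))
print(np.sort_complex(np.linalg.eigvals(N @ M)))  # same values plus an (approximately) zero eigenvalue
```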
Extensions and generalizations
Infinite-dimensional spaces: eigenvalues of an operator
The concept of eigenvectors can be extended to linear operators acting on infinite-dimensional Hilbert spaces or Banach spaces.
Suppose we have a linear operator A mapping the vector space V to itself. As in the matrix case, we say that λ ∈ C is an eigenvalue of A if there exists a nonzero v ∈ V such that Av = λv.
Suppose now that A is a bounded linear operator on a Banach space V. We say that λ ∈ C is a spectral value of A if the operator A − λI is not invertible, where I denotes the identity operator. Note that by the closed graph theorem, if a bounded operator has an inverse, the inverse is necessarily bounded. The set of all spectral values is the spectrum of A.
If V is finite-dimensional, then the spectrum of A is the same as the set of eigenvalues of A. This follows from the fact that on finite-dimensional spaces injectivity of a linear operator A is equivalent to surjectivity of A. However, an operator on an infinite-dimensional space may have no eigenvalues at all, while it always has spectral values.
There are operators on Banach spaces which have no eigenvectors at all. For example, take the bilateral shift on the Hilbert space ℓ²(Z); it is easy to see that any potential eigenvector cannot be square-summable, so none exist. However, any bounded linear operator on a Banach space V does have non-empty spectrum. The spectrum σ(A) of the operator A : V → V is defined as

σ(A) = { λ ∈ C : A − λI is not invertible }.
Then σ(A) is a compact set of complex numbers, and it is non-empty. When A is a compact operator, every non-zero spectral value is an eigenvalue; in particular, when A is an operator on a finite-dimensional space, as above, the spectrum of A is exactly the set of its eigenvalues.
The spectrum of an operator is an important property in functional analysis.
Eigenvalues of a matrix with entries from a ring
Suppose that A is a square matrix with entries in a ring R. An element λ ∈ R is called a right eigenvalue of A if there exists a nonzero column vector x such that Ax=λx, or a left eigenvalue if there exists a nonzero row vector y such that yA=yλ.
If R is commutative, the left eigenvalues of A are exactly the right eigenvalues of A and are just called eigenvalues. If R is not commutative, e.g. quaternions, they may be different.
Eigenvalues of a graph
An eigenvalue of a graph is defined as an eigenvalue of the graph's adjacency matrix A, or (increasingly) of the graph's Laplacian matrix, which is either T − A or I − T^(−1/2) A T^(−1/2), where T is a diagonal matrix holding the degree of each vertex, and in T^(−1/2), 0 is substituted for 0^(−1/2).
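For instance (a small path graph chosen for illustration, not taken from the source), the adjacency and Laplacian eigenvalues can be computed directly:

```python
import numpy as np

# Path graph 1 - 2 - 3: adjacency matrix A and degree matrix T.
A = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
T = np.diag(A.sum(axis=1))          # vertex degrees on the diagonal

print(np.linalg.eigvalsh(A))        # adjacency eigenvalues: approximately -sqrt(2), 0, sqrt(2)
print(np.linalg.eigvalsh(T - A))    # Laplacian eigenvalues: approximately 0, 1, 3
```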
Generalized eigenvalue problem
A generalized eigenvalue problem is of the form

Av = λBv,     (1)

where A and B are matrices (with complex entries). The generalized eigenvalues λ can be obtained by solving the equation

det(A − λB) = 0.

If B is invertible, then problem (1) can obviously be rewritten in the form

B⁻¹Av = λv,     (2)
which is a standard eigenvalue problem. However, in most situations it is preferable not to perform the inversion, but rather to solve the generalized eigenvalue problem as stated originally.
If A and B are symmetric matrices with real entries and B is positive definite, then problem (1) has real eigenvalues. This would not have been easy to see from the equivalent formulation (2), because the matrix B⁻¹A is not necessarily symmetric even if A and B are.
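A sketch with scipy (the matrices are chosen arbitrarily for illustration): scipy.linalg.eigh solves the symmetric-definite generalized problem (1) directly, without forming B⁻¹A.

```python
import numpy as np
from scipy.linalg import eigh

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])          # symmetric
B = np.array([[2.0, 0.0],
              [0.0, 1.0]])          # symmetric positive definite

w, V = eigh(A, B)                   # solves A v = lambda B v
print(w)                            # real generalized eigenvalues
for lam, v in zip(w, V.T):
    print(np.allclose(A @ v, lam * B @ v))   # True for each eigenpair
```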
See also
- Invariants of tensors
External links
- Videos of MIT Linear Algebra Course, fall 1999 - See Lecture #21: Eigenvalues and Eigenvectors
- MathWorld: Eigenvector
- Earliest Known Uses of Some of the Words of Mathematics: E - see eigenvector and related terms
- "Eigenvalue (of a matrix)". PlanetMath.