
Einstein notation: Difference between revisions

From Wikipedia, the free encyclopedia
Citation bot (talk | contribs)
Removed parameters. | You can use this bot yourself. Report bugs here. | Activated by Amigao | Category:Mathematical notation | via #UCB_Category
 
(46 intermediate revisions by 36 users not shown)
Line 1: Line 1:
{{Short description|Shorthand notation for tensor operations}}
{{Short description|Shorthand notation for tensor operations}}


In [[mathematics]], especially in applications of [[linear algebra]] to [[physics]], the '''Einstein notation''' or '''Einstein summation convention''' is a notational convention that implies summation over a set of indexed terms in a formula, thus achieving notational brevity. As part of mathematics it is a notational subset of [[Ricci calculus]]; however, it is often used in applications in physics that do not distinguish between [[Tangent space|tangent]] and [[cotangent space|cotangent]] spaces. It was introduced to physics by [[Albert Einstein]] in 1916.<ref name=Ein1916>{{cite journal|last=Einstein |first=Albert |authorlink=Albert Einstein |title=The Foundation of the General Theory of Relativity |journal=Annalen der Physik |year=1916 |url=http://www.alberteinstein.info/gallery/gtext3.html |doi=10.1002/andp.19163540702 |format=[[PDF]] |accessdate=2006-09-03 |archiveurl=https://web.archive.org/web/20060829045130/http://www.alberteinstein.info/gallery/gtext3.html |archivedate=2006-08-29 |bibcode=1916AnP...354..769E |url-status=dead }}</ref>
In [[mathematics]], especially the usage of [[linear algebra]] in [[mathematical physics]] and [[differential geometry]], '''Einstein notation''' (also known as the '''Einstein summation convention''' or '''Einstein summation notation''') is a notational convention that implies [[summation]] over a set of indexed terms in a formula, thus achieving brevity. As part of mathematics it is a notational subset of [[Ricci calculus]]; however, it is often used in physics applications that do not distinguish between [[Tangent space|tangent]] and [[cotangent space]]s. It was introduced to physics by [[Albert Einstein]] in 1916.<ref name=Ein1916>{{cite journal|last=Einstein |first=Albert |author-link=Albert Einstein |title=The Foundation of the General Theory of Relativity |journal=Annalen der Physik |year=1916 |volume=354 |issue=7 |page=769 |url=http://www.alberteinstein.info/gallery/gtext3.html |doi=10.1002/andp.19163540702 |format=[[PDF]] |access-date=2006-09-03 |archive-url=https://web.archive.org/web/20060829045130/http://www.alberteinstein.info/gallery/gtext3.html |archive-date=2006-08-29 | bibcode=1916AnP...354..769E |url-status = dead }}</ref>


== Introduction ==
== Introduction ==
Line 7: Line 7:
===Statement of convention===
===Statement of convention===


According to this convention, when an index variable appears twice in a single term and is not otherwise defined (see [[free and bound variables]]), it implies summation of that term over all the values of the index. So where the indices can range over the [[Set (mathematics)|set]] {{math|{1, 2, 3}|}},
According to this convention, when an index variable appears twice in a single [[Addend|term]] and is not otherwise defined (see [[Free and bound variables]]), it implies summation of that term over all the values of the index. So where the indices can range over the [[Set (mathematics)|set]] {{math|{1, 2, 3}<nowiki/>}},
<math display="block">y = \sum_{i = 1}^3 x^i e_i = x^1 e_1 + x^2 e_2 + x^3 e_3 </math>

: <math>y = \sum_{i = 1}^3 c_i x^i = c_1 x^1 + c_2 x^2 + c_3 x^3</math>

is simplified by the convention to:
is simplified by the convention to:
<math display="block">y = x^i e_i </math>


The upper indices are not [[Exponentiation|exponents]] but are indices of coordinates, [[coefficient]]s or [[basis vector]]s. That is, in this context {{math|''x''<sup>2</sup>}} should be understood as the second component of {{math|''x''}} rather than the square of {{math|''x''}} (this can occasionally lead to ambiguity). The upper index position in {{math|''x''<sup>''i''</sup>}} is because, typically, an index occurs once in an upper (superscript) and once in a lower (subscript) position in a term (see ''{{section link|#Application}}'' below). Typically, {{math|(''x''<sup>1</sup> ''x''<sup>2</sup> ''x''<sup>3</sup>)}} would be equivalent to the traditional {{math|(''x'' ''y'' ''z'')}}.
: <math>y = c_i x^i.</math>

The upper indices are not [[Exponentiation|exponents]] but are indices of coordinates, [[coefficient]]s or [[basis vector]]s. That is, in this context {{math|''x''<sup>2</sup>}} should be understood as the second component of {{math|'''x'''}} rather than the square of {{math|'''x'''}} (this can occasionally lead to ambiguity). The upper index position in {{math|''x''<sup>''i''</sup>}} is because, typically, an index occurs once in an upper (superscript) and once in a lower (subscript) position in a term (see ''{{section link|#Application}}'' below). Typically, {{math|(''x''<sup>1</sup> ''x''<sup>2</sup> ''x''<sup>3</sup>)}} would be equivalent to the traditional {{math|(''x'' ''y'' ''z'')}}.


In [[general relativity]], a common convention is that
In [[general relativity]], a common convention is that
* the [[Greek alphabet]] is used for space and time components, where indices take on values 0, 1, 2, or 3 (frequently used letters are {{math|''μ'', ''ν'', ...}}),
* the [[Greek alphabet]] is used for space and time components, where indices take on values 0, 1, 2, or 3 (frequently used letters are {{math|''μ'', ''ν'', ...}}),
* the [[Latin alphabet]] is used for spatial components only, where indices take on values 1, 2, or 3 (frequently used letters are {{math|''i'', ''j'', ...}}),
* the [[Latin alphabet]] is used for spatial components only, where indices take on values 1, 2, or 3 (frequently used letters are {{math|''i'', ''j'', ...}}),


In general, indices can range over any [[Indexed family|indexing set]], including an [[infinite set]]. This should not be confused with a typographically similar convention used to distinguish between [[tensor index notation]] and the closely related but distinct basis-independent [[abstract index notation]].
In general, indices can range over any [[Indexed family|indexing set]], including an [[infinite set]]. This should not be confused with a typographically similar convention used to distinguish between [[tensor index notation]] and the closely related but distinct basis-independent [[abstract index notation]].


An index that is summed over is a ''summation index'', in this case "{{math|''i''}}". It is also called a [[bound variable|dummy index]] since any symbol can replace "{{math|''i''}}" without changing the meaning of the expression provided that it does not collide with index symbols in the same term.
An index that is summed over is a ''summation index'', in this case "{{math|''i''&hairsp;}}". It is also called a [[bound variable|dummy index]] since any symbol can replace "{{math|''i''&hairsp;}}" without changing the meaning of the expression (provided that it does not collide with other index symbols in the same term).


An index that is not summed over is a [[free variable|''free index'']] and should appear only once per term. If such an index does appear, it usually also appears in terms belonging to the same sum, with the exception of special values such as zero.
An index that is not summed over is a [[free variable|''free index'']] and should appear only once per term. If such an index does appear, it usually also appears in every other term in an equation. An example of a free index is the "{{math|''i''&hairsp;}}" in the equation <math>v_i = a_i b_j x^j</math>, which is equivalent to the equation <math display="inline">v_i = \sum_j(a_{i} b_{j} x^{j})</math>.


===Application===
===Application===


Einstein notation can be applied in slightly different ways. Typically, each index occurs once in an upper (superscript) and once in a lower (subscript) position in a term; however, the convention can be applied more generally to any repeated indices within a term.<ref name="wolfram">{{cite web |url=http://mathworld.wolfram.com/EinsteinSummation.html |title=Einstein Summation|accessdate=13 April 2011 |last= |first= |year= |publisher=Wolfram Mathworld }}</ref> When dealing with [[Covariance and contravariance of vectors|covariant and contravariant]] vectors, where the position of an index also indicates the type of vector, the first case usually applies; a covariant vector can only be contracted with a contravariant vector, corresponding to summation of the products of coefficients. On the other hand, when there is a fixed coordinate basis (or when not considering coordinate vectors), one may choose to use only subscripts; see ''{{section link||Superscripts and subscripts versus only subscripts}}'' below.
Einstein notation can be applied in slightly different ways. Typically, each index occurs once in an upper (superscript) and once in a lower (subscript) position in a term; however, the convention can be applied more generally to any repeated indices within a term.<ref name="wolfram">{{cite web |url=http://mathworld.wolfram.com/EinsteinSummation.html |title=Einstein Summation |access-date=13 April 2011 |publisher=Wolfram Mathworld }}</ref> When dealing with [[Covariance and contravariance of vectors|covariant and contravariant]] vectors, where the position of an index indicates the type of vector, the first case usually applies; a covariant vector can only be contracted with a contravariant vector, corresponding to summation of the products of coefficients. On the other hand, when there is a fixed coordinate basis (or when not considering coordinate vectors), one may choose to use only subscripts; see ''{{section link||Superscripts and subscripts versus only subscripts}}'' below.


==Vector representations==
==Vector representations==
Line 35: Line 32:
=== Superscripts and subscripts versus only subscripts ===
=== Superscripts and subscripts versus only subscripts ===


In terms of [[covariance and contravariance of vectors]],
In terms of [[covariance and contravariance of vectors]],
* upper indices represent components of [[Covariance and contravariance of vectors|contravariant vectors]] ([[coordinate vector|vector]]s),
* upper indices represent components of [[Covariance and contravariance of vectors|contravariant vectors]] ([[coordinate vector|vector]]s),
* lower indices represent components of [[covariant vector|covariant]] vectors ([[covector]]s).
* lower indices represent components of [[covariant vector|covariant]] vectors ([[covector]]s).


They transform contravariantly or covariantly, respectively, with respect to change of basis.
They transform contravariantly or covariantly, respectively, with respect to [[change of basis]].


In recognition of this fact, the following notation uses the same symbol both for a vector or covector and its ''components'', as in:
In recognition of this fact, the following notation uses the same symbol both for a vector or covector and its ''components'', as in:
<math display="block">\begin{align}
:<math>v = v^i e_i = \begin{bmatrix}e_1 & e_2 & \cdots & e_n\end{bmatrix} \begin{bmatrix}v^1 \\ v^2 \\ \vdots \\ v^n\end{bmatrix} \qquad w = w_i e^i = \begin{bmatrix}w_1 & w_2 & \cdots & w_n\end{bmatrix} \begin{bmatrix}e^1 \\ e^2 \\ \vdots \\ e^n\end{bmatrix}</math>
v = v^i e_i = \begin{bmatrix} e_1 & e_2 & \cdots & e_n \end{bmatrix} \begin{bmatrix} v^1 \\ v^2 \\ \vdots \\ v^n \end{bmatrix} \\
w = w_i e^i = \begin{bmatrix} w_1 & w_2 & \cdots & w_n \end{bmatrix} \begin{bmatrix} e^1 \\ e^2 \\ \vdots \\ e^n \end{bmatrix}
\end{align}</math>


where {{math|''v''}} is the vector and {{math|''v<sup>i</sup>''}} are its components (not the {{math|''i''}}th covector {{math|''v''}}), {{math|''w''}} is the covector and {{math|''w<sub>i</sub>''}} are its components. The basis vector elements <math> e_i </math> are each column vectors, and the covector basis elements <math> e^i </math> are each row covectors. (See also Abstract Description; [[Dual basis|duality]], below and the [[Dual basis#Examples|examples]])
where {{math|''v''}} is the vector and {{math|''v''<sup>&hairsp;''i''</sup>}} are its components (not the {{math|''i''}}th covector {{math|''v''}}), {{math|''w''}} is the covector and {{math|''w<sub>i</sub>''}} are its components. The basis vector elements <math>e_i</math> are each column vectors, and the covector basis elements <math>e^i</math> are each row covectors. (See also {{slink|#Abstract description}}; [[dual basis|duality]], below and the [[Dual basis#Examples|examples]])


In the presence of a non-degenerate form (an isomorphism {{math|''V'' → ''V''{{i sup|∗}}}}, for instance a [[Riemannian metric]] or [[Minkowski metric]]), one can [[raising and lowering indices|raise and lower indices]].
In the presence of a [[Degenerate bilinear form|non-degenerate form]] (an [[isomorphism]] {{math|''V'' → ''V''{{i sup|∗}}}}, for instance a [[Riemannian metric]] or [[Minkowski metric]]), one can [[raising and lowering indices|raise and lower indices]].


A basis gives such a form (via the [[dual basis]]), hence when working on {{math|<sup>''n''</sup>}} with a Euclidean metric and a fixed orthonormal basis, one has the option to work with only subscripts.
A basis gives such a form (via the [[dual basis]]), hence when working on {{math|'''R'''<sup>''n''</sup>}} with a [[Euclidean metric]] and a fixed [[orthonormal basis]], one has the option to work with only subscripts.


However, if one changes coordinates, the way that coefficients change depends on the variance of the object, and one cannot ignore the distinction; see [[covariance and contravariance of vectors]].
However, if one changes coordinates, the way that coefficients change depends on the variance of the object, and one cannot ignore the distinction; see [[Covariance and contravariance of vectors]].


===Mnemonics===
===Mnemonics===


In the above example, vectors are represented as {{math|''n'' × 1}} matrices (column vectors), while covectors are represented as {{math|1 × ''n''}} matrices (row covectors).
In the above example, vectors are represented as {{math|''n'' ×&thinsp;1}} [[matrix (mathematics)|matrices]] (column vectors), while covectors are represented as {{math|1&thinsp;× ''n''}} matrices (row covectors).


When using the column vector convention:
When using the column vector convention:
{{bulleted list
| "'''Up'''per indices go '''up''' to down; '''l'''ower indices go '''l'''eft to right."
| "'''Co'''variant tensors are '''row''' vectors that have indices that are '''below''' ('''co-row-below''')."
| Covectors are row vectors:
: <math>\begin{bmatrix}w_1 & \cdots & w_k\end{bmatrix}.</math>


* "'''Up'''per indices go '''up''' to down; '''l'''ower indices go '''l'''eft to right."
Hence the lower index indicates which ''column'' you are in.
* "'''Co'''variant tensors are '''row''' vectors that have indices that are '''below''' ('''co-row-below''')."
| Contravariant vectors are column vectors:
: <math>\begin{bmatrix}v^1 \\ \vdots \\ v^k\end{bmatrix}</math>
* Covectors are row vectors: <math display="block">\begin{bmatrix} w_1 & \cdots & w_k \end{bmatrix}.</math> Hence the lower index indicates which ''column'' you are in.
* Contravariant vectors are column vectors: <math display="block">\begin{bmatrix} v^1 \\ \vdots \\ v^k \end{bmatrix}</math> Hence the upper index indicates which ''row'' you are in.

Hence the upper index indicates which ''row'' you are in.
}}


=== Abstract description ===
=== Abstract description ===


The virtue of Einstein notation is that it represents the invariant quantities with a simple notation.
The virtue of Einstein notation is that it represents the [[Invariant (mathematics)|invariant]] quantities with a simple notation.


In physics, a [[Scalar (physics)|scalar]] is invariant under transformations of [[basis (mathematics)|basis]]. In particular, a [[Lorentz scalar]] is invariant under a Lorentz transformation. The individual terms in the sum are not. When the basis is changed, the ''components'' of a vector change by a linear transformation described by a matrix. This led Einstein to propose the convention that repeated indices imply the summation is to be done.
In physics, a [[Scalar (physics)|scalar]] is invariant under transformations of basis. In particular, a [[Lorentz scalar]] is invariant under a [[Lorentz transformation]]. The individual terms in the sum are not. When the basis is changed, the ''components'' of a vector change by a [[linear transformation]] described by a matrix. This led Einstein to propose the convention that repeated indices imply the summation is to be done.


As for covectors, they change by the inverse matrix. This is designed to guarantee that the linear function associated with the covector, the sum above, is the same no matter what the basis is.
As for covectors, they change by the [[inverse matrix]]. This is designed to guarantee that the linear function associated with the covector, the sum above, is the same no matter what the basis is.


The value of the Einstein convention is that it applies to other vector spaces built from {{math|''V''}} using the [[tensor product]] and [[dual space|duality]]. For example, {{math|''V'' ⊗ ''V''}}, the tensor product of {{math|''V''}} with itself, has a basis consisting of tensors of the form {{math|'''e'''<sub>''ij''</sub> {{=}} '''e'''<sub>''i''</sub> ⊗ '''e'''<sub>''j''</sub>}}. Any tensor {{math|'''T'''}} in {{math|''V'' ⊗ ''V''}} can be written as:
The value of the Einstein convention is that it applies to other [[vector space]]s built from {{math|''V''}} using the [[tensor product]] and [[dual space|duality]]. For example, {{math|''V'' ⊗&thinsp;''V''}}, the tensor product of {{math|''V''}} with itself, has a basis consisting of tensors of the form {{math|1='''e'''<sub>''ij''</sub> = '''e'''<sub>''i''</sub> ⊗ '''e'''<sub>''j''</sub>}}. Any tensor {{math|'''T'''}} in {{math|''V'' ⊗&thinsp;''V''}} can be written as:
<math display="block">\mathbf{T} = T^{ij}\mathbf{e}_{ij}.</math>

: <math>\mathbf{T} = T^{ij}\mathbf{e}_{ij}</math>.

{{math|''V''*}}, the dual of {{math|''V''}}, has a basis {{math|'''e'''<sup>1</sup>}}, {{math|'''e'''<sup>2</sup>}}, ..., {{math|'''e'''<sup>''n''}}</sup> which obeys the rule
:<math>\mathbf{e}^i (\mathbf{e}_j) = \delta^i_j.</math>


{{math|''V''&hairsp;*}}, the dual of {{math|''V''}}, has a basis {{math|'''e'''<sup>1</sup>}}, {{math|'''e'''<sup>2</sup>}}, ..., {{math|'''e'''<sup>''n''</sup>}} which obeys the rule
<math display="block">\mathbf{e}^i (\mathbf{e}_j) = \delta^i_j.</math>
where {{math|''δ''}} is the [[Kronecker delta]]. As
where {{math|''δ''}} is the [[Kronecker delta]]. As
: <math>\operatorname{Hom}(V,W) = V^* \otimes W</math>
<math display="block">\operatorname{Hom}(V, W) = V^* \otimes W</math>

the row/column coordinates on a matrix correspond to the upper/lower indices on the tensor product.
the row/column coordinates on a matrix correspond to the upper/lower indices on the tensor product.


== Common operations in this notation ==
== Common operations in this notation ==


In Einstein notation, the usual element reference {{math|''A<sub>mn</sub>''}} for the {{math|''m''}}th row and {{math|''n''}}th column of matrix {{math|'''A'''}} becomes {{math|''A<sup>m</sup><sub>n</sub>''}}. We can then write the following operations in Einstein notation as follows.
In Einstein notation, the usual element reference <math>A_{mn}</math> for the <math>m</math>-th row and <math>n</math>-th column of matrix <math>A</math> becomes <math>{A^m}_{n}</math>. We can then write the following operations in Einstein notation as follows.


=== [[Inner product]] (hence also [[vector dot product]]) ===
=== Inner product ===


Using an [[orthogonal basis]], the inner product is the sum of corresponding components multiplied together:
The [[inner product]] of two vectors is the sum of the products of their corresponding components, with the indices of one vector lowered (see [[#Raising and lowering indices]]):
<math display="block">\langle\mathbf u,\mathbf v\rangle = \langle\mathbf e_i, \mathbf e_j\rangle u^i v^j = u_j v^j</math>
In the case of an [[orthonormal basis]], we have <math>u^j = u_j</math>, and the expression simplifies to:
<math display="block">\langle\mathbf u,\mathbf v\rangle = \sum_j u^j v^j = u_j v^j</math>


=== Vector cross product ===
: <math> \mathbf{u} \cdot \mathbf{v} = u_j v^j </math>


In three dimensions, the [[cross product]] of two vectors with respect to a [[Orientation (vector space)|positively oriented]] orthonormal basis, meaning that <math>\mathbf e_1\times\mathbf e_2=\mathbf e_3</math>, can be expressed as:
This can also be calculated by multiplying the covector on the vector.
<math display="block">\mathbf{u} \times \mathbf{v} = \varepsilon^i_{\,jk} u^j v^k \mathbf{e}_i</math>


Here, <math>\varepsilon^i_{\,jk} = \varepsilon_{ijk}</math> is the [[Levi-Civita symbol]]. Since the basis is orthonormal, raising the index <math>i</math> does not alter the value of <math>\varepsilon_{ijk}</math>, when treated as a tensor.
=== [[Vector cross product]] ===

Again using an orthogonal basis (in 3 dimensions) the cross product intrinsically involves summations over permutations of components:

: <math> \mathbf{u} \times \mathbf{v}= \varepsilon^i{}_{jk}u^j v^k \mathbf{e}_i </math>

where

: <math>\varepsilon^i{}_{jk}=\delta^{il}\varepsilon_{ljk}</math>

{{math|''ε<sub>ijk</sub>''}} is the [[Levi-Civita symbol]], and {{math|''δ<sup>il</sup>''}} is the generalized [[Kronecker delta]]. Based on this definition of {{math|''ε''}}, there is no difference between {{math|''ε<sup>i</sup><sub>jk</sub>''}} and {{math|''ε<sub>ijk</sub>''}} but the position of indices.


=== Matrix-vector multiplication ===
=== Matrix-vector multiplication ===


The product of a matrix {{math|''A<sub>ij</sub>''}} with a column vector {{math|''v<sub>j</sub>''}} is :
The product of a matrix {{math|''A<sub>ij</sub>''}} with a column vector {{math|''v<sub>j</sub>''}} is:
<math display="block">\mathbf{u}_{i} = (\mathbf{A} \mathbf{v})_{i} = \sum_{j=1}^N A_{ij} v_{j}</math>

: <math> \mathbf{u}_{i} = (\mathbf{A} \mathbf{v})_{i} =\sum_{j=1}^N A_{ij} v_{j}</math>

equivalent to
equivalent to
<math display="block">u^i = {A^i}_j v^j </math>

: <math>u^i = A^i{}_j v^j </math>


This is a special case of matrix multiplication.
This is a special case of matrix multiplication.


=== [[Matrix multiplication]] ===
=== Matrix multiplication ===


The [[Matrix multiplication#matrix product|matrix product]] of two matrices {{math|''A<sub>ij</sub>''}} and {{math|''B<sub>jk</sub>''}} is:
The [[matrix multiplication|matrix product]] of two matrices {{math|''A<sub>ij</sub>''}} and {{math|''B<sub>jk</sub>''}} is:
<math display="block">\mathbf{C}_{ik} = (\mathbf{A} \mathbf{B})_{ik} =\sum_{j=1}^N A_{ij} B_{jk}</math>

: <math> \mathbf{C}_{ik} = (\mathbf{A} \mathbf{B})_{ik} =\sum_{j=1}^N A_{ij} B_{jk}</math>


equivalent to
equivalent to
<math display="block">{C^i}_k = {A^i}_j {B^j}_k</math>


=== Trace ===
: <math>C^i{}_k = A^i{}_j B^j{}_k </math>


For a [[square matrix]] {{math|''A<sup>i</sup><sub>j</sub>''}}, the [[Trace (linear algebra)|trace]] is the sum of the diagonal elements, hence the sum over a common index {{math|''A<sup>i</sup><sub>i</sub>''}}.
=== [[Trace (linear algebra)|Trace]] ===


=== Outer product ===
For a square matrix {{math|''A<sup>i</sup><sub>j</sub>''}}, the trace is the sum of the diagonal elements, hence the sum over a common index {{math|''A<sup>i</sup><sub>i</sub>''}}.


The [[outer product]] of the column vector {{math|''u<sup>i</sup>''}} by the row vector {{math|''v<sub>j</sub>''}} yields an {{math|''m''&thinsp;×&thinsp;''n''}} matrix {{math|'''A'''}}:
=== [[Outer product]] ===
<math display="block">{A^i}_j = u^i v_j = {(u v)^i}_j</math>

The outer product of the column vector {{math|''u<sup>i</sup>''}} by the row vector {{math|''v<sub>j</sub>''}} yields an {{math|''m'' × ''n''}} matrix {{math|'''A'''}}:

: <math>A^i{}_j = u^i v_j = (u v)^i{}_j</math>


Since {{math|''i''}} and {{math|''j''}} represent two ''different'' indices, there is no summation and the indices are not eliminated by the multiplication.
Since {{math|''i''}} and {{math|''j''}} represent two ''different'' indices, there is no summation and the indices are not eliminated by the multiplication.


=== [[Raising and lowering indices]] ===
=== Raising and lowering indices ===

Given a tensor, one can raise an index or lower an index by contracting the tensor with the [[metric tensor]], {{math|''g<sub>μν</sub>''}}. For example, take the tensor {{math|''T<sup>α</sup><sub>β</sub>''}}, one can raise an index:

: <math>T^{\mu\alpha} = g^{\mu\sigma} T_{\sigma}{}^{\alpha}</math>


Given a [[tensor]], one can [[Raising and lowering indices|raise an index or lower an index]] by contracting the tensor with the [[metric tensor]], {{math|''g<sub>μν</sub>''}}. For example, taking the tensor {{math|''T<sup>α</sup><sub>β</sub>''}}, one can lower an index:
Or one can lower an index:
<math display="block">g_{\mu\sigma} {T^\sigma}_\beta = T_{\mu\beta}</math>


Or one can raise an index:
: <math>T_{\mu\beta} = g_{\mu\sigma} T^{\sigma}{}_{\beta}</math>
<math display="block">g^{\mu\sigma} {T_\sigma}^\alpha = T^{\mu\alpha}</math>


== See also ==
== See also ==
Line 176: Line 152:


==External links==
==External links==
{{wikibooks|General relativity|Einstein Summation Notation}}
{{wikibooks|General Relativity|Einstein Summation Notation}}


* {{cite news
* {{cite news
Line 189: Line 165:
|url-status=dead
|url-status=dead
}}
}}
* {{cite web |title=Vector Calculation in Index Notation (Einstein's Summation Convention) |url=https://www.goldsilberglitzer.at/Rezepte/Rezept004E.pdf}}
* {{cite web |title=Understanding NumPy's einsum |url=https://stackoverflow.com/a/33641428 |website=Stack Overflow}}


{{tensors}}
{{tensors}}

Latest revision as of 21:20, 21 November 2024

In mathematics, especially the usage of linear algebra in mathematical physics and differential geometry, Einstein notation (also known as the Einstein summation convention or Einstein summation notation) is a notational convention that implies summation over a set of indexed terms in a formula, thus achieving brevity. As part of mathematics it is a notational subset of Ricci calculus; however, it is often used in physics applications that do not distinguish between tangent and cotangent spaces. It was introduced to physics by Albert Einstein in 1916.[1]

Introduction


Statement of convention


According to this convention, when an index variable appears twice in a single term and is not otherwise defined (see Free and bound variables), it implies summation of that term over all the values of the index. So where the indices can range over the set {1, 2, 3},
<math display="block">y = \sum_{i = 1}^3 x^i e_i = x^1 e_1 + x^2 e_2 + x^3 e_3 </math>
is simplified by the convention to:
<math display="block">y = x^i e_i </math>

The upper indices are not exponents but are indices of coordinates, coefficients or basis vectors. That is, in this context <math>x^2</math> should be understood as the second component of x rather than the square of x (this can occasionally lead to ambiguity). The upper index position in <math>x^i</math> is because, typically, an index occurs once in an upper (superscript) and once in a lower (subscript) position in a term (see § Application below). Typically, <math>(x^1\ x^2\ x^3)</math> would be equivalent to the traditional (x y z).
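
As a minimal sketch, the implied sum y = x^i e_i over the index set {1, 2, 3} can be spelled out as an explicit loop. The function name `contract` and the sample values are illustrative assumptions, and the e_i are taken to be plain numbers here for simplicity rather than basis vectors.

```python
def contract(x, e):
    """Return x^i e_i: the sum over the repeated index i."""
    if len(x) != len(e):
        raise ValueError("both factors must range over the same index set")
    return sum(x_i * e_i for x_i, e_i in zip(x, e))

# Components x^1, x^2, x^3 (upper indices are labels, not exponents).
x = [2.0, -1.0, 0.5]
# Illustrative numeric stand-ins for e_1, e_2, e_3.
e = [1.0, 10.0, 100.0]

y = contract(x, e)  # x^1 e_1 + x^2 e_2 + x^3 e_3
```

The explicit summation sign disappears from the formula, but the computation it stands for is unchanged.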

In general relativity, a common convention is that

  • the Greek alphabet is used for space and time components, where indices take on values 0, 1, 2, or 3 (frequently used letters are μ, ν, ...),
  • the Latin alphabet is used for spatial components only, where indices take on values 1, 2, or 3 (frequently used letters are i, j, ...).

In general, indices can range over any indexing set, including an infinite set. This should not be confused with a typographically similar convention used to distinguish between tensor index notation and the closely related but distinct basis-independent abstract index notation.

An index that is summed over is a summation index, in this case "i". It is also called a dummy index since any symbol can replace "i" without changing the meaning of the expression (provided that it does not collide with other index symbols in the same term).

An index that is not summed over is a free index and should appear only once per term. If such an index does appear, it usually also appears in every other term in an equation. An example of a free index is the "i" in the equation <math>v_i = a_i b_j x^j</math>, which is equivalent to the equation <math display="inline">v_i = \sum_j(a_{i} b_{j} x^{j})</math>.
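
The distinction between free and dummy indices can be illustrated with the equation v_i = a_i b_j x^j. In this sketch (function names and sample values are assumptions, and Python indices start at 0), j is summed away inside each component while i labels the components of the result. Renaming the dummy index changes nothing.

```python
def free_index_example(a, b, x):
    """Compute v_i = a_i b_j x^j for equal-length lists b and x."""
    n = len(b)
    # i is free: one output entry per value of i.
    # j is a dummy index: it is summed over and absent from the result.
    return [sum(a[i] * b[j] * x[j] for j in range(n)) for i in range(len(a))]

def same_with_renamed_dummy(a, b, x):
    """Identical computation with the dummy index renamed from j to k."""
    n = len(b)
    return [sum(a[i] * b[k] * x[k] for k in range(n)) for i in range(len(a))]

a = [1, 2, 3]
b = [1, 0, 2]
x = [3, 4, 5]
v = free_index_example(a, b, x)  # b_j x^j = 13, so v = [13, 26, 39]
```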

Application


Einstein notation can be applied in slightly different ways. Typically, each index occurs once in an upper (superscript) and once in a lower (subscript) position in a term; however, the convention can be applied more generally to any repeated indices within a term.[2] When dealing with covariant and contravariant vectors, where the position of an index indicates the type of vector, the first case usually applies; a covariant vector can only be contracted with a contravariant vector, corresponding to summation of the products of coefficients. On the other hand, when there is a fixed coordinate basis (or when not considering coordinate vectors), one may choose to use only subscripts; see § Superscripts and subscripts versus only subscripts below.

Vector representations


Superscripts and subscripts versus only subscripts


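In terms of covariance and contravariance of vectors,

  • upper indices represent components of contravariant vectors (vectors),
  • lower indices represent components of covariant vectors (covectors).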

They transform contravariantly or covariantly, respectively, with respect to change of basis.

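In recognition of this fact, the following notation uses the same symbol both for a vector or covector and its components, as in:

<math display="block">\begin{align}
v = v^i e_i = \begin{bmatrix} e_1 & e_2 & \cdots & e_n \end{bmatrix} \begin{bmatrix} v^1 \\ v^2 \\ \vdots \\ v^n \end{bmatrix} \\
w = w_i e^i = \begin{bmatrix} w_1 & w_2 & \cdots & w_n \end{bmatrix} \begin{bmatrix} e^1 \\ e^2 \\ \vdots \\ e^n \end{bmatrix}
\end{align}</math>

where v is the vector and <math>v^i</math> are its components (not the ith covector v), w is the covector and <math>w_i</math> are its components. The basis vector elements <math>e_i</math> are each column vectors, and the covector basis elements <math>e^i</math> are each row covectors. (See also § Abstract description; duality, below and the examples)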

In the presence of a non-degenerate form (an isomorphism <math>V \to V^*</math>, for instance a Riemannian metric or Minkowski metric), one can raise and lower indices.

A basis gives such a form (via the dual basis), hence when working on <math>\mathbf{R}^n</math> with a Euclidean metric and a fixed orthonormal basis, one has the option to work with only subscripts.

However, if one changes coordinates, the way that coefficients change depends on the variance of the object, and one cannot ignore the distinction; see Covariance and contravariance of vectors.

Mnemonics


In the above example, vectors are represented as n × 1 matrices (column vectors), while covectors are represented as 1 × n matrices (row covectors).

When using the column vector convention:

  • "Upper indices go up to down; lower indices go left to right."
  • "Covariant tensors are row vectors that have indices that are below (co-row-below)."
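  • Covectors are row vectors: <math display="block">\begin{bmatrix} w_1 & \cdots & w_k \end{bmatrix}.</math> Hence the lower index indicates which column you are in.
  • Contravariant vectors are column vectors: <math display="block">\begin{bmatrix} v^1 \\ \vdots \\ v^k \end{bmatrix}</math> Hence the upper index indicates which row you are in.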

Abstract description


The virtue of Einstein notation is that it represents the invariant quantities with a simple notation.

In physics, a scalar is invariant under transformations of basis. In particular, a Lorentz scalar is invariant under a Lorentz transformation. The individual terms in the sum are not. When the basis is changed, the components of a vector change by a linear transformation described by a matrix. This led Einstein to propose the convention that repeated indices imply the summation is to be done.

As for covectors, they change by the inverse matrix. This is designed to guarantee that the linear function associated with the covector, the sum above, is the same no matter what the basis is.
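
The invariance can be checked in one line. As a sketch, write M for the matrix by which the vector components change (the symbol M is introduced here for illustration and does not appear in the text above), so that the covector components change by its inverse:

```latex
v'^i = {M^i}_j v^j, \qquad w'_i = w_k {(M^{-1})^k}_i
\quad\Longrightarrow\quad
w'_i v'^i = w_k {(M^{-1})^k}_i {M^i}_j v^j = w_k \delta^k_j v^j = w_j v^j .
```

The repeated indices contract the matrix with its inverse into a Kronecker delta, so the sum w_i v^i has the same value in either basis.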

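The value of the Einstein convention is that it applies to other vector spaces built from V using the tensor product and duality. For example, V ⊗ V, the tensor product of V with itself, has a basis consisting of tensors of the form <math>\mathbf{e}_{ij} = \mathbf{e}_i \otimes \mathbf{e}_j</math>. Any tensor T in V ⊗ V can be written as:
<math display="block">\mathbf{T} = T^{ij}\mathbf{e}_{ij}.</math>

V*, the dual of V, has a basis <math>\mathbf{e}^1, \mathbf{e}^2, \ldots, \mathbf{e}^n</math> which obeys the rule
<math display="block">\mathbf{e}^i (\mathbf{e}_j) = \delta^i_j.</math>
where δ is the Kronecker delta. As
<math display="block">\operatorname{Hom}(V, W) = V^* \otimes W</math>
the row/column coordinates on a matrix correspond to the upper/lower indices on the tensor product.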

Common operations in this notation


In Einstein notation, the usual element reference <math>A_{mn}</math> for the <math>m</math>-th row and <math>n</math>-th column of matrix <math>A</math> becomes <math>{A^m}_{n}</math>. The following operations can then be written in Einstein notation.

Inner product


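The inner product of two vectors is the sum of the products of their corresponding components, with the indices of one vector lowered (see § Raising and lowering indices):
<math display="block">\langle\mathbf u,\mathbf v\rangle = \langle\mathbf e_i, \mathbf e_j\rangle u^i v^j = u_j v^j</math>
In the case of an orthonormal basis, we have <math>u^j = u_j</math>, and the expression simplifies to:
<math display="block">\langle\mathbf u,\mathbf v\rangle = \sum_j u^j v^j = u_j v^j</math>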
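
A small sketch of the two forms of the inner product, ⟨e_i, e_j⟩ u^i v^j and u_j v^j. The matrix `g` holding the values ⟨e_i, e_j⟩, the function names, and the sample numbers are assumptions chosen for illustration; the basis is deliberately not orthonormal so that lowering the index has a visible effect.

```python
def inner(u, v, g):
    """<u, v> = g_ij u^i v^j, with g[i][j] = <e_i, e_j>."""
    n = len(u)
    return sum(g[i][j] * u[i] * v[j] for i in range(n) for j in range(n))

def lower(u, g):
    """Lower the index: u_j = g_ij u^i."""
    n = len(u)
    return [sum(g[i][j] * u[i] for i in range(n)) for j in range(n)]

g = [[2, 1], [1, 3]]   # assumed symmetric Gram matrix of a non-orthonormal basis
u = [1, 2]             # components u^i
v = [3, 4]             # components v^j

double_sum = inner(u, v, g)                          # g_ij u^i v^j
u_low = lower(u, g)                                  # u_j
single_sum = sum(a * b for a, b in zip(u_low, v))    # u_j v^j
```

With the identity matrix in place of `g`, the lowered components equal the original ones and the result reduces to the familiar dot product.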

Vector cross product


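In three dimensions, the cross product of two vectors with respect to a positively oriented orthonormal basis, meaning that <math>\mathbf e_1\times\mathbf e_2=\mathbf e_3</math>, can be expressed as:
<math display="block">\mathbf{u} \times \mathbf{v} = \varepsilon^i_{\,jk} u^j v^k \mathbf{e}_i</math>

Here, <math>\varepsilon^i_{\,jk} = \varepsilon_{ijk}</math> is the Levi-Civita symbol. Since the basis is orthonormal, raising the index <math>i</math> does not alter the value of <math>\varepsilon_{ijk}</math>, when treated as a tensor.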
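
The double sum over j and k in the cross product formula can be sketched directly. This example assumes a positively oriented orthonormal basis and uses 0-based indices {0, 1, 2} in place of {1, 2, 3}; the closed-form expression for the Levi-Civita symbol is one convenient choice for three dimensions, and the function names are illustrative.

```python
def levi_civita(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2}: +1, -1 or 0."""
    # The product is +2 or -2 for permutations and 0 when an index repeats.
    return (i - j) * (j - k) * (k - i) // 2

def cross(u, v):
    """Components (u x v)^i = eps_ijk u^j v^k, summed over j and k."""
    r = range(3)
    return [sum(levi_civita(i, j, k) * u[j] * v[k] for j in r for k in r)
            for i in r]

e1, e2 = [1, 0, 0], [0, 1, 0]
e3 = cross(e1, e2)                 # positively oriented: e1 x e2 = e3
w = cross([1, 2, 3], [4, 5, 6])
```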

Matrix-vector multiplication


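The product of a matrix <math>A_{ij}</math> with a column vector <math>v_j</math> is:
<math display="block">\mathbf{u}_{i} = (\mathbf{A} \mathbf{v})_{i} = \sum_{j=1}^N A_{ij} v_{j}</math>
equivalent to
<math display="block">u^i = {A^i}_j v^j </math>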

This is a special case of matrix multiplication.
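
As a sketch, u^i = A^i_j v^j reads in code as one sum over the repeated index j for each value of the free index i. The nested-list representation (A[i][j] holding row i, column j) and the function name are assumptions made for this example.

```python
def matvec(A, v):
    """u^i = A^i_j v^j: sum over j, keep i."""
    return [sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A))]

A = [[1, 2],
     [3, 4]]
v = [5, 6]
u = matvec(A, v)
```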

Matrix multiplication


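The matrix product of two matrices <math>A_{ij}</math> and <math>B_{jk}</math> is:
<math display="block">\mathbf{C}_{ik} = (\mathbf{A} \mathbf{B})_{ik} =\sum_{j=1}^N A_{ij} B_{jk}</math>

equivalent to
<math display="block">{C^i}_k = {A^i}_j {B^j}_k</math>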
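
The index form C^i_k = A^i_j B^j_k maps onto code in the same way: i and k are free and label the entries of the result, while j is summed. The following is a minimal sketch with assumed names and sample matrices stored as nested lists.

```python
def matmul(A, B):
    """C^i_k = A^i_j B^j_k: i and k are free, j is summed."""
    inner_dim = len(B)
    return [[sum(A[i][j] * B[j][k] for j in range(inner_dim))
             for k in range(len(B[0]))]
            for i in range(len(A))]

A = [[1, 2],
     [3, 4]]
B = [[0, 1],
     [1, 0]]
C = matmul(A, B)
```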

Trace


For a square matrix <math>{A^i}_j</math>, the trace is the sum of the diagonal elements, hence the sum over a common index <math>{A^i}_i</math>.
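
A short sketch of the trace as a contraction, A^i_i. It also evaluates A^i_j B^j_i, the trace of a product written as a double contraction; because both indices are summed, the result does not depend on the order of the two factors. Function names and sample matrices are assumptions.

```python
def trace(A):
    """A^i_i: the upper and lower index are set equal and summed."""
    return sum(A[i][i] for i in range(len(A)))

def trace_of_product(A, B):
    """A^i_j B^j_i: both i and j are summed, leaving a scalar."""
    n = len(A)
    return sum(A[i][j] * B[j][i] for i in range(n) for j in range(n))

A = [[1, 2],
     [3, 4]]
B = [[0, 1],
     [1, 0]]
t = trace(A)
t_ab = trace_of_product(A, B)
t_ba = trace_of_product(B, A)
```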

Outer product


The outer product of the column vector <math>u^i</math> by the row vector <math>v_j</math> yields an m × n matrix A:
<math display="block">{A^i}_j = u^i v_j = {(u v)^i}_j</math>

Since i and j represent two different indices, there is no summation and the indices are not eliminated by the multiplication.
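
In code, the absence of a repeated index in A^i_j = u^i v_j means there is no sum at all: every pair (i, j) produces one entry. This sketch uses assumed names and sample values, with m = 3 and n = 2.

```python
def outer(u, v):
    """A^i_j = u^i v_j: no index repeats, so nothing is summed."""
    return [[u_i * v_j for v_j in v] for u_i in u]

u = [1, 2, 3]   # column vector components u^i (m = 3)
v = [4, 5]      # row vector components v_j (n = 2)
A = outer(u, v) # an m x n matrix
```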

Raising and lowering indices


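Given a tensor, one can raise an index or lower an index by contracting the tensor with the metric tensor, <math>g_{\mu\nu}</math>. For example, taking the tensor <math>{T^\alpha}_\beta</math>, one can lower an index:
<math display="block">g_{\mu\sigma} {T^\sigma}_\beta = T_{\mu\beta}</math>

Or one can raise an index:
<math display="block">g^{\mu\sigma} {T_\sigma}^\alpha = T^{\mu\alpha}</math>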
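
Both operations are the same contraction over σ, once with the metric g_{μσ} and once with its inverse g^{μσ}. The sketch below lowers the first index of a mixed tensor, T_{μβ} = g_{μσ} T^σ_β, and then raises it again. The two-dimensional metric with signature (-, +), the function name, and the tensor values are assumptions chosen to keep the example small.

```python
def contract_first(g, T):
    """Sum g[m][s] * T[s][b] over s: contraction on the first index of T."""
    n = len(g)
    return [[sum(g[m][s] * T[s][b] for s in range(n)) for b in range(n)]
            for m in range(n)]

# Assumed example metric: two-dimensional Minkowski, signature (-, +).
g_lower = [[-1, 0], [0, 1]]   # g_{mu sigma}
g_upper = [[-1, 0], [0, 1]]   # g^{mu sigma}, the matrix inverse of g_lower

T_mixed = [[1, 2], [3, 4]]                        # T^sigma_beta
T_lowered = contract_first(g_lower, T_mixed)      # T_{mu beta}
T_restored = contract_first(g_upper, T_lowered)   # index raised again
```

Because g^{μσ} g_{σν} = δ^μ_ν, raising an index that was just lowered returns the original components.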

See also


Notes

  1. This applies only for numerical indices. The situation is the opposite for abstract indices. Then, vectors themselves carry upper abstract indices and covectors carry lower abstract indices, as per the example in the introduction of this article. Elements of a basis of vectors may carry a lower numerical index and an upper abstract index.

References

  1. ^ Einstein, Albert (1916). "The Foundation of the General Theory of Relativity". Annalen der Physik. 354 (7): 769. Bibcode:1916AnP...354..769E. doi:10.1002/andp.19163540702. Archived from the original (PDF) on 2006-08-29. Retrieved 2006-09-03.
  2. ^ "Einstein Summation". Wolfram Mathworld. Retrieved 13 April 2011.

Bibliography
