In [[mathematics]], particularly in [[linear algebra]], the '''Schur product theorem''' states that the [[Hadamard product (matrices)|Hadamard product]] of two [[positive definite matrices]] is also a positive definite matrix.
The result is named after [[Issai Schur]]<ref name="Sch1911">{{Cite journal | doi = 10.1515/crll.1911.140.1 | title = Bemerkungen zur Theorie der beschränkten Bilinearformen mit unendlich vielen Veränderlichen | journal = Journal für die reine und angewandte Mathematik | volume = 1911 | issue = 140 | pages = 1–28| year = 1911 | last1 = Schur | first1 = J. | s2cid = 120411177 }}</ref> (Schur 1911, p. 14, Theorem VII) (note that Schur signed as J. Schur in ''Journal für die reine und angewandte Mathematik''.<ref>{{Cite book | editor1-last = Zhang | editor1-first = Fuzhen | title = The Schur Complement and Its Applications | doi = 10.1007/b105056 | series = Numerical Methods and Algorithms | volume = 4 | year = 2005 | isbn = 0-387-24271-6 }}, page 9, Ch. 0.6 ''Publication under J. Schur''</ref><ref>{{Cite journal | last1 = Ledermann | first1 = W. | title = Issai Schur and His School in Berlin | doi = 10.1112/blms/15.2.97 | journal = Bulletin of the London Mathematical Society | volume = 15 | issue = 2 | pages = 97–106 | year = 1983 }}</ref>) |

The converse of the theorem holds in the following sense: if <math>M</math> is a symmetric matrix and the Hadamard product <math>M \circ N</math> is positive definite for all positive definite matrices <math>N</math>, then <math>M</math> itself is positive definite.
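
As a quick numerical illustration of the theorem (a minimal sketch assuming the [[NumPy]] library; not part of the formal statement or proofs), one can generate two random positive definite matrices and confirm that every eigenvalue of their Hadamard product is strictly positive:

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)
n = 4

# Random positive definite matrices: A A^T is positive semidefinite,
# and adding the identity makes it strictly positive definite.
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))
M = A @ A.T + np.eye(n)
N = B @ B.T + np.eye(n)

H = M * N                      # NumPy's * is the entrywise (Hadamard) product
print(np.linalg.eigvalsh(H))   # all eigenvalues strictly positive
</syntaxhighlight>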

== Proof ==

=== Proof using the trace formula ===

For any matrices <math>M</math> and <math>N</math>, the Hadamard product <math>M \circ N</math> considered as a bilinear form acts on vectors <math>a, b</math> as

: <math>a^* (M \circ N) b = \operatorname{tr}\left(M^\textsf{T} \operatorname{diag}\left(a^*\right) N \operatorname{diag}(b)\right)</math>

where <math>\operatorname{tr}</math> is the matrix [[trace (linear algebra)|trace]] and <math>\operatorname{diag}(a)</math> is the [[diagonal matrix]] having as diagonal entries the elements of <math>a</math>.
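
This trace identity holds for arbitrary matrices, not only definite ones, and can be checked numerically (a minimal sketch assuming NumPy):

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(1)
n = 4
M = rng.standard_normal((n, n))
N = rng.standard_normal((n, n))
a = rng.standard_normal(n) + 1j * rng.standard_normal(n)
b = rng.standard_normal(n) + 1j * rng.standard_normal(n)

# a^* (M ∘ N) b versus tr(M^T diag(a^*) N diag(b))
lhs = a.conj() @ (M * N) @ b
rhs = np.trace(M.T @ np.diag(a.conj()) @ N @ np.diag(b))
print(np.isclose(lhs, rhs))   # True
</syntaxhighlight>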

Suppose <math>M</math> and <math>N</math> are positive definite, and so [[Hermitian matrix|Hermitian]]. We can consider their square-roots <math>M^\frac{1}{2}</math> and <math>N^\frac{1}{2}</math>, which are also Hermitian, and write

: <math>
\operatorname{tr}\left(M^\textsf{T} \operatorname{diag}\left(a^*\right) N \operatorname{diag}(b)\right) =
\operatorname{tr}\left(\overline{M}^\frac{1}{2} \overline{M}^\frac{1}{2} \operatorname{diag}\left(a^*\right) N^\frac{1}{2} N^\frac{1}{2} \operatorname{diag}(b)\right) =
\operatorname{tr}\left(\overline{M}^\frac{1}{2} \operatorname{diag}\left(a^*\right) N^\frac{1}{2} N^\frac{1}{2} \operatorname{diag}(b) \overline{M}^\frac{1}{2}\right)
</math>

Then, for <math>a = b</math>, this is written as <math>\operatorname{tr}\left(A^* A\right)</math> for <math>A = N^\frac{1}{2} \operatorname{diag}(a) \overline{M}^\frac{1}{2}</math> and thus is strictly positive for <math>A \neq 0</math>, which occurs if and only if <math>a \neq 0</math>. This shows that <math>(M \circ N)</math> is a positive definite matrix.
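
The key step, writing <math>a^* (M \circ N) a</math> as <math>\operatorname{tr}\left(A^* A\right)</math> (the squared Frobenius norm of <math>A</math>), can likewise be verified numerically. The following is a sketch assuming NumPy; the helper <code>psd_sqrt</code> is an illustrative name, not a library function:

<syntaxhighlight lang="python">
import numpy as np

def psd_sqrt(P):
    """Hermitian square root of a Hermitian positive definite matrix."""
    w, V = np.linalg.eigh(P)
    return (V * np.sqrt(w)) @ V.conj().T

rng = np.random.default_rng(2)
n = 4
G = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
H = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
M = G @ G.conj().T + np.eye(n)   # Hermitian positive definite
N = H @ H.conj().T + np.eye(n)
a = rng.standard_normal(n) + 1j * rng.standard_normal(n)

# A = N^(1/2) diag(a) conj(M)^(1/2); then a^* (M ∘ N) a = tr(A^* A)
A = psd_sqrt(N) @ np.diag(a) @ psd_sqrt(M.conj())
lhs = a.conj() @ (M * N) @ a
rhs = np.trace(A.conj().T @ A)
print(np.isclose(lhs, rhs))      # True; rhs is manifestly nonnegative
</syntaxhighlight>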

=== Proof using Gaussian integration ===

==== Case of ''M'' = ''N'' ====

Let <math>X</math> be an <math>n</math>-dimensional centered [[Gaussian random variable]] with [[covariance]] <math>\langle X_i X_j \rangle = M_{ij}</math>. Then the covariance matrix of <math>X_i^2</math> and <math>X_j^2</math> is

: <math>\operatorname{Cov}\left(X_i^2, X_j^2\right) = \left\langle X_i^2 X_j^2 \right\rangle - \left\langle X_i^2 \right\rangle \left\langle X_j^2 \right\rangle</math>

Using [[Isserlis' theorem|Wick's theorem]] to develop <math>\left\langle X_i^2 X_j^2 \right\rangle = 2 \left\langle X_i X_j \right\rangle^2 + \left\langle X_i^2 \right\rangle \left\langle X_j^2 \right\rangle</math> we have

: <math>\operatorname{Cov}\left(X_i^2, X_j^2\right) = 2 \left\langle X_i X_j \right\rangle^2 = 2 M_{ij}^2</math>

Since a covariance matrix is positive definite, this proves that the matrix with elements <math>M_{ij}^2</math> is a positive definite matrix.
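
A Monte Carlo sketch of this argument (assuming NumPy; dimension and sample size chosen arbitrarily) compares the empirical covariance of the squared coordinates with <math>2 M_{ij}^2</math>:

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(3)
n, samples = 3, 200_000
A = rng.standard_normal((n, n))
M = A @ A.T + np.eye(n)           # covariance of X

X = rng.multivariate_normal(np.zeros(n), M, size=samples)
emp = np.cov(X**2, rowvar=False)  # empirical Cov(X_i^2, X_j^2)
print(np.round(emp, 2))
print(np.round(2 * M**2, 2))      # should agree up to sampling error
</syntaxhighlight>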

==== General case ====

Let <math>X</math> and <math>Y</math> be <math>n</math>-dimensional centered [[Gaussian random variable]]s with [[covariance]]s <math>\left\langle X_i X_j \right\rangle = M_{ij}</math>, <math>\left\langle Y_i Y_j \right\rangle = N_{ij}</math>, independent of each other so that we have

: <math>\left\langle X_i Y_j \right\rangle = 0</math> for any <math>i, j</math>

Then the covariance matrix of <math>X_i Y_i</math> and <math>X_j Y_j</math> is

: <math>\operatorname{Cov}\left(X_i Y_i, X_j Y_j\right) = \left\langle X_i Y_i X_j Y_j \right\rangle - \left\langle X_i Y_i \right\rangle \left\langle X_j Y_j \right\rangle</math>

Using [[Isserlis' theorem|Wick's theorem]] to develop

: <math>\left\langle X_i Y_i X_j Y_j \right\rangle = \left\langle X_i X_j \right\rangle \left\langle Y_i Y_j \right\rangle + \left\langle X_i Y_i \right\rangle \left\langle X_j Y_j \right\rangle + \left\langle X_i Y_j \right\rangle \left\langle X_j Y_i \right\rangle</math>

and also using the independence of <math>X</math> and <math>Y</math>, we have

: <math>\operatorname{Cov}\left(X_i Y_i, X_j Y_j\right) = \left\langle X_i X_j \right\rangle \left\langle Y_i Y_j \right\rangle = M_{ij} N_{ij}</math>

Since a covariance matrix is positive definite, this proves that the matrix with elements <math>M_{ij} N_{ij}</math> is a positive definite matrix.
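
The same kind of simulation sketch covers the general case, with <math>Y</math> drawn independently of <math>X</math>:

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(4)
n, samples = 3, 200_000
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))
M = A @ A.T + np.eye(n)
N = B @ B.T + np.eye(n)

X = rng.multivariate_normal(np.zeros(n), M, size=samples)
Y = rng.multivariate_normal(np.zeros(n), N, size=samples)  # independent of X
emp = np.cov(X * Y, rowvar=False)  # empirical Cov(X_i Y_i, X_j Y_j)
print(np.round(emp, 2))
print(np.round(M * N, 2))          # should agree up to sampling error
</syntaxhighlight>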

=== Proof using eigendecomposition ===

==== Proof of positive semidefiniteness ====

Let <math>M = \sum \mu_i m_i m_i^\textsf{T}</math> and <math>N = \sum \nu_i n_i n_i^\textsf{T}</math> be the [[Eigendecomposition of a matrix|eigendecompositions]] of <math>M</math> and <math>N</math>. Then

: <math>M \circ N = \sum_{ij} \mu_i \nu_j \left(m_i m_i^\textsf{T}\right) \circ \left(n_j n_j^\textsf{T}\right) = \sum_{ij} \mu_i \nu_j \left(m_i \circ n_j\right) \left(m_i \circ n_j\right)^\textsf{T}</math>

Each <math>\left(m_i \circ n_j\right) \left(m_i \circ n_j\right)^\textsf{T}</math> is positive semidefinite (but, except in the 1-dimensional case, not positive definite, since they are [[rank (linear algebra)|rank]] 1 matrices). Also, <math>\mu_i \nu_j > 0</math>, thus the sum <math>M \circ N</math> is also positive semidefinite.
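
This rank-one expansion is easy to confirm numerically (a sketch assuming NumPy; <code>eigh</code> returns the eigenpairs of a symmetric matrix):

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(5)
n = 4
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))
M = A @ A.T + np.eye(n)
N = B @ B.T + np.eye(n)

mu, m = np.linalg.eigh(M)     # columns m[:, i] are the eigenvectors m_i
nu, v = np.linalg.eigh(N)     # columns v[:, j] are the eigenvectors n_j

# Sum of rank-one terms mu_i nu_j (m_i ∘ n_j)(m_i ∘ n_j)^T
S = sum(mu[i] * nu[j] * np.outer(m[:, i] * v[:, j], m[:, i] * v[:, j])
        for i in range(n) for j in range(n))
print(np.allclose(S, M * N))  # True
</syntaxhighlight>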

==== Proof of definiteness ====

To show that the result is positive definite requires a further argument. We shall show that for any vector <math>a \neq 0</math>, we have <math>a^\textsf{T} (M \circ N) a > 0</math>. Continuing as above, each <math>a^\textsf{T} \left(m_i \circ n_j\right) \left(m_i \circ n_j\right)^\textsf{T} a \ge 0</math>, so it remains to show that there exist <math>i</math> and <math>j</math> for which the corresponding term is nonzero. For this we observe that

: <math>a^\textsf{T} \left(m_i \circ n_j\right) \left(m_i \circ n_j\right)^\textsf{T} a = \left(\sum_k m_{i,k} n_{j,k} a_k\right)^2</math>

Since <math>N</math> is positive definite, there is a <math>j</math> for which <math>n_j \circ a \neq 0</math> (since otherwise <math>n_j^\textsf{T} a = \sum_k (n_j \circ a)_k = 0</math> for every <math>j</math>, which would force <math>a = 0</math>), and likewise since <math>M</math> is positive definite there exists an <math>i</math> for which <math>\sum_k m_{i,k} (n_j \circ a)_k = m_i^\textsf{T} (n_j \circ a) \neq 0.</math> However, this last sum is just <math>\sum_k m_{i,k} n_{j,k} a_k</math>. Thus its square is positive. This completes the proof.
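
The observation above is an instance of the elementary identity <math>a^\textsf{T} v v^\textsf{T} a = \left(v^\textsf{T} a\right)^2</math> with <math>v = m_i \circ n_j</math>; a one-line numerical check (assuming NumPy):

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(6)
n = 5
m_i, n_j, a = rng.standard_normal((3, n))

v = m_i * n_j                       # m_i ∘ n_j
lhs = a @ np.outer(v, v) @ a        # a^T (m_i ∘ n_j)(m_i ∘ n_j)^T a
rhs = np.sum(m_i * n_j * a) ** 2    # (Σ_k m_{i,k} n_{j,k} a_k)^2
print(np.isclose(lhs, rhs))         # True
</syntaxhighlight>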

== References ==

{{reflist}}

[[Category:Linear algebra]]
[[Category:Matrix theory]]
[[Category:Issai Schur]] |