Schur product theorem: Difference between revisions

Revision as of 01:51, 27 November 2013

In mathematics, particularly in linear algebra, the Schur product theorem states that the Hadamard product of two positive definite matrices is also a positive definite matrix. The result is named after Issai Schur^[1] (Schur 1911, p. 14, Theorem VII). Note that Schur signed as J. Schur in Journal für die reine und angewandte Mathematik.^[2]^[3]
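To make the statement concrete, the following minimal sketch checks it on one pair of matrices. The matrices `M` and `N` and the helper names (`hadamard`, `det`, `is_positive_definite`) are illustrative assumptions, not part of the source. Positive definiteness is tested with Sylvester's criterion (all leading principal minors positive), which applies to symmetric matrices.

```python
def hadamard(M, N):
    """Entrywise (Hadamard) product of two equally sized matrices."""
    return [[m * n for m, n in zip(row_m, row_n)] for row_m, row_n in zip(M, N)]


def det(A):
    """Determinant by Laplace expansion along the first row (fine for small matrices)."""
    if len(A) == 1:
        return A[0][0]
    total = 0
    for col, entry in enumerate(A[0]):
        minor = [row[:col] + row[col + 1:] for row in A[1:]]
        total += (-1) ** col * entry * det(minor)
    return total


def is_positive_definite(A):
    """Sylvester's criterion for a symmetric matrix: all leading principal minors > 0."""
    return all(det([row[:k] for row in A[:k]]) > 0 for k in range(1, len(A) + 1))


# Illustrative symmetric positive definite matrices chosen for this sketch.
M = [[2, 1, 0], [1, 2, 1], [0, 1, 2]]
N = [[3, -1, 1], [-1, 3, -1], [1, -1, 3]]

P = hadamard(M, N)
print(P)
print(is_positive_definite(M), is_positive_definite(N), is_positive_definite(P))
```

A single example proves nothing, of course; it only shows what the theorem asserts for one choice of inputs.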

Proof

Proof using the trace formula

A direct computation shows that for symmetric matrices $M$ and $N$ (in particular, for positive definite ones), the Hadamard product $M\circ N$ considered as a bilinear form acts on vectors $a,b$ as

a^{T}(M\circ N)b=\operatorname {Tr} (M\operatorname {diag} (a)N\operatorname {diag} (b))

where $\operatorname {Tr}$ is the matrix trace and $\operatorname {diag} (a)$ is the diagonal matrix having as diagonal entries the elements of $a$ .
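The trace formula can be checked numerically on small integer matrices. This is only a sketch: the symmetric matrices `M`, `N`, the vectors `a`, `b`, and all helper names are assumptions made for illustration.

```python
def hadamard(M, N):
    """Entrywise (Hadamard) product."""
    return [[m * n for m, n in zip(row_m, row_n)] for row_m, row_n in zip(M, N)]


def matmul(A, B):
    """Ordinary matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*B)] for row in A]


def diag(v):
    """Diagonal matrix with the entries of v on the diagonal."""
    n = len(v)
    return [[v[i] if i == j else 0 for j in range(n)] for i in range(n)]


def trace(A):
    """Sum of the diagonal entries."""
    return sum(A[i][i] for i in range(len(A)))


def bilinear(a, A, b):
    """The scalar a^T A b."""
    return sum(a[i] * A[i][j] * b[j] for i in range(len(a)) for j in range(len(b)))


# Illustrative symmetric matrices and vectors (assumed for this sketch).
M = [[2, 1], [1, 3]]
N = [[4, -1], [-1, 2]]
a = [1, 2]
b = [3, -1]

lhs = bilinear(a, hadamard(M, N), b)                          # a^T (M o N) b
rhs = trace(matmul(matmul(matmul(M, diag(a)), N), diag(b)))   # Tr(M diag(a) N diag(b))
print(lhs, rhs)
```

Integer inputs keep the comparison exact, so the two sides can be compared with `==` rather than a floating-point tolerance.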

Since $M$ and $N$ are positive definite, we can consider their square-roots $M^{1/2}$ and $N^{1/2}$ and write

\operatorname {Tr} (M\operatorname {diag} (a)N\operatorname {diag} (b))=\operatorname {Tr} (M^{1/2}M^{1/2}\operatorname {diag} (a)N^{1/2}N^{1/2}\operatorname {diag} (b))=\operatorname {Tr} (M^{1/2}\operatorname {diag} (a)N^{1/2}N^{1/2}\operatorname {diag} (b)M^{1/2})

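Then, for $a=b$, this is $\operatorname {Tr} (A^{T}A)$ with $A=N^{1/2}\operatorname {diag} (a)M^{1/2}$. Since $\operatorname {Tr} (A^{T}A)=\sum _{ij}A_{ij}^{2}\geq 0$, with equality only when $A=0$, and since $M^{1/2}$ and $N^{1/2}$ are invertible, $A\neq 0$ whenever $a\neq 0$. Hence $a^{T}(M\circ N)a>0$ for every $a\neq 0$, which shows that $M\circ N$ is a positive definite matrix.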

Proof using Gaussian integration

Case of M = N

Let $X$ be an $n$ -dimensional centered Gaussian random variable with covariance $\langle X_{i}X_{j}\rangle =M_{ij}$ . Then the covariance matrix of $X_{i}^{2}$ and $X_{j}^{2}$ is

\operatorname {Cov} (X_{i}^{2},X_{j}^{2})=\langle X_{i}^{2}X_{j}^{2}\rangle -\langle X_{i}^{2}\rangle \langle X_{j}^{2}\rangle

Using Wick's theorem to develop $\langle X_{i}^{2}X_{j}^{2}\rangle =2\langle X_{i}X_{j}\rangle ^{2}+\langle X_{i}^{2}\rangle \langle X_{j}^{2}\rangle$ we have

\operatorname {Cov} (X_{i}^{2},X_{j}^{2})=2\langle X_{i}X_{j}\rangle ^{2}=2M_{ij}^{2}

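A covariance matrix is always positive semidefinite, so the matrix with elements $M_{ij}^{2}$ is positive semidefinite. It is in fact positive definite: because $M$ is positive definite, $X$ has a density on $\mathbb {R} ^{n}$, so for $a\neq 0$ the combination $\sum _{i}a_{i}X_{i}^{2}$ is not almost surely constant and its variance $2\sum _{ij}a_{i}a_{j}M_{ij}^{2}$ is strictly positive.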
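As a small worked instance of the case $M=N$, take unit variances and correlation $\rho$ with $|\rho |<1$. The specific $2\times 2$ matrix is an illustrative choice, not taken from the source.

```latex
M=\begin{pmatrix}1&\rho\\ \rho&1\end{pmatrix},
\qquad
\operatorname{Cov}(X_i^2,X_j^2)=2M_{ij}^2
\;\Longrightarrow\;
M\circ M=\begin{pmatrix}1&\rho^2\\ \rho^2&1\end{pmatrix},
\qquad
\det(M\circ M)=1-\rho^4>0 .
```

Both leading principal minors of $M\circ M$, namely $1$ and $1-\rho ^{4}$, are positive, so $M\circ M$ is positive definite in this example, in agreement with the general argument.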

General case

Let $X$ and $Y$ be $n$-dimensional centered Gaussian random variables with covariances $\langle X_{i}X_{j}\rangle =M_{ij}$, $\langle Y_{i}Y_{j}\rangle =N_{ij}$, independent of each other, so that

\langle X_{i}Y_{j}\rangle =0

for any $i,j$.

Then the covariance matrix of $X_{i}Y_{i}$ and $X_{j}Y_{j}$ is

\operatorname {Cov} (X_{i}Y_{i},X_{j}Y_{j})=\langle X_{i}Y_{i}X_{j}Y_{j}\rangle -\langle X_{i}Y_{i}\rangle \langle X_{j}Y_{j}\rangle

Using Wick's theorem to develop

\langle X_{i}Y_{i}X_{j}Y_{j}\rangle =\langle X_{i}X_{j}\rangle \langle Y_{i}Y_{j}\rangle +\langle X_{i}Y_{i}\rangle \langle X_{j}Y_{j}\rangle +\langle X_{i}Y_{j}\rangle \langle X_{j}Y_{i}\rangle

and also using the independence of $X$ and $Y$ , we have

\operatorname {Cov} (X_{i}Y_{i},X_{j}Y_{j})=\langle X_{i}X_{j}\rangle \langle Y_{i}Y_{j}\rangle =M_{ij}N_{ij}

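A covariance matrix is always positive semidefinite, so the matrix with elements $M_{ij}N_{ij}$ is positive semidefinite. It is in fact positive definite: because $M$ and $N$ are positive definite, $(X,Y)$ has a density on $\mathbb {R} ^{2n}$, so for $a\neq 0$ the combination $\sum _{i}a_{i}X_{i}Y_{i}$ is not almost surely constant and its variance $\sum _{ij}a_{i}a_{j}M_{ij}N_{ij}$ is strictly positive.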
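The Wick expansion used in the general case can be reproduced mechanically by summing over all pairings of the four factors. The sketch below does this for a block-diagonal joint covariance of $(X,Y)$. The matrices `M`, `N` and the helper names (`pairings`, `gaussian_moment`, `cov_of_products`) are assumptions for illustration. Note that it applies Wick's theorem rather than testing it against samples.

```python
def pairings(items):
    """Yield every way to split a list into unordered pairs (nothing if the length is odd)."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for k, partner in enumerate(rest):
        remaining = rest[:k] + rest[k + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


def gaussian_moment(indices, cov):
    """Moment <Z_a Z_b ...> of a centered Gaussian vector via Wick's theorem."""
    total = 0
    for pairing in pairings(list(indices)):
        term = 1
        for p, q in pairing:
            term *= cov[p][q]
        total += term
    return total


# Illustrative covariance matrices (assumed for this sketch).
M = [[2, 1], [1, 3]]
N = [[4, -1], [-1, 2]]
n = len(M)

# Joint covariance of Z = (X_1..X_n, Y_1..Y_n); the off-diagonal blocks
# vanish because X and Y are independent.
cov = [[0] * (2 * n) for _ in range(2 * n)]
for i in range(n):
    for j in range(n):
        cov[i][j] = M[i][j]
        cov[n + i][n + j] = N[i][j]


def cov_of_products(i, j):
    """Cov(X_i Y_i, X_j Y_j) = <X_i Y_i X_j Y_j> - <X_i Y_i><X_j Y_j>."""
    fourth = gaussian_moment([i, n + i, j, n + j], cov)
    return fourth - cov[i][n + i] * cov[j][n + j]


C = [[cov_of_products(i, j) for j in range(n)] for i in range(n)]
print(C)
```

The resulting matrix `C` coincides entry by entry with the Hadamard product of `M` and `N`, which is the identity the proof relies on.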

Proof using eigendecomposition

Proof of positivity

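Let $M=\sum _{i}\mu _{i}m_{i}m_{i}^{T}$ and $N=\sum _{j}\nu _{j}n_{j}n_{j}^{T}$ be eigendecompositions of $M$ and $N$, with eigenvalues $\mu _{i},\nu _{j}>0$ and orthonormal eigenvectors $m_{i}$, $n_{j}$. Then

M\circ N=\sum _{ij}\mu _{i}\nu _{j}(m_{i}m_{i}^{T})\circ (n_{j}n_{j}^{T})=\sum _{ij}\mu _{i}\nu _{j}(m_{i}\circ n_{j})(m_{i}\circ n_{j})^{T}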

Each $(m_{i}\circ n_{j})(m_{i}\circ n_{j})^{T}$ is positive semidefinite (but, except in the 1-dimensional case, not positive definite, since it has rank at most 1) and $\mu _{i}\nu _{j}>0$, thus the sum giving $M\circ N$ is also positive semidefinite.
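The step that turns a Hadamard product of rank-one matrices into a single rank-one matrix, $(mm^{T})\circ (nn^{T})=(m\circ n)(m\circ n)^{T}$, can be checked directly. In this sketch the vectors `m`, `n` and the helper names are illustrative assumptions. The vectors are not normalized, since the identity holds for arbitrary vectors.

```python
def outer(u, v):
    """Outer product u v^T."""
    return [[ui * vj for vj in v] for ui in u]


def hadamard_vec(u, v):
    """Entrywise product of two vectors."""
    return [ui * vi for ui, vi in zip(u, v)]


def hadamard(A, B):
    """Entrywise product of two matrices."""
    return [[x * y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]


# Illustrative vectors (assumed for this sketch).
m = [1, 2, -1]
n = [3, 0, 2]

lhs = hadamard(outer(m, m), outer(n, n))   # (m m^T) o (n n^T)
w = hadamard_vec(m, n)                     # m o n
rhs = outer(w, w)                          # (m o n)(m o n)^T
print(lhs == rhs)
```

Entry $(k,l)$ of both sides equals $m_{k}m_{l}n_{k}n_{l}$, which is why the two matrices agree.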

Complete proof

To show that the result is positive definite requires further proof. We shall show that for any vector $a\neq 0$ , we have $a^{T}(M\circ N)a>0$ . Continuing as above, each $a^{T}(m_{i}\circ n_{j})(m_{i}\circ n_{j})^{T}a\geq 0$ , so it remains to show that there exist $i$ and $j$ for which the inequality is strict. For this we observe that

a^{T}(m_{i}\circ n_{j})(m_{i}\circ n_{j})^{T}a=\left(\sum _{k}m_{i,k}n_{j,k}a_{k}\right)^{2}

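Since $N$ is positive definite, its eigenvectors $n_{j}$ span the whole space, so there is a $j$ for which the vector with components $n_{j,k}a_{k}$ is not the zero vector, that is, $n_{j,k}a_{k}\neq 0$ for at least one $k$. Then, since $M$ is positive definite, its eigenvectors $m_{i}$ also span the whole space, so there is an $i$ for which $\sum _{k}m_{i,k}n_{j,k}a_{k}\neq 0$. For this $i$ and $j$ we have $\left(\sum _{k}m_{i,k}n_{j,k}a_{k}\right)^{2}>0$, and since $\mu _{i}\nu _{j}>0$ the corresponding term of $a^{T}(M\circ N)a$ is strictly positive. This completes the proof.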

References

^ doi:10.1515/crll.1911.140.1
^ doi:10.1007/b105056, page 9, Ch. 0.6 Publication under J. Schur
^ doi:10.1112/blms/15.2.97

External links

Bemerkungen zur Theorie der beschränkten Bilinearformen mit unendlich vielen Veränderlichen at EUDML

@@ Line 1: / Line 1: @@
-In [[mathematics]], particularly in [[linear algebra]], the '''Schur product theorem''' states that the [[Hadamard_product_(matrices)|Hadamard product]] of two [[positive definite matrices]] is also a positive definite matrix. The result is named after [[Issai Schur]]<ref name="Sch1911">{{Cite doi|10.1515/crll.1911.140.1}}</ref> (Schur 1911, p. 14, Theorem VII) (note that Schur signed as J. Schur in ''Journal für die reine und angewandte Mathematik''<ref>{{Cite doi|10.1007/b105056}}, page 9, Ch. 0.6 ''Publication under J. Schur''</ref><ref>{{Cite doi|10.1112/blms/15.2.97}}</ref>.)
+In [[mathematics]], particularly in [[linear algebra]], the '''Schur product theorem''' states that the [[Hadamard product (matrices)|Hadamard product]] of two [[positive definite matrices]] is also a positive definite matrix. The result is named after [[Issai Schur]]<ref name="Sch1911">{{Cite doi|10.1515/crll.1911.140.1}}</ref> (Schur 1911, p.&nbsp;14, Theorem VII) (note that Schur signed as J. Schur in ''Journal für die reine und angewandte Mathematik''.<ref>{{Cite doi|10.1007/b105056}}, page 9, Ch. 0.6 ''Publication under J. Schur''</ref><ref>{{Cite doi|10.1112/blms/15.2.97}}</ref>)
 == Proof ==
@@ Line 7: / Line 7: @@
 It is easy to show that for matrices <math>M</math> and <math>N</math>, the Hadamard product <math>M \circ N</math> considered as a bilinear form acts on vectors <math>a, b</math> as
 : <math>a^T (M \circ N) b = \operatorname{Tr}(M \operatorname{diag}(a) N \operatorname{diag}(b))</math>
-where <math>\operatorname{Tr}</math> is the matrix [[Trace_(linear_algebra)|trace]] and <math>\operatorname{diag}(a)</math> is the [[diagonal matrix]] having as diagonal entries the elements of <math>a</math>.
+where <math>\operatorname{Tr}</math> is the matrix [[Trace (linear algebra)|trace]] and <math>\operatorname{diag}(a)</math> is the [[diagonal matrix]] having as diagonal entries the elements of <math>a</math>.
 Since <math>M</math> and <math>N</math> are positive definite, we can consider their square-roots <math>M^{1/2}</math> and <math>N^{1/2}</math> and write
@@ Line 47: / Line 47: @@
 Let <math>M = \sum \mu_i m_i m_i^T</math> and <math>N = \sum \nu_i n_i n_i^T</math>.  Then
 : <math>M \circ N = \sum_{ij} \mu_i \nu_i (m_i m_i^T) \circ (n_i n_i^T) = \sum_{ij} \mu_i \nu_j (m_i \circ n_j) (m_i \circ n_j)^T</math>
-Each <math>(m_i \circ n_j) (m_i \circ n_j)^T</math> is positive (but, except in the 1-dimensional case, not positive definite, since they are [[Rank_(linear_algebra)|rank]] 1 matrices) and <math>\mu_i \nu_j > 0</math>, thus the sum giving <math>M \circ N</math> is also positive.
+Each <math>(m_i \circ n_j) (m_i \circ n_j)^T</math> is positive (but, except in the 1-dimensional case, not positive definite, since they are [[Rank (linear algebra)|rank]] 1 matrices) and <math>\mu_i \nu_j > 0</math>, thus the sum giving <math>M \circ N</math> is also positive.
 ==== Complete proof ====
@@ Line 62: / Line 62: @@
 == External links ==
+* [https://eudml.org/doc/149352 Bemerkungen zur Theorie der beschränkten Bilinearformen mit unendlich vielen Veränderlichen] at [https://eudml.org EUDML]
-[https://eudml.org/doc/149352 Bemerkungen zur Theorie der beschränkten Bilinearformen mit unendlich vielen Veränderlichen] at [https://eudml.org EUDML]
 [[Category:Linear algebra]]