Byers–Yang theorem: Difference between revisions
m Brienanni moved page Byers-yang theorem to Byers-Yang theorem: capitalization |
#suggestededit-add 1.0 Tags: Mobile edit Mobile app edit Android app edit |
||
(23 intermediate revisions by 13 users not shown) | |||
Line 1: | Line 1: | ||
{{Short description|Theorem in quantum mechanics}} |
|||
{{csb-pageincludes|1=http://www.scientificlib.com/en/Physics/TheoreticalPhysics/HellmannFeynmanTheorem.html}} |
|||
In [[quantum mechanics]], the '''Byers–Yang theorem''' states that all physical properties of a doubly connected system (an annulus) enclosing a magnetic flux <math>\Phi</math> through the opening are periodic in the flux with period <math>\Phi_0=h/e</math> (the [[magnetic flux quantum]]). The theorem was first stated and proven by [[Nina Byers]] and [[Chen-Ning Yang]] (1961),<ref>{{cite journal |last1 = Byers |first1 = N. |author1-link = Nina Byers |last2 = Yang |first2 = C. N. |author2-link = Yang Chen-Ning |title=Theoretical Considerations Concerning Quantized Magnetic Flux in Superconducting Cylinders |journal=[[Physical Review Letters]] |year=1961|volume=7|issue=2|pages=46–49 |doi = 10.1103/PhysRevLett.7.46 |bibcode = 1961PhRvL...7...46B }}</ref> and further developed by [[Felix Bloch]] (1970).<ref>{{cite journal |last=Bloch|first=F.|year=1970|title=Josephson Effect in a Superconducting Ring |journal=[[Physical Review B]]|volume=2|issue=1 |pages=109–121|doi=10.1103/PhysRevB.2.109 |bibcode = 1970PhRvB...2..109B }}</ref>
|||
In [[quantum mechanics]], the '''Byers–Yang theorem''' states that all physical properties of a doubly connected system (a ring) enclosing a magnetic flux <math>\Phi</math> through the opening are periodic in the flux with period <math>\Phi_0=h/e</math> (the so-called [[flux quantum]]). The theorem was first stated and proven by [[Nina Byers]] and [[Chen-Ning Yang]] (1961),<ref>{{cite journal |last1 = Byers |first1 = N. |last2 = Yang |first2 = C. N. |title=Theoretical Considerations Concerning Quantized Magnetic Flux in Superconducting Cylinders |journal=[[Physical Review Letters]] |year=1961|volume=7|issue=2|pages=46–49 |doi = 10.1103/PhysRevLett.7.46 |bibcode = 1961PhRvL...7...46B }}</ref> and later by [[Felix Bloch]] (1970).<ref>{{cite journal |last=Bloch|first=F.|year=1970|title=Josephson Effect in a Superconducting Ring |journal=[[Physical Review B]]|volume=2|issue=1 |pages=109–121|doi=10.1103/PhysRevB.2.109 |bibcode = 1970PhRvB...2..109B }}</ref>
|||
The Hellmann–Feynman theorem relates the derivative of the total energy with respect to a parameter to the [[Expectation value (quantum mechanics)|expectation value]] of the derivative of the [[Hamiltonian (quantum mechanics)|Hamiltonian]] with respect to that same parameter. According to the theorem, once the spatial distribution of the electrons has been determined by solving the [[Schrödinger equation]], all the forces in the system can be calculated using [[Classical electromagnetism|classical electrostatics]].
|||
==Proof== |
||
An enclosed flux <math>\Phi</math> corresponds to a vector potential <math>A(r)</math> inside the annulus with a line integral <math display="inline">\oint_C A\cdot dl=\Phi</math> along any path <math>C</math> that circulates around once. One can try to eliminate this vector potential by the [[gauge transformation]] |
|||
: <math>\psi'(\{r_n\})=\exp\left(\frac{ie}{\hbar}\sum_j\chi(r_j)\right)\psi(\{r_n\})</math> |
|||
of the [[wave function]] <math>\psi(\{r_n\})</math> of electrons at positions <math>r_1,r_2,\ldots</math>. The gauge-transformed wave function satisfies the same [[Schrödinger equation]] as the original wave function, but with a different [[magnetic vector potential]] <math>A'(r)=A(r)+\nabla\chi(r)</math>. It is assumed that the electrons experience zero magnetic field <math>B(r)=\nabla\times A(r)=0</math> at all points <math>r</math> inside the annulus, the field being nonzero only within the opening (where there are no electrons). It is then always possible to find a function <math>\chi(r)</math> such that <math>A'(r)=0</math> inside the annulus, so one would conclude that the system with enclosed flux <math>\Phi</math> is equivalent to a system with zero enclosed flux. |
|||
However, for an arbitrary <math>\Phi</math> the gauge-transformed wave function is no longer single-valued: the phase of <math>\psi'</math> changes by
|||
The flux Φ can be eliminated by a gauge transformation |
|||
: <math>\delta\phi=(e/\hbar)\oint_C\nabla\chi(r)\cdot dl=-(e/\hbar)\oint_C A(r)\cdot dl=-2\pi\Phi/\Phi_0</math> |
|||
whenever one of the coordinates <math>r_n</math> is moved once around the ring and back to its starting point. The requirement of a single-valued wave function therefore restricts the gauge transformation to fluxes <math>\Phi</math> that are an integer multiple of <math>\Phi_0</math>. Systems that enclose fluxes differing by an integer multiple of <math>\Phi_0=h/e</math> are therefore equivalent.
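The periodicity can be illustrated with the simplest doubly connected model, a single charged particle on a one-dimensional ring. This model is not part of the argument above; it is an assumption made here for illustration. Its standard energy levels are <math>E_n\propto(n-\Phi/\Phi_0)^2</math> for integer <math>n</math>, where the sign in front of <math>\Phi</math> depends on the sign convention for the charge. The sketch below, built around the hypothetical helper <code>ring_levels</code>, checks that adding one flux quantum only relabels the levels and leaves the spectrum unchanged.

```python
import math


def ring_levels(flux, n_max=50, count=5):
    """Lowest `count` energies of one particle on a 1D ring.

    Energies are in units of hbar^2 / (2 m R^2); `flux` is the enclosed
    flux in units of the flux quantum Phi_0. Illustrative model only.
    """
    energies = sorted((n - flux) ** 2 for n in range(-n_max, n_max + 1))
    return energies[:count]


levels_a = ring_levels(0.25)
levels_b = ring_levels(1.25)  # one additional flux quantum
levels_c = ring_levels(0.50)  # a different, non-equivalent flux

# Spectra at flux and flux + 1 coincide level by level.
same_spectrum = all(
    math.isclose(a, b, rel_tol=1e-12, abs_tol=1e-12)
    for a, b in zip(levels_a, levels_b)
)
print(levels_a)
print(levels_b)
print("periodic in the flux:", same_spectrum)
```

The quantum number <code>n</code> that labels a given level shifts by one between the two cases, which mirrors the role of the gauge transformation in the proof: the Hamiltonians are equivalent, but the states are relabelled.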
|||
This proof of the Hellmann–Feynman theorem requires that the wavefunction be an eigenfunction of the Hamiltonian under consideration; however, one can also prove more generally that the theorem holds for non-eigenfunction wavefunctions that are stationary (the partial derivative is zero) with respect to all relevant variables (such as orbital rotations). The [[Hartree–Fock]] wavefunction is an important example of an approximate eigenfunction that still satisfies the Hellmann–Feynman theorem. A notable example where the Hellmann–Feynman theorem is not applicable is finite-order [[Møller–Plesset perturbation theory]], which is not variational.<ref>{{cite book|last=Jensen|first=Frank|title=Introduction to Computational Chemistry|publisher=John Wiley & Sons|location=West Sussex|year=2007|isbn=0-470-01186-6|page=322}}</ref>
|||
The proof also employs an identity of normalized wavefunctions – that derivatives of the overlap of a wavefunction with itself must be zero. Using Dirac's [[bra–ket notation]] these two conditions are written as |
|||
:<math>\hat{H}_{\lambda}|\psi_\lambda\rangle = E_{\lambda}|\psi_\lambda\rangle,</math> |
|||
:<math>\langle\psi_\lambda|\psi_\lambda\rangle = 1 \Rightarrow \frac{\mathrm{d}}{\mathrm{d}\lambda}\langle\psi_\lambda|\psi_\lambda\rangle =0.</math> |
|||
The proof then follows through an application of the derivative [[product rule]] to the [[Expectation value (quantum mechanics)|expectation value]] of the Hamiltonian viewed as a function of λ: |
|||
:<math>
\begin{align}
\frac{\mathrm{d} E_{\lambda}}{\mathrm{d}\lambda} &= \frac{\mathrm{d}}{\mathrm{d}\lambda}\langle\psi_\lambda|\hat{H}_{\lambda}|\psi_\lambda\rangle \\
&=\bigg\langle\frac{\mathrm{d}\psi_\lambda}{\mathrm{d}\lambda}\bigg|\hat{H}_{\lambda}\bigg|\psi_\lambda\bigg\rangle + \bigg\langle\psi_\lambda\bigg|\hat{H}_{\lambda}\bigg|\frac{\mathrm{d}\psi_\lambda}{\mathrm{d}\lambda}\bigg\rangle + \bigg\langle\psi_\lambda\bigg|\frac{\mathrm{d}\hat{H}_{\lambda}}{\mathrm{d}\lambda}\bigg|\psi_\lambda\bigg\rangle \\
&=E_{\lambda}\bigg\langle\frac{\mathrm{d}\psi_\lambda}{\mathrm{d}\lambda}\bigg|\psi_\lambda\bigg\rangle + E_{\lambda}\bigg\langle\psi_\lambda\bigg|\frac{\mathrm{d}\psi_\lambda}{\mathrm{d}\lambda}\bigg\rangle + \bigg\langle\psi_\lambda\bigg|\frac{\mathrm{d}\hat{H}_{\lambda}}{\mathrm{d}\lambda}\bigg|\psi_\lambda\bigg\rangle \\
&=E_{\lambda}\frac{\mathrm{d}}{\mathrm{d}\lambda}\langle\psi_\lambda|\psi_\lambda\rangle + \bigg\langle\psi_\lambda\bigg|\frac{\mathrm{d}\hat{H}_{\lambda}}{\mathrm{d}\lambda}\bigg|\psi_\lambda\bigg\rangle \\
&=\bigg\langle\psi_\lambda\bigg|\frac{\mathrm{d}\hat{H}_{\lambda}}{\mathrm{d}\lambda}\bigg|\psi_\lambda\bigg\rangle.
\end{align}
</math>
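The chain of equalities can be checked numerically on a small example. The sketch below assumes a two-level Hamiltonian <math>\hat{H}_\lambda=\begin{pmatrix}\lambda & 1\\ 1 & -\lambda\end{pmatrix}</math>, chosen here purely for illustration (it does not come from the text), and compares a finite-difference derivative of the ground-state energy with the expectation value of <math>\mathrm{d}\hat{H}_\lambda/\mathrm{d}\lambda</math>.

```python
import math


def ground_state(lam):
    """Ground-state energy and normalized eigenvector of [[lam, 1], [1, -lam]]."""
    energy = -math.sqrt(lam * lam + 1.0)
    # (H - E) v = 0 is solved by v = (1, E - lam), up to normalization.
    v0, v1 = 1.0, energy - lam
    norm = math.hypot(v0, v1)
    return energy, (v0 / norm, v1 / norm)


def expectation_dH(lam):
    """<psi| dH/dlam |psi>, where dH/dlam = diag(1, -1) for this model."""
    _, (v0, v1) = ground_state(lam)
    return v0 * v0 - v1 * v1


def dE_numeric(lam, h=1e-6):
    """Central finite-difference estimate of dE/dlam."""
    return (ground_state(lam + h)[0] - ground_state(lam - h)[0]) / (2.0 * h)


lam = 0.7
lhs = dE_numeric(lam)      # derivative of the eigenvalue
rhs = expectation_dH(lam)  # expectation value of the derivative
print(lhs, rhs)
```

Both numbers agree with the closed form <math>-\lambda/\sqrt{\lambda^2+1}</math> for this model, up to the finite-difference error.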
|||
For a critical examination of the proof, see Carfì (2010).<ref>{{cite journal|last=Carfì|first=David|year=2010|title=The pointwise Hellmann–Feynman theorem|journal=AAPP Physical, Mathematical, and Natural Sciences|volume=88|issue=1|at=no. C1A1001004|doi=10.1478/C1A1001004|issn=1825-1242 }}</ref>
|||
==Alternate proof== |
|||
The Hellmann–Feynman theorem is actually a direct, and to some extent trivial, consequence of the variational principle (the [[Rayleigh–Ritz method|Rayleigh-Ritz variational principle]]) from which the Schrödinger equation can be made to derive. This is why the Hellmann–Feynman theorem holds for wave-functions (such as the Hartree–Fock wave-function) that, though not eigenfunctions of the Hamiltonian, do derive from a variational principle. This is also why it holds, e.g., in [[density functional theory]], which is not wave-function based and for which the standard derivation does not apply. |
|||
According to the Rayleigh–Ritz variational principle, the eigenfunctions of the Schrödinger equation are stationary points of the functional (which we nickname ''Schrödinger functional'' for brevity): |
|||
{{NumBlk|:|<math>E[\psi,\lambda]=\frac{\langle\psi|\hat{H}_{\lambda}|\psi\rangle}{\langle\psi|\psi\rangle}.</math>|{{EquationRef|2}}}} |
|||
The eigenvalues are the values that the Schrödinger functional takes at the stationary points: |
|||
{{NumBlk|:|<math>E_{\lambda}=E[\psi_{\lambda},\lambda],</math>|{{EquationRef|3}}}} |
|||
where <math>\psi_{\lambda} </math> satisfies the variational condition: |
|||
{{NumBlk|:|<math>\left.\frac{\delta E[\psi,\lambda]}{\delta\psi(x)}\right|_{\psi=\psi_{\lambda}}=0.</math>|{{EquationRef|4}}}} |
|||
Let us differentiate Eq. (3) using the [[chain rule]]: |
|||
{{NumBlk|:|<math> \frac{dE_{\lambda}}{d\lambda}=\frac{\partial E[\psi_{\lambda},\lambda]}{\partial\lambda}+\int\frac{\delta E[\psi,\lambda]}{\delta\psi(x)}\frac{d\psi_{\lambda}(x)}{d\lambda}dx. </math>|{{EquationRef|5}}}} |
|||
Due to the variational condition, Eq. (4), the second term in Eq. (5) vanishes. In one sentence, the Hellmann–Feynman theorem states that ''the derivative of the stationary values of a function(al) with respect to a parameter on which it may depend, can be computed from the explicit dependence only, disregarding the implicit one''. On account of the fact that the Schrödinger functional can only depend explicitly on an external parameter through the Hamiltonian, Eq. (1) trivially follows. As simple as that. |
|||
==Example applications== |
|||
===Molecular forces=== |
|||
The most common application of the Hellmann–Feynman theorem is the calculation of [[intramolecular]] forces in molecules. This allows for the calculation of [[molecular geometry|equilibrium geometries]], the nuclear coordinates at which the forces acting upon the nuclei, due to the electrons and the other nuclei, vanish. The parameter λ corresponds to the coordinates of the nuclei. For a molecule with ''N'' electrons at coordinates {'''r'''<sub>''i''</sub>} (1 ≤ ''i'' ≤ ''N'') and ''M'' nuclei (1 ≤ α ≤ ''M''), each located at a specified point '''R'''<sub>α</sub> = (''X''<sub>α</sub>, ''Y''<sub>α</sub>, ''Z''<sub>α</sub>) and with nuclear charge ''Z''<sub>α</sub>, the [[molecular Hamiltonian|clamped nucleus Hamiltonian]] is
|||
:<math>\hat{H}=\hat{T} + \hat{U} - \sum_{i=1}^{N}\sum_{\alpha=1}^{M}\frac{Z_{\alpha}}{|\mathbf{r}_{i}-\mathbf{R}_{\alpha}|} + \sum_{\alpha}^{M}\sum_{\beta>\alpha}^{M}\frac{Z_{\alpha}Z_{\beta}}{|\mathbf{R}_{\alpha}-\mathbf{R}_{\beta}|}.</math> |
|||
The x-component of the force acting on a given nucleus is equal to the negative of the derivative of the total energy with respect to that coordinate. By the Hellmann–Feynman theorem, this is equal to
|||
:<math>F_{X_{\gamma}} = -\frac{\partial E}{\partial X_{\gamma}} = -\bigg\langle\psi\bigg|\frac{\partial\hat{H}}{\partial X_{\gamma}}\bigg|\psi\bigg\rangle.</math> |
|||
Only two components of the Hamiltonian contribute to the required derivative – the electron-nucleus and nucleus-nucleus terms. Differentiating the Hamiltonian yields<ref name="piela">{{cite book|last=Piela|first= Lucjan|title=Ideas of Quantum Chemistry|publisher=Elsevier Science|location=Amsterdam|year=2006|page=620|isbn=0-444-52227-1}}</ref> |
|||
:<math>
\begin{align}
\frac{\partial\hat{H}}{\partial X_{\gamma}} &= \frac{\partial}{\partial X_{\gamma}} \left(- \sum_{i=1}^{N}\sum_{\alpha=1}^{M}\frac{Z_{\alpha}}{|\mathbf{r}_{i}-\mathbf{R}_{\alpha}|} + \sum_{\alpha}^{M}\sum_{\beta>\alpha}^{M}\frac{Z_{\alpha}Z_{\beta}}{|\mathbf{R}_{\alpha}-\mathbf{R}_{\beta}|}\right) \\
&=-Z_{\gamma}\sum_{i=1}^{N}\frac{x_{i}-X_{\gamma}}{|\mathbf{r}_{i}-\mathbf{R}_{\gamma}|^{3}} +Z_{\gamma}\sum_{\alpha\neq\gamma}^{M}Z_{\alpha}\frac{X_{\alpha}-X_{\gamma}}{|\mathbf{R}_{\alpha}-\mathbf{R}_{\gamma}|^{3}}.
\end{align}
</math>

Inserting this into the Hellmann–Feynman theorem gives the x-component of the force on the given nucleus in terms of the [[electronic density]] ''ρ''('''r'''), the atomic coordinates and the nuclear charges:

:<math>F_{X_{\gamma}} = Z_{\gamma}\left(\int\mathrm{d}\mathbf{r}\ \rho(\mathbf{r})\frac{x-X_{\gamma}}{|\mathbf{r}-\mathbf{R}_{\gamma}|^{3}} - \sum_{\alpha\neq\gamma}^{M}Z_{\alpha}\frac{X_{\alpha}-X_{\gamma}}{|\mathbf{R}_{\alpha}-\mathbf{R}_{\gamma}|^{3}}\right).</math>
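The signs in the derivative of the two Coulomb terms are easy to get wrong, so a numerical check is useful. The sketch below is an illustration under simplifying assumptions that are not in the text: atomic units, a single electron held at a fixed position instead of a density, one additional nucleus, and made-up coordinates and charges. It compares the analytic derivative with respect to ''X''<sub>γ</sub> with a central finite difference.

```python
import math

# Hypothetical test configuration (atomic units).
ELECTRON = (0.4, 0.3, -0.2)     # fixed electron position r
OTHER_NUCLEUS = (1.5, 0.0, 0.0)  # position R_alpha of the second nucleus
Z_GAMMA, Z_ALPHA = 1.0, 2.0      # nuclear charges


def coulomb_terms(X):
    """Electron-nucleus plus nucleus-nucleus energy for nucleus gamma at (X, 0, 0)."""
    R = (X, 0.0, 0.0)
    attraction = -Z_GAMMA / math.dist(ELECTRON, R)
    repulsion = Z_GAMMA * Z_ALPHA / math.dist(OTHER_NUCLEUS, R)
    return attraction + repulsion


def analytic_derivative(X):
    """d/dX of coulomb_terms: -Z (x - X)/|r - R|^3 + Z Z' (X' - X)/|R' - R|^3."""
    R = (X, 0.0, 0.0)
    d_e = math.dist(ELECTRON, R)
    d_n = math.dist(OTHER_NUCLEUS, R)
    return (-Z_GAMMA * (ELECTRON[0] - X) / d_e**3
            + Z_GAMMA * Z_ALPHA * (OTHER_NUCLEUS[0] - X) / d_n**3)


def numeric_derivative(X, h=1e-6):
    """Central finite difference of coulomb_terms."""
    return (coulomb_terms(X + h) - coulomb_terms(X - h)) / (2.0 * h)


X = 0.0
analytic = analytic_derivative(X)
numeric = numeric_derivative(X)
force_x = -analytic  # force component on the nucleus for this configuration
print(analytic, numeric, force_x)
```

In this configuration the electron sits at larger ''x'' than the nucleus and pulls it in the positive direction, while the second nucleus, also at larger ''x'', pushes it in the negative direction.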
|||
===Expectation values=== |
|||
An alternative approach for applying the Hellmann–Feynman theorem is to promote a fixed or discrete parameter that appears in a Hamiltonian to a continuous variable, solely for the mathematical purpose of taking a derivative. Possible parameters are physical constants or discrete quantum numbers. As an example, the Hamiltonian in the [[Hydrogen-like atom|radial Schrödinger equation for a hydrogen-like atom]] is
|||
:<math>\hat{H}_{l}=-\frac{\hbar^{2}}{2\mu r^2}\left(\frac{\mathrm{d}}{\mathrm{d}r}\left(r^{2}\frac{\mathrm{d}}{\mathrm{d}r}\right)-l(l+1)\right) -\frac{Ze^{2}}{r},</math> |
|||
which depends upon the discrete [[azimuthal quantum number]] ''l''. Promoting ''l'' to be a continuous parameter allows for the derivative of the Hamiltonian to be taken: |
|||
:<math>\frac{\partial \hat{H}_{l}}{\partial l} = \frac{\hbar^{2}}{2\mu r^{2}}(2l+1).</math> |
|||
The Hellmann–Feynman theorem then allows for the determination of the expectation value of <math>\frac{1}{r^{2}}</math> for hydrogen-like atoms:<ref>{{cite book|last=Fitts|first=Donald D.|title=Principles of Quantum Mechanics : as Applied to Chemistry and Chemical Physics|publisher=Cambridge University Press|location=Cambridge|year=2002|isbn=0-521-65124-7|page=186}}</ref> |
|||
:<math>
\begin{align}
\bigg\langle\psi_{nl}\bigg|\frac{1}{r^{2}}\bigg|\psi_{nl}\bigg\rangle &= \frac{2\mu}{\hbar^{2}}\frac{1}{2l+1}\bigg\langle\psi_{nl}\bigg|\frac{\partial \hat{H}_{l}}{\partial l}\bigg|\psi_{nl}\bigg\rangle \\
&=\frac{2\mu}{\hbar^{2}}\frac{1}{2l+1}\frac{\partial E_{n}}{\partial l} \\
&=\frac{2\mu}{\hbar^{2}}\frac{1}{2l+1}\frac{\partial E_{n}}{\partial n}\frac{\partial n}{\partial l} \\
&=\frac{2\mu}{\hbar^{2}}\frac{1}{2l+1}\frac{Z^{2}\mu e^{4}}{\hbar^{2}n^{3}} \\
&=\frac{Z^{2}\mu^{2}e^{4}}{\hbar^{4}n^{3}(l+1/2)}.
\end{align}
</math>
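The closed form can be compared with a direct radial integral. The sketch below works in atomic units (<math>\mu=e=\hbar=1</math>) with <math>Z=1</math> and uses the standard normalized hydrogen radial functions <math>R_{10}=2e^{-r}</math> and <math>R_{21}=re^{-r/2}/(2\sqrt{6})</math>; these functions and the helper names are supplied here for illustration and are not taken from the text.

```python
import math


def inv_r2_formula(n, l, Z=1.0):
    """<1/r^2> from the closed form, in atomic units (mu = e = hbar = 1)."""
    return Z**2 / (n**3 * (l + 0.5))


def inv_r2_numeric(radial, r_max=60.0, steps=200_000):
    """Midpoint-rule estimate of the integral of R(r)^2 r^2 (1/r^2) over r."""
    h = r_max / steps
    return sum(radial((k + 0.5) * h) ** 2 for k in range(steps)) * h


def R_1s(r):
    """Normalized hydrogen 1s radial function (n = 1, l = 0)."""
    return 2.0 * math.exp(-r)


def R_2p(r):
    """Normalized hydrogen 2p radial function (n = 2, l = 1)."""
    return r * math.exp(-r / 2.0) / (2.0 * math.sqrt(6.0))


check_1s = (inv_r2_numeric(R_1s), inv_r2_formula(1, 0))
check_2p = (inv_r2_numeric(R_2p), inv_r2_formula(2, 1))
print(check_1s)
print(check_2p)
```

In these units the formula gives 2 for the 1s state and 1/12 for the 2p state, and the numerical integrals reproduce both values.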
|||
===Van der Waals forces=== |
|||
At the end of his paper, Feynman states that "[[Van der Waals force|Van der Waals's forces]] can also be interpreted as arising from charge distributions with higher concentration between the nuclei. The Schrödinger perturbation theory for two interacting atoms at a separation ''R'', large compared to the radii of the atoms, leads to the result that the charge distribution of each is distorted from central symmetry, a dipole moment of order 1/''R''<sup>7</sup> being induced in each atom. The negative charge distribution of each atom has its center of gravity moved slightly toward the other. It is not the interaction of these dipoles which leads to van der Waals's force, but rather the attraction of each nucleus for the distorted charge distribution of its ''own'' electrons that gives the attractive 1/''R''<sup>7</sup> force."
|||
==Hellmann–Feynman theorem for time-dependent wavefunctions== |
|||
For a general time-dependent wavefunction satisfying the time-dependent [[Schrödinger equation]], the Hellmann–Feynman theorem is '''not''' valid. |
|||
However, the following identity holds: |
|||
:<math>
\bigg\langle\Psi_\lambda(t)\bigg|\frac{\partial H_\lambda}{\partial\lambda}\bigg|\Psi_\lambda(t)\bigg\rangle = i \hbar \frac{\partial}{\partial t}\bigg\langle\Psi_\lambda(t)\bigg|\frac{\partial \Psi_\lambda(t)}{\partial \lambda}\bigg\rangle
</math>

for any wavefunction satisfying

:<math>
i\hbar\frac{\partial\Psi_\lambda(t)}{\partial t}=H_\lambda\Psi_\lambda(t).
</math>
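As a small numerical illustration of this identity, the sketch below assumes a two-level system with <math>H_\lambda=\lambda\sigma_z</math> and <math>\hbar=1</math>, so that <math>\Psi_\lambda(t)=(a\,e^{-i\lambda t},\,b\,e^{i\lambda t})</math> and the left-hand side is <math>|a|^2-|b|^2</math>. The model, the amplitudes and the step sizes are choices made here for illustration only; both derivatives on the right-hand side are taken by central finite differences.

```python
import cmath

A, B = 0.6, 0.8  # hypothetical real amplitudes with A^2 + B^2 = 1


def psi(lam, t):
    """Solution of i dPsi/dt = lam * sigma_z * Psi with Psi(0) = (A, B)."""
    return (A * cmath.exp(-1j * lam * t), B * cmath.exp(1j * lam * t))


def overlap(lam, t, h=1e-5):
    """<Psi | dPsi/dlam>, with the lam-derivative taken by central difference."""
    center = psi(lam, t)
    plus, minus = psi(lam + h, t), psi(lam - h, t)
    dpsi = [(p - m) / (2.0 * h) for p, m in zip(plus, minus)]
    return sum(c.conjugate() * d for c, d in zip(center, dpsi))


def right_hand_side(lam, t, h=1e-4):
    """i d/dt <Psi | dPsi/dlam>, with the t-derivative taken by central difference."""
    return 1j * (overlap(lam, t + h) - overlap(lam, t - h)) / (2.0 * h)


lam, t = 0.9, 1.3
lhs = A**2 - B**2  # <Psi| dH/dlam |Psi> = <sigma_z>, constant in time
rhs = right_hand_side(lam, t)
print(lhs, rhs)
```

The overlap itself grows linearly in time, which is why the ordinary Hellmann–Feynman statement does not carry over to this setting while the time-differentiated identity does.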
|||
===Proof=== |
|||
The proof only relies on the Schrödinger equation and the assumption that partial derivatives with respect to λ and t can be interchanged. |
|||
:<math>
\begin{align}
\bigg\langle\Psi_\lambda(t)\bigg|\frac{\partial H_\lambda}{\partial\lambda}\bigg|\Psi_\lambda(t)\bigg\rangle &=
\frac{\partial}{\partial\lambda}\langle\Psi_\lambda(t)|H_\lambda|\Psi_\lambda(t)\rangle
- \bigg\langle\frac{\partial\Psi_\lambda(t)}{\partial\lambda}\bigg|H_\lambda\bigg|\Psi_\lambda(t)\bigg\rangle
- \bigg\langle\Psi_\lambda(t)\bigg|H_\lambda\bigg|\frac{\partial\Psi_\lambda(t)}{\partial\lambda}\bigg\rangle \\
&= i\hbar \frac{\partial}{\partial\lambda}\bigg\langle\Psi_\lambda(t)\bigg|\frac{\partial\Psi_\lambda(t)}{\partial t}\bigg\rangle
- i\hbar\bigg\langle\frac{\partial\Psi_\lambda(t)}{\partial\lambda}\bigg|\frac{\partial\Psi_\lambda(t)}{\partial t}\bigg\rangle
+ i\hbar\bigg\langle\frac{\partial\Psi_\lambda(t)}{\partial t}\bigg|\frac{\partial\Psi_\lambda(t)}{\partial\lambda}\bigg\rangle \\
&= i\hbar \bigg\langle\Psi_\lambda(t)\bigg| \frac{\partial^2\Psi_\lambda(t)}{\partial\lambda \partial t}\bigg\rangle
+ i\hbar\bigg\langle\frac{\partial\Psi_\lambda(t)}{\partial t}\bigg|\frac{\partial\Psi_\lambda(t)}{\partial\lambda}\bigg\rangle \\
&= i \hbar \frac{\partial}{\partial t}\bigg\langle\Psi_\lambda(t)\bigg|\frac{\partial \Psi_\lambda(t)}{\partial \lambda}\bigg\rangle
\end{align}
</math>

==Applications==

An overview of physical effects governed by the Byers–Yang theorem is given by [[Yoseph Imry]].<ref>{{cite book|last=Imry|first=Y.|title=Introduction to Mesoscopic Physics|publisher=Oxford University Press|year=1997|isbn=0-19-510167-7 }}</ref> These include the [[Aharonov–Bohm effect]], [[persistent current]] in normal metals, and [[flux quantization]] in superconductors.
|||
==References== |
||
{{Reflist}} |
||
{{DEFAULTSORT:Byers-Yang theorem}} |
||
[[Category:Theorems in quantum mechanics]] |
||
[[Category:Intermolecular forces]] |
|||
[[Category:Theorems in quantum physics]] |
|||
[[Category:Richard Feynman]] |
Latest revision as of 05:04, 10 October 2023
In quantum mechanics, the Byers–Yang theorem states that all physical properties of a doubly connected system (an annulus) enclosing a magnetic flux <math>\Phi</math> through the opening are periodic in the flux with period <math>\Phi_0=h/e</math> (the magnetic flux quantum). The theorem was first stated and proven by Nina Byers and Chen-Ning Yang (1961),[1] and further developed by Felix Bloch (1970).[2]
Proof
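An enclosed flux <math>\Phi</math> corresponds to a vector potential <math>A(r)</math> inside the annulus with a line integral <math>\oint_C A\cdot dl=\Phi</math> along any path <math>C</math> that circulates around once. One can try to eliminate this vector potential by the gauge transformation
: <math>\psi'(\{r_n\})=\exp\left(\frac{ie}{\hbar}\sum_j\chi(r_j)\right)\psi(\{r_n\})</math>
of the wave function <math>\psi(\{r_n\})</math> of electrons at positions <math>r_1,r_2,\ldots</math>. The gauge-transformed wave function satisfies the same Schrödinger equation as the original wave function, but with a different magnetic vector potential <math>A'(r)=A(r)+\nabla\chi(r)</math>. It is assumed that the electrons experience zero magnetic field <math>B(r)=\nabla\times A(r)=0</math> at all points <math>r</math> inside the annulus, the field being nonzero only within the opening (where there are no electrons). It is then always possible to find a function <math>\chi(r)</math> such that <math>A'(r)=0</math> inside the annulus, so one would conclude that the system with enclosed flux <math>\Phi</math> is equivalent to a system with zero enclosed flux.
However, for an arbitrary <math>\Phi</math> the gauge-transformed wave function is no longer single-valued: the phase of <math>\psi'</math> changes by
: <math>\delta\phi=(e/\hbar)\oint_C\nabla\chi(r)\cdot dl=-(e/\hbar)\oint_C A(r)\cdot dl=-2\pi\Phi/\Phi_0</math>
whenever one of the coordinates <math>r_n</math> is moved once around the ring and back to its starting point. The requirement of a single-valued wave function therefore restricts the gauge transformation to fluxes <math>\Phi</math> that are an integer multiple of <math>\Phi_0</math>. Systems that enclose fluxes differing by an integer multiple of <math>h/e</math> are therefore equivalent.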
Applications
An overview of physical effects governed by the Byers–Yang theorem is given by Yoseph Imry.[3] These include the Aharonov–Bohm effect, persistent current in normal metals, and flux quantization in superconductors.
References
1. Byers, N.; Yang, C. N. (1961). "Theoretical Considerations Concerning Quantized Magnetic Flux in Superconducting Cylinders". Physical Review Letters. 7 (2): 46–49. Bibcode:1961PhRvL...7...46B. doi:10.1103/PhysRevLett.7.46.
2. Bloch, F. (1970). "Josephson Effect in a Superconducting Ring". Physical Review B. 2 (1): 109–121. Bibcode:1970PhRvB...2..109B. doi:10.1103/PhysRevB.2.109.
3. Imry, Y. (1997). Introduction to Mesoscopic Physics. Oxford University Press. ISBN 0-19-510167-7.