Legendre transformation

In mathematics, two differentiable functions f and g are said to be Legendre transforms of each other if their first derivatives are inverse functions of each other:

Df=\left(Dg\right)^{-1}

f and g are then said to be related by a Legendre transformation. Legendre transformations are named after Adrien-Marie Legendre. They are unique up to an additive constant. A Legendre transformation is its own inverse, and is related to integration by parts.

Applications

Legendre transformations are used in thermodynamics to transform between the different thermodynamic potentials, and in classical mechanics to derive Hamiltonian mechanics from Lagrangian mechanics, as well as the other way around.

Examples

The exponential function e^x has x ln x − x as a Legendre transform, since the respective first derivatives e^x and ln x are inverse to each other.
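The exponential example can be checked numerically. The following is a minimal sketch, assuming the one-dimensional formula g(y) = y x − f(x) with x = (f′)⁻¹(y) that is derived later in this article; the helper names legendre_1d and g_exp are hypothetical and chosen only for illustration.

```python
import math

def legendre_1d(f, f_prime_inv, y):
    """Evaluate g(y) = y*x - f(x) at x = (f')^{-1}(y)."""
    x = f_prime_inv(y)
    return y * x - f(x)

def g_exp(y):
    # f(x) = e^x has f'(x) = e^x, whose inverse is the natural logarithm.
    return legendre_1d(math.exp, math.log, y)

for y in (0.5, 1.0, 2.0, 10.0):
    # The derivatives e^x and ln y undo each other ...
    assert math.isclose(math.exp(math.log(y)), y, rel_tol=1e-12)
    # ... and the formula reproduces y ln y - y.
    assert math.isclose(g_exp(y), y * math.log(y) - y, rel_tol=1e-12)
```

The transform is only evaluated for y > 0, which is the range of the derivative e^x.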

Similarly, the quadratic form

u(x)={\frac {1}{2}}\,x^{t}\,A\,x

with A a symmetric invertible n-by-n matrix has

v(y)={\frac {1}{2}}\,y^{t}\,A^{-1}\,y

as a Legendre transform.
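For a symmetric matrix A the gradient of u is A x and the gradient of v is A⁻¹ y, so the two gradient maps are inverse to each other. The sketch below checks this for one hypothetical 2-by-2 matrix, together with the relation u(x) + v(y) = ⟨x, y⟩ at y = A x. The matrix entries and the test point are arbitrary illustrative choices.

```python
# Hypothetical symmetric, invertible 2x2 matrix chosen for illustration.
A = [[2.0, 1.0],
     [1.0, 3.0]]

def mat_vec(M, x):
    return [M[0][0] * x[0] + M[0][1] * x[1],
            M[1][0] * x[0] + M[1][1] * x[1]]

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

def inverse_2x2(M):
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det],
            [-M[1][0] / det, M[0][0] / det]]

A_inv = inverse_2x2(A)

def u(x):
    return 0.5 * dot(x, mat_vec(A, x))

def v(y):
    return 0.5 * dot(y, mat_vec(A_inv, y))

x = [0.7, -1.2]
y = mat_vec(A, x)            # gradient of u at x (uses symmetry of A)
x_back = mat_vec(A_inv, y)   # gradient of v at y

# The gradient maps are inverse to each other.
assert all(abs(a - b) < 1e-12 for a, b in zip(x, x_back))
# At corresponding points, u(x) + v(y) equals the scalar product <x, y>.
assert abs(u(x) + v(y) - dot(x, y)) < 1e-12
```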

Legendre transformation in one dimension

In one dimension, a Legendre transform of a function f : R → R with an invertible first derivative may be found using the formula

g(y)=y\,x-f(x),\,x=f^{\prime -1}(y)

This formula can be derived by integrating both sides of the defining condition, restricted to one dimension,

f^{\prime }(x)=g^{\prime -1}(x)

from x₀ to x₁, making use of the fundamental theorem of calculus on the left-hand side and substituting

y=g^{\prime -1}(x)

on the right-hand side to find

f(x_{1})-f(x_{0})=\int _{y_{0}}^{y_{1}}y\,g^{\prime \prime }(y)\,dy

with g′(y₀) = x₀ and g′(y₁) = x₁. Using integration by parts, the last integral simplifies to

y_{1}\,g^{\prime }(y_{1})-y_{0}\,g^{\prime }(y_{0})-\int _{y_{0}}^{y_{1}}g^{\prime }(y)\,dy=y_{1}\,x_{1}-y_{0}\,x_{0}-g(y_{1})+g(y_{0})

Therefore,

f(x_{1})+g(y_{1})-y_{1}\,x_{1}=f(x_{0})+g(y_{0})-y_{0}\,x_{0}

Since the left-hand side of this equation depends only on x₁ (with y₁ determined by x₁) and the right-hand side only on x₀, both sides must equal the same constant C:

f(x)+g(y)-y\,x=C,\,x=g^{\prime }(y)=f^{\prime -1}(y)

Solving for g and arbitrarily choosing C to be zero we arrive at the above-mentioned formula.
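The relation f(x) + g(y) − y x = C with C = 0, and the fact that the transformation is its own inverse, can be illustrated numerically. The sketch below assumes f(x) = cosh x, whose derivative sinh is strictly increasing, and inverts derivatives by bisection on a bounded interval; the helper names and the interval bounds are illustrative assumptions.

```python
import math

def invert_increasing(fn, target, lo=-50.0, hi=50.0, iters=200):
    """Bisection: solve fn(x) = target for a strictly increasing fn on [lo, hi]."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if fn(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def legendre(f, f_prime):
    """Return g with g(y) = y*x - f(x), x = (f')^{-1}(y), i.e. the choice C = 0."""
    def g(y):
        x = invert_increasing(f_prime, y)
        return y * x - f(x)
    return g

f, f_prime = math.cosh, math.sinh
g = legendre(f, f_prime)
g_prime = math.asinh           # the derivative of g is the inverse of f'
f_again = legendre(g, g_prime)

for x in (-2.0, 0.3, 1.5):
    y = f_prime(x)
    # f(x) + g(y) - y*x vanishes at corresponding points.
    assert abs(f(x) + g(y) - y * x) < 1e-9
    # Transforming twice returns the original function.
    assert abs(f_again(x) - f(x)) < 1e-9
```

The second transform is only evaluated for x inside the range of asinh on the search interval, which is why the test points are kept moderate.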

Geometric interpretation

For a convex function f, the Legendre transformation can be interpreted as the mapping between the graph of f and the family of tangents of the graph. (The tangents of f are well-defined at all but at most countably many points, since a convex function is differentiable at all but at most countably many points.)

Consider the equation of a line with slope m and y-intercept b:

y=mx+b

For this line to be tangent to the graph of f at the point (x₀, f(x₀)) requires

f\left(x_{0}\right)=mx_{0}+b

and

m=f^{\prime }\left(x_{0}\right)

If f is strictly convex, f′ is strictly monotone, so the second equation can be solved for x₀. This allows one to eliminate x₀ from the first equation, giving the y-intercept b of the tangent as a function of its slope m:

b=f\left(f^{\prime -1}\left(m\right)\right)-m\cdot f^{\prime -1}\left(m\right)
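Comparing with the one-dimensional formula g(y) = y x − f(x), the intercept satisfies b(m) = −g(m). As a small illustration, assume f(x) = x², so that f′(x) = 2x and the intercept has the closed form b(m) = −m²/4. The sketch checks the closed form, that each tangent touches the graph, and that it stays below the graph on a sample of points.

```python
def f(x):
    return x * x

def f_prime_inv(m):
    return m / 2.0   # inverse of f'(x) = 2x

def intercept(m):
    """y-intercept of the tangent to the graph of f with slope m."""
    x0 = f_prime_inv(m)
    return f(x0) - m * x0

sample = [k / 10.0 for k in range(-50, 51)]

for m in (-3.0, 0.0, 1.0, 4.0):
    b = intercept(m)
    x0 = f_prime_inv(m)
    assert abs(b + m * m / 4.0) < 1e-12        # closed form b(m) = -m^2 / 4
    assert abs(f(x0) - (m * x0 + b)) < 1e-12   # the line touches the graph at x0
    # By convexity, the tangent never rises above the graph.
    assert all(f(x) >= m * x + b - 1e-12 for x in sample)
```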

Legendre transformation in more than one dimension

For a differentiable real-valued function f on an open subset U of Rⁿ, the Legendre conjugate of the pair (U, f) is defined to be the pair (V, g), where V is the image of U under the gradient mapping Df, and g is the function on V given by the formula

g(y)=\left\langle \left(Df\right)^{-1}(y),y\right\rangle -f\left(\left(Df\right)^{-1}(y)\right)
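As an illustration of this formula, assume f(x) = Σ exp(x_k) on U = Rⁿ. The gradient mapping is then the componentwise exponential, V is the set of vectors with strictly positive entries, and the conjugate works out to g(y) = Σ (y_k ln y_k − y_k). The sketch below evaluates the defining formula directly and compares it with this closed form; the function names are hypothetical.

```python
import math

def f(x):
    return sum(math.exp(xk) for xk in x)

def grad_f_inv(y):
    # Inverse of the gradient map x -> (exp(x_1), ..., exp(x_n)),
    # defined on V = vectors with strictly positive entries.
    return [math.log(yk) for yk in y]

def g(y):
    """g(y) = <(Df)^{-1}(y), y> - f((Df)^{-1}(y))."""
    x = grad_f_inv(y)
    return sum(xk * yk for xk, yk in zip(x, y)) - f(x)

y = [0.5, 2.0, 3.0]
expected = sum(yk * math.log(yk) - yk for yk in y)
assert math.isclose(g(y), expected, rel_tol=1e-12)
```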

Convex conjugates

For a function

f:\mathbb {R} ^{n}\rightarrow \mathbb {R} \cup \{+\infty \}

taking values on the extended real number line, the Legendre transformation can be generalized to the Legendre-Fenchel transformation, or convex conjugate, of f, defined by

f^{\star }:\mathbb {R} ^{n}\rightarrow \mathbb {R} \cup \{+\infty \}

f^{\star }\left(p\right)=\sup \left\{\left\langle x,p\right\rangle -f\left(x\right):x\in \mathbb {R} ^{n}\right\}=-\inf \left\{f\left(x\right)-\left\langle x,p\right\rangle :x\in \mathbb {R} ^{n}\right\}

where

\left\langle x,p\right\rangle =\sum _{k=1}^{n}x_{k}\cdot p_{k}

is the scalar product on Rⁿ.
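In one dimension the supremum in this definition can be approximated by a maximum over a finite grid. The sketch below does this for f(x) = x²/2, whose convex conjugate is p²/2. The grid bounds and resolution are arbitrary assumptions, and the result is only a lower bound on f*(p) when the supremum is not attained inside the grid.

```python
def conjugate_on_grid(f, p, lo=-10.0, hi=10.0, n=20001):
    """Approximate f*(p) = sup_x (x*p - f(x)) by a maximum over a uniform grid.

    Assumption: the supremum is attained inside [lo, hi]; otherwise the
    returned value is only a lower bound on f*(p).
    """
    step = (hi - lo) / (n - 1)
    return max((lo + k * step) * p - f(lo + k * step) for k in range(n))

def f(x):
    return 0.5 * x * x   # its convex conjugate is p**2 / 2

for p in (-2.0, 0.0, 0.5, 3.0):
    assert abs(conjugate_on_grid(f, p) - 0.5 * p * p) < 1e-6
```

Unlike the derivative-based formula, this approach does not require f to be differentiable or convex, which matches the generality of the definition.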

Convex conjugation is order-reversing: if f ≤ g then f^* ≥ g^*. The convex conjugate of a closed convex function is again a closed convex function. The convex conjugate of a polyhedral convex function (a convex function with polyhedral epigraph) is again a polyhedral convex function. For any proper convex function f and its convex conjugate f^*, Fenchel's inequality (also known as the Fenchel-Young inequality) holds:

\left\langle p,x\right\rangle \leq f(x)+f^{\star }(p)
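Fenchel's inequality can be illustrated with the exponential example, assuming f(x) = e^x and f*(p) = p ln p − p for p > 0. The sketch checks the inequality on a sample of points and confirms that equality holds when p = f′(x) = e^x. The sample ranges are arbitrary.

```python
import math

def f(x):
    return math.exp(x)

def f_star(p):
    # Convex conjugate of exp, valid for p > 0.
    return p * math.log(p) - p

xs = [k / 4.0 for k in range(-12, 13)]   # x in [-3, 3]
ps = [k / 4.0 for k in range(1, 41)]     # p in (0, 10]

# The gap f(x) + f*(p) - x*p is non-negative on the whole sample.
gap = min(f(x) + f_star(p) - x * p for x in xs for p in ps)
assert gap >= -1e-12

# Equality holds when p is the slope of f at x.
x = 1.25
p = math.exp(x)
assert abs(f(x) + f_star(p) - x * p) < 1e-9
```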

The convex conjugate of a function is always lower semi-continuous. The biconjugate f^** (the convex conjugate of the convex conjugate) is also the closed convex hull of f, i.e. the largest lower semi-continuous convex function that is less than or equal to f. Furthermore, for a proper function f, f = f^** if and only if f is convex and lower semi-continuous.
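The effect of biconjugation on a non-convex function can be seen on a small example. The sketch assumes the double-well function f(x) = min((x − 1)², (x + 1)²) and approximates both conjugations by maxima over a finite grid, so it is a numerical illustration rather than an exact computation. Between the two wells the biconjugate is 0, while f itself reaches 1 at x = 0; outside the wells the two agree.

```python
def f(x):
    # Non-convex double well with minima at x = -1 and x = 1.
    return min((x - 1.0) ** 2, (x + 1.0) ** 2)

GRID = [k / 20.0 for k in range(-80, 81)]   # [-4, 4] in steps of 0.05

def f_star(p):
    """Grid approximation of the convex conjugate of f."""
    return max(x * p - f(x) for x in GRID)

# Tabulate f* once so that the second conjugation stays cheap.
f_star_table = {p: f_star(p) for p in GRID}

def f_biconj(x):
    """Grid approximation of the biconjugate f**."""
    return max(x * p - f_star_table[p] for p in GRID)

assert f(0.0) == 1.0                      # f has a bump between the wells
assert abs(f_biconj(0.0)) < 1e-9          # the convex hull fills it in
assert abs(f_biconj(2.0) - f(2.0)) < 1e-9 # outside the wells f** agrees with f
```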

Further properties

Scaling properties

The Legendre transformation has the following scaling properties, for a constant a > 0:

f(x)=a\cdot g(x)\Rightarrow f^{\star }(p)=a\cdot g^{\star }\left({\frac {p}{a}}\right)

f(x)=g(a\cdot x)\Rightarrow f^{\star }(p)=g^{\star }\left({\frac {p}{a}}\right)

It follows that if a function is homogeneous of degree r then its image under the Legendre transformation is a homogeneous function of degree s, where 1/r + 1/s = 1.
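These properties can be checked numerically. The sketch below assumes g(x) = x⁴/4, which is homogeneous of degree r = 4, so its conjugate should be homogeneous of degree s = 4/3. Conjugates are computed by a ternary search on a bounded interval, which is an illustrative choice that relies on convexity; the constants a, p and t are arbitrary.

```python
def conjugate(f, p, lo=-20.0, hi=20.0, iters=200):
    """Ternary search for sup_x (x*p - f(x)).

    Assumes f is convex and the maximiser lies in [lo, hi].
    """
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if m1 * p - f(m1) < m2 * p - f(m2):
            lo = m1
        else:
            hi = m2
    x = 0.5 * (lo + hi)
    return x * p - f(x)

def g(x):
    return x ** 4 / 4.0      # homogeneous of degree r = 4

a, p, t = 2.5, 1.7, 3.0

# f(x) = a * g(x)  =>  f*(p) = a * g*(p / a)
lhs1 = conjugate(lambda x: a * g(x), p)
rhs1 = a * conjugate(g, p / a)

# f(x) = g(a * x)  =>  f*(p) = g*(p / a)
lhs2 = conjugate(lambda x: g(a * x), p)
rhs2 = conjugate(g, p / a)

# g* is homogeneous of degree s = 4/3, since 1/4 + 3/4 = 1.
lhs3 = conjugate(g, t * p)
rhs3 = t ** (4.0 / 3.0) * conjugate(g, p)

assert abs(lhs1 - rhs1) < 1e-9
assert abs(lhs2 - rhs2) < 1e-9
assert abs(lhs3 - rhs3) < 1e-9
```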

Behavior under translation

f(x)=g(x)+b\Rightarrow f^{\star }(p)=g^{\star }(p)-b

f(x)=g(x+y)\Rightarrow f^{\star }(p)=g^{\star }(p)-p\cdot y
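Both translation rules can be checked on a concrete function. The sketch assumes g(x) = cosh x, whose conjugate has the closed form p asinh p − √(1 + p²), and computes the conjugates of the shifted functions by ternary search on a bounded interval. The constants b, y and p are arbitrary illustrative values.

```python
import math

def conjugate(f, p, lo=-20.0, hi=20.0, iters=200):
    """Ternary search for sup_x (x*p - f(x)).

    Assumes f is convex and the maximiser lies in [lo, hi].
    """
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if m1 * p - f(m1) < m2 * p - f(m2):
            lo = m1
        else:
            hi = m2
    x = 0.5 * (lo + hi)
    return x * p - f(x)

def g(x):
    return math.cosh(x)

def g_star(p):
    # Closed form of the conjugate of cosh, used only as a reference value.
    return p * math.asinh(p) - math.sqrt(1.0 + p * p)

b, y, p = 0.4, 0.8, 1.3

shift_value = conjugate(lambda x: g(x) + b, p)   # f(x) = g(x) + b
shift_arg = conjugate(lambda x: g(x + y), p)     # f(x) = g(x + y)

assert abs(shift_value - (g_star(p) - b)) < 1e-9
assert abs(shift_arg - (g_star(p) - p * y)) < 1e-9
```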

Behavior under inversion

f(x)=g^{-1}(x)\Rightarrow f^{\star }(p)=-p\cdot g^{\star }\left({\frac {1}{p}}\right)

Behavior under linear transformations

Let A be a linear transformation from Rⁿ to R^m. For any convex function f on Rⁿ, one has

\left(Af\right)^{\star }=f^{\star }A^{\star }

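where Af denotes the image of f under A, given by (Af)(y) = inf { f(x) : Ax = y }, f^* A^* denotes the composition of f^* with A^*, and A^* is the adjoint operator of A defined by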

\left\langle Ax,y^{\star }\right\rangle =\left\langle x,A^{\star }y^{\star }\right\rangle
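The identity can be illustrated on a small case. The sketch below assumes Rockafellar's convention that Af is the image of f under A, (Af)(y) = inf { f(x) : Ax = y }, and takes f(x) = |x|²/2 on R² with a hypothetical 1-by-2 matrix A = (1, 2). In this case f* has the same form as f and the adjoint maps q to (q, 2q), so both sides should equal 5q²/2. The nested ternary searches and their bounds are illustrative choices.

```python
def maximize(h, lo=-20.0, hi=20.0, iters=120):
    """Ternary search for the maximum value of a concave function h on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if h(m1) < h(m2):
            lo = m1
        else:
            hi = m2
    return h(0.5 * (lo + hi))

A = (1.0, 2.0)   # hypothetical 1x2 matrix, mapping R^2 to R

def f(x1, x2):
    return 0.5 * (x1 * x1 + x2 * x2)   # f*(p) = |p|^2 / 2 has the same form

def image_Af(y):
    """(Af)(y) = inf { f(x) : A x = y }, parametrised by x2."""
    return -maximize(lambda x2: -f((y - A[1] * x2) / A[0], x2))

def conj_image(q):
    """(Af)*(q) = sup_y (q*y - (Af)(y))."""
    return maximize(lambda y: q * y - image_Af(y))

def f_star_after_adjoint(q):
    """f*(A* q), where the adjoint A* maps q to (A[0]*q, A[1]*q)."""
    p1, p2 = A[0] * q, A[1] * q
    return 0.5 * (p1 * p1 + p2 * p2)

q = 1.2
assert abs(conj_image(q) - f_star_after_adjoint(q)) < 1e-8
assert abs(f_star_after_adjoint(q) - 2.5 * q * q) < 1e-12
```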

A closed convex function f is symmetric with respect to a given set G of orthogonal linear transformations,

f\left(Ax\right)=f(x),\;\forall x,\;\forall A\in G

if and only if f^* is symmetric with respect to G.

Infimal convolution

The infimal convolution of two functions f and g is defined as

\left(f\star _{\inf }g\right)(x)=\inf \left\{f(x-y)+g(y)\,|\,y\in \mathbb {R} ^{n}\right\}

Let f₁, …, f_m be proper convex functions on Rⁿ. Then

\left(f_{1}\star _{\inf }\cdots \star _{\inf }f_{m}\right)^{\star }=f_{1}^{\star }+\cdots +f_{m}^{\star }
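For two functions this identity can be checked numerically. The sketch assumes f₁(x) = x²/2 and f₂(x) = x², whose conjugates are p²/2 and p²/4, and whose infimal convolution works out to x²/3. The inner infimum and the conjugates are computed by ternary search on a bounded interval, which is an illustrative choice relying on convexity.

```python
def maximize(h, lo=-20.0, hi=20.0, iters=120):
    """Ternary search for the maximum value of a concave function h on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if h(m1) < h(m2):
            lo = m1
        else:
            hi = m2
    return h(0.5 * (lo + hi))

def f1(x):
    return 0.5 * x * x   # conjugate: p^2 / 2

def f2(x):
    return x * x         # conjugate: p^2 / 4

def inf_conv(x):
    """Infimal convolution: inf_y { f1(x - y) + f2(y) }."""
    return -maximize(lambda y: -(f1(x - y) + f2(y)))

def conjugate(f, p):
    """sup_x (x*p - f(x)), assuming the maximiser lies in the search interval."""
    return maximize(lambda x: x * p - f(x))

p = 1.6
lhs = conjugate(inf_conv, p)                 # conjugate of the infimal convolution
rhs = conjugate(f1, p) + conjugate(f2, p)    # sum of the conjugates

assert abs(inf_conv(3.0) - 3.0) < 1e-8       # x^2 / 3 at x = 3
assert abs(lhs - rhs) < 1e-8
assert abs(rhs - 0.75 * p * p) < 1e-9        # p^2/2 + p^2/4
```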

References

Rockafellar, Ralph Tyrrell, Convex Analysis, p. 251, Princeton University Press (1996). ISBN 0691015864
Arnol'd, Vladimir Igorevich, Mathematical Methods of Classical Mechanics, second edition, Springer (1989). ISBN 0387968903