
Differential of a function

From Wikipedia, the free encyclopedia


In calculus, the differential represents the principal part of the change in a function y = ƒ(x) with respect to changes in the independent variable. It was introduced via an intuitive or heuristic definition by Gottfried Wilhelm Leibniz, who thought of the differential dy as an infinitely small (or infinitesimal) change in the value y of the function, corresponding to an infinitely small change dx in the function's argument x. For that reason, the instantaneous rate of change of y with respect to x, which is the value of the derivative of the function, is denoted by the fraction

\frac{dy}{dx}

in what is called the Leibniz notation for derivatives. The quotient dy/dx is not infinitely small; rather it is a real number.

The differential itself is therefore defined by an expression of the form

dy = \frac{dy}{dx}\,dx

just as if dy and dx represented numbers, and so could be canceled from the numerator and denominator. One also writes

df(x) = f'(x)\,dx

The precise meaning of such expressions depends on the context of the application and the required level of mathematical rigor. In modern rigorous mathematical treatments, the quantities dy and dx are simply additional real variables that can be manipulated as such. The domain of these variables may take on a particular geometrical significance if the differential is regarded as a particular differential form, or analytical significance if the differential is regarded as a linear approximation to the increment of a function.

Linear approximation

[Figure: The differential of a function ƒ(x) at a point x0.]

Augustin-Louis Cauchy (1823) defined the differential of a function without appeal to infinitesimals, an approach that has since become standard in modern analytical treatments.[1] The differential of the function y = ƒ(x) applied to a positive real increment Δx in the independent variable is defined by

dy(x, \Delta x) = f'(x)\,\Delta x

although the explicit dependence of the left-hand side on the increment Δx is often suppressed in the notation.

This notion of differential is broadly applicable when a linear approximation to a function is sought, provided the value of the increment Δx is small enough. More precisely, if ƒ is a differentiable function at x, then

\Delta y = f(x + \Delta x) - f(x) = f'(x)\,\Delta x + \varepsilon

where the error ε in the approximation satisfies ε/Δx → 0 as Δx → 0. Since dx(x, Δx) = Δx it is conventional to write dx = Δx, so that the equality

dy = f'(x)\,dx

holds. And one has the approximate identity

\Delta y \approx dy

in which the error can be made as small as desired relative to Δx by confining attention to a sufficiently small increment; that is to say,

\frac{\Delta y - dy}{\Delta x} \to 0

as Δx → 0.
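
For example, taking ƒ(x) = x² (an illustrative choice), the differential is dy = 2x dx, while the actual increment is

\Delta y = (x + \Delta x)^2 - x^2 = 2x\,\Delta x + (\Delta x)^2

so the error is ε = (Δx)², and ε/Δx = Δx → 0 as Δx → 0, as required. At x = 1 with Δx = 0.1, for instance, the differential gives dy = 0.2 while the true increment is Δy = 0.21.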

According to Boyer (1959, p. 12), this approach is a significant logical improvement over the infinitesimal approach of Leibniz because, instead of invoking the metaphysical notion of infinitesimals, the quantities dy and dx are simply new real variables that can be manipulated in exactly the same manner as any other real quantities.

Differentials in several variables

Following Goursat (1904, I, §15), for functions of more than one independent variable,

y = f(x_1, x_2, \dots, x_n),

the partial differential of y with respect to any one of the variables x1 is the principal part of the change in y resulting from a change dx1 in that one variable. The partial differential is therefore

\frac{\partial y}{\partial x_1}\,dx_1

involving the partial derivative of y with respect to x1. The sum of the partial differentials with respect to all of the independent variables is the total differential

dy = \frac{\partial y}{\partial x_1}\,dx_1 + \cdots + \frac{\partial y}{\partial x_n}\,dx_n

which is the principal part of the change in y resulting from changes in the independent variables xi.

More precisely, in the context of multivariable calculus, following Courant (1937ii), if ƒ is a differentiable function, then by the definition of differentiability, the increment

\Delta y = f(x_1 + \Delta x_1, \dots, x_n + \Delta x_n) - f(x_1, \dots, x_n) = \frac{\partial y}{\partial x_1}\,\Delta x_1 + \cdots + \frac{\partial y}{\partial x_n}\,\Delta x_n + \varepsilon_1\,\Delta x_1 + \cdots + \varepsilon_n\,\Delta x_n

where the error terms εi tend to zero as the increments Δxi jointly tend to zero. The total differential is then rigorously defined as

dy = \frac{\partial y}{\partial x_1}\,\Delta x_1 + \cdots + \frac{\partial y}{\partial x_n}\,\Delta x_n

Since, with this definition,

dx_i(\Delta x_1, \dots, \Delta x_n) = \Delta x_i,

one has

dy = \frac{\partial y}{\partial x_1}\,dx_1 + \cdots + \frac{\partial y}{\partial x_n}\,dx_n

As in the case of one variable, the approximate identity

\Delta y \approx dy

holds, in which the total error can be made as small as desired relative to \sqrt{\Delta x_1^2 + \cdots + \Delta x_n^2} by confining attention to sufficiently small increments.
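
For example, taking y = x1x2 (an illustrative choice), the total differential is dy = x2 dx1 + x1 dx2, while the actual increment is

\Delta y = (x_1 + \Delta x_1)(x_2 + \Delta x_2) - x_1 x_2 = x_2\,\Delta x_1 + x_1\,\Delta x_2 + \Delta x_1\,\Delta x_2

Here the residual term Δx1Δx2 may be written as ε1Δx1 with ε1 = Δx2, which indeed tends to zero as the increments jointly tend to zero.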

Higher-order differentials

Higher-order differentials of a function y = ƒ(x) of a single variable x can be defined via (Goursat 1904, I, §14):

d^2 y = d(dy) = d\bigl(f'(x)\,dx\bigr) = f''(x)\,(dx)^2

and, in general,

d^n y = f^{(n)}(x)\,(dx)^n

Informally, this justifies Leibniz's notation for higher-order derivatives

f^{(n)}(x) = \frac{d^n y}{dx^n}

Similar considerations apply to defining higher order differentials of functions of several variables.
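
In the single-variable case, for example, taking y = x³ (an illustrative choice) gives

dy = 3x^2\,dx, \qquad d^2 y = 6x\,(dx)^2, \qquad d^3 y = 6\,(dx)^3,

and d^n y = 0 for n ≥ 4, in agreement with the successive derivatives of x³.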

Properties

A number of properties of the differential follow in a straightforward manner from the corresponding properties of the derivative, partial derivative, and total derivative. These include:[2]

  • Linearity: For constants a and b and differentiable functions ƒ and g,

d(af + bg) = a\,df + b\,dg

  • Product rule: For differentiable functions ƒ and g,

d(fg) = f\,dg + g\,df

An operation d with these two properties is known in abstract algebra as a derivation. In addition, various forms of the chain rule hold, in increasing level of generality:[3]

  • If y = ƒ(u) is a differentiable function of the variable u and u = g(x) is a differentiable function of x, then

dy = f'(u)\,du = f'(g(x))\,g'(x)\,dx

  • If y = ƒ(x1, …, xn) and all of the variables x1, …, xn depend on another variable t, then by the chain rule for partial derivatives, one has

dy = \frac{dy}{dt}\,dt = \frac{\partial y}{\partial x_1}\,dx_1 + \cdots + \frac{\partial y}{\partial x_n}\,dx_n

Heuristically, the chain rule for several variables can itself be understood by dividing through both sides of this equation by the infinitely small quantity dt.
  • More general analogous expressions hold, in which the intermediate variables x i depend on more than one variable.
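
As an illustration of the several-variable form (with illustrative choices), take y = x1x2 with x1 = cos t and x2 = sin t. Then

dy = x_2\,dx_1 + x_1\,dx_2 = \sin t\,(-\sin t\,dt) + \cos t\,(\cos t\,dt) = (\cos^2 t - \sin^2 t)\,dt = \cos 2t\,dt,

in agreement with differentiating y = sin t cos t = ½ sin 2t directly with respect to t.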

Other approaches

Although the notion of having an infinitesimal increment dx is not well-defined in modern mathematical analysis, a variety of techniques exist for defining the infinitesimal differential so that the differential of a function can be handled in a manner that does not clash with the Leibniz notation. These include:

  • Differentials as linear maps, as in the modern geometric treatment of differential forms.
  • Differentials as nilpotent elements of commutative rings, an approach popular in algebraic geometry.[4]
  • Differentials in smooth models of set theory, as in smooth infinitesimal analysis.[5]
  • Differentials as infinitesimals in hyperreal number systems, as in nonstandard analysis.[6]

Examples and applications

Differentials may be effectively used in numerical analysis to study the propagation of experimental errors in a calculation, and thus the overall numerical stability of a problem (Courant 1937i). Suppose that the variable x represents the outcome of an experiment and y is the result of a numerical computation applied to x. The question is to what extent errors in the measurement of x influence the outcome of the computation of y. If x is known to within Δx of its true value, then Taylor's theorem gives the following estimate on the error Δy in the computation of y:

\Delta y = f'(x)\,\Delta x + \frac{(\Delta x)^2}{2} f''(\xi)

where ξ = x + θΔx for some 0 < θ < 1. If Δx is small, then the second-order term is negligible, so that Δy is, for practical purposes, well-approximated by dy = ƒ'(x)Δx.
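
The following short sketch illustrates this numerically; the function ƒ(x) = x³ and the measurement values are hypothetical, chosen only for illustration:

def f(x):
    return x ** 3

def f_prime(x):
    return 3 * x ** 2

x = 2.0    # measured outcome of the experiment
dx = 0.01  # measurement uncertainty Δx

# First-order error estimate from the differential dy = f'(x)Δx
dy = f_prime(x) * dx

# Actual propagated error |f(x + Δx) - f(x)|
actual = abs(f(x + dx) - f(x))

print(f"differential estimate dy = {dy:.6f}")    # 0.120000
print(f"actual error |Δy|        = {actual:.6f}")  # 0.120601

The second-order term (Δx)²/2 · ƒ''(ξ) accounts for the small discrepancy between the two printed values.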

The differential is often useful for rewriting a differential equation

\frac{dy}{dx} = g(x)

in the form

dy = g(x)\,dx

in particular when one wants to separate the variables.
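
For example, the equation dy/dx = x/y (an illustrative case) separates as

y\,dy = x\,dx,

and integrating both sides gives y²/2 = x²/2 + C.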

Notes

  1. ^ See, for instance, Courant 1937i, Kline 1977, or Goursat 1904. For a detailed historical account, see Boyer 1959, especially page 275 for Cauchy's contribution on the subject. An abbreviated account appears in Kline 1972, Chapter 40.
  2. ^ Goursat 1904, I, §17
  3. ^ Goursat 1904, I, §§14,16
  4. ^ Eisenbud & Harris 1998.
  5. ^ See Kock 2006 and Moerdijk & Reyes 1991.
  6. ^ See Robinson 1996 and Keisler 1986.

References