In calculus, the chain rule states that if a variable y depends on a second variable u, which in turn depends on a third variable x, then the rate of change of y with respect to x can be computed as the rate of change of y with respect to u multiplied by the rate of change of u with respect to x. Suppose one is climbing a mountain at a rate of 0.5 kilometers per hour. The temperature is lower at higher elevations; suppose it decreases by 6° per kilometer. How fast does the temperature drop? Multiplying 6° per kilometer by 0.5 kilometers per hour gives 3° per hour. Calculations of this kind are the heart of the chain rule.
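The mountain arithmetic can be checked numerically. The following is a minimal sketch, assuming linear models for elevation and temperature and an arbitrary base temperature of 20° at elevation 0; the function names and the base temperature are illustrative assumptions, not part of the example itself.

```python
def elevation_km(t_hours):
    """Elevation after t hours at 0.5 km per hour (assumed to start at 0 km)."""
    return 0.5 * t_hours


def temperature_deg(h_km, base_temp=20.0):
    """Temperature at elevation h, dropping 6 degrees per km (base_temp is assumed)."""
    return base_temp - 6.0 * h_km


def temperature_along_climb(t_hours):
    """Temperature as a function of time: the composition of the two models."""
    return temperature_deg(elevation_km(t_hours))


def rate_of_change(func, t, dt=1e-3):
    """Central-difference estimate of the rate of change of func at t."""
    return (func(t + dt) - func(t - dt)) / (2 * dt)


t0 = 2.0                                                 # hours into the climb
dT_dh = rate_of_change(temperature_deg, elevation_km(t0))  # degrees per km
dh_dt = rate_of_change(elevation_km, t0)                   # km per hour
dT_dt = rate_of_change(temperature_along_climb, t0)        # degrees per hour

print(dT_dh * dh_dt)  # product of the two rates
print(dT_dt)          # rate of change of the composed function
```

Both printed values should agree at about -3, that is, a drop of 3° per hour.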
In Leibniz notation, the chain rule reads:
- dy/dx = (dy/du) · (du/dx)
The chain rule is a formula for the derivative of the composition of two functions. Suppose the real-valued function g is defined on an open subset of the real numbers containing x, and the real-valued function h is defined on an open subset of the real numbers containing g(x). If g is differentiable at x and h is differentiable at g(x), then the composition f = h o g is differentiable at x, and its derivative is
- f '(x) = (h o g)'(x) = h '[g(x)] · g '(x)
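The formula can be compared against a finite-difference estimate. This is a rough sketch; the choice h(u) = exp(u), g(x) = sin(x) and the helper names are assumptions made only for illustration.

```python
import math


def chain_rule_derivative(h_prime, g, g_prime):
    """Return the function x -> h'(g(x)) * g'(x)."""
    def derivative(x):
        return h_prime(g(x)) * g_prime(x)
    return derivative


def central_difference(func, x, step=1e-6):
    """Numerical estimate of func'(x)."""
    return (func(x + step) - func(x - step)) / (2 * step)


# Illustrative choice: h(u) = exp(u), g(x) = sin(x), so f(x) = exp(sin(x)).
g, g_prime = math.sin, math.cos
h, h_prime = math.exp, math.exp


def f(x):
    return h(g(x))


f_prime = chain_rule_derivative(h_prime, g, g_prime)

x0 = 0.7
exact = f_prime(x0)                 # h'(g(x0)) * g'(x0)
approx = central_difference(f, x0)  # estimate taken from f alone
print(exact, approx)
```

The two numbers should agree to several decimal places; the small remaining gap comes from the finite-difference approximation, not from the chain rule.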
Consider f(x) = (x^2 + 1)^3. Here f(x) can be written as h[g(x)], where g(x) = x^2 + 1 and h(x) = x^3; thus f '(x) = 3(x^2 + 1)^2 (2x) = 6x(x^2 + 1)^2.
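As a cross-check, the cube can be expanded to x^6 + 3x^4 + 3x^2 + 1 and differentiated term by term. The short sketch below (function names are illustrative) compares the two expressions exactly on a range of integers.

```python
def f_prime_chain_rule(x):
    """Derivative obtained from the chain rule: 6x(x^2 + 1)^2."""
    return 6 * x * (x**2 + 1) ** 2


def f_prime_expanded(x):
    """Term-by-term derivative of the expansion x^6 + 3x^4 + 3x^2 + 1."""
    return 6 * x**5 + 12 * x**3 + 6 * x


# Integer arithmetic is exact, so the two forms must match exactly.
for x in range(-5, 6):
    assert f_prime_chain_rule(x) == f_prime_expanded(x)

print(f_prime_chain_rule(2))  # 300
```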
In order to differentiate the trigonometric function
- f(x) = sin(x^2)
one can write f(x) = h[g(x)] with h(x) = sin(x) and g(x) = x^2. The chain rule then yields
- f '(x) = cos(x^2) · 2x = 2x cos(x^2)
since h '[g(x)] = cos(x^2) and g '(x) = 2x.
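One way to see the chain rule applied mechanically is with dual numbers, a technique not described in this article and used here only for illustration. Each quantity carries its value together with its derivative with respect to x. The `Dual` class and helper names below are hypothetical, and only the operations needed for sin(x^2) are implemented.

```python
import math


class Dual:
    """A value together with its derivative with respect to x."""

    def __init__(self, value, deriv):
        self.value = value
        self.deriv = deriv

    def __mul__(self, other):
        # Product rule: (u v)' = u' v + u v'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)


def dual_sin(u):
    # Chain rule: (sin u)' = cos(u) * u'
    return Dual(math.sin(u.value), math.cos(u.value) * u.deriv)


def f_with_derivative(x):
    """Evaluate f(x) = sin(x^2) and f'(x) in one pass."""
    x_dual = Dual(x, 1.0)      # dx/dx = 1
    inner = x_dual * x_dual    # g(x) = x^2, carrying g'(x) = 2x
    return dual_sin(inner)     # h(g(x)) = sin(x^2)


x0 = 1.5
result = f_with_derivative(x0)
closed_form = math.cos(x0**2) * 2 * x0   # cos(x^2) * 2x from the text
print(result.deriv, closed_form)
```

The derivative carried through the computation should match the closed form cos(x^2) · 2x up to rounding.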
The general power rule, which states that the derivative of [g(x)]^n is n[g(x)]^(n-1) · g '(x), can be derived from the chain rule.
The chain rule is a fundamental property of all definitions of the derivative and is therefore valid in much more general contexts. For instance, if E, F and G are Banach spaces (which include Euclidean spaces) and f : E -> F and g : F -> G are functions, and if x is an element of E such that f is differentiable at x and g is differentiable at f(x), then the derivative of the composition g o f at the point x is given by
- D_x(g o f) = D_{f(x)}(g) o D_x(f)
Note that the derivatives here are linear maps and not numbers. If the linear maps are represented as matrices, the composition on the right-hand side becomes a matrix multiplication.
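In the finite-dimensional case the matrices are Jacobians. The sketch below assumes two arbitrary maps from the plane to itself, f(x, y) = (xy, x + y) and g(u, v) = (u^2, uv), chosen only for illustration. It multiplies their hand-computed Jacobians and compares the product with a finite-difference Jacobian of g o f.

```python
def f(x, y):
    return (x * y, x + y)


def jacobian_f(x, y):
    # Row i holds the partial derivatives of output i with respect to (x, y).
    return [[y, x],
            [1.0, 1.0]]


def g(u, v):
    return (u * u, u * v)


def jacobian_g(u, v):
    return [[2 * u, 0.0],
            [v, u]]


def matmul(a, b):
    """Plain matrix product of two nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def numerical_jacobian(func, point, step=1e-6):
    """Central-difference Jacobian of func at point."""
    n_out = len(func(*point))
    jac = [[0.0] * len(point) for _ in range(n_out)]
    for j in range(len(point)):
        plus, minus = list(point), list(point)
        plus[j] += step
        minus[j] -= step
        f_plus, f_minus = func(*plus), func(*minus)
        for i in range(n_out):
            jac[i][j] = (f_plus[i] - f_minus[i]) / (2 * step)
    return jac


def g_after_f(x, y):
    return g(*f(x, y))


point = (1.0, 2.0)
# Jacobian of g evaluated at f(point), times Jacobian of f evaluated at point.
chain_rule_jacobian = matmul(jacobian_g(*f(*point)), jacobian_f(*point))
finite_difference_jacobian = numerical_jacobian(g_after_f, point)

print(chain_rule_jacobian)          # [[8.0, 4.0], [8.0, 5.0]]
print(finite_difference_jacobian)   # approximately the same matrix
```

The order of the factors matters: the Jacobian of g is evaluated at f(x) and stands on the left, mirroring the composition of linear maps in the formula.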
A particularly nice formulation of the chain rule can be achieved in the most general setting: let M, N and P be C^k manifolds (or even Banach manifolds) and let f : M -> N and g : N -> P be differentiable maps. The derivative of f, denoted by df, is then a map from the tangent bundle of M to the tangent bundle of N, and we may write
- d(g o f) = dg o df
In this way, the formation of derivatives and tangent bundles is seen as a functor on the category of C∞ manifolds with C∞ maps as morphisms.