
Chain rule

In calculus, the Chain Rule states that if a variable y depends on a second variable u, which in turn depends on a third variable x, then the rate of change of y with respect to x can be computed as the product of the rate of change of y with respect to u and the rate of change of u with respect to x. Suppose one is climbing a mountain at a rate of 0.5 kilometers per hour. The temperature is lower at higher elevations; suppose it decreases by 6° per kilometer. How fast does the temperature drop? Multiplying 6° per kilometer by 0.5 kilometers per hour gives 3° per hour. Such calculations are the heart of the Chain Rule.

Leibniz would express the Chain Rule as:

dy/dx = (dy/du) · (du/dx)
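
The mountain example can be checked numerically. The following is a minimal sketch, not part of the original article: the function names and the starting temperature of 20° are assumptions made only for illustration. It multiplies the two rates as the Leibniz formula prescribes and compares the result with a finite-difference estimate of the composite function.

```python
# Hypothetical model of the climb described above. The starting
# temperature of 20 degrees is an assumption made for illustration.

def elevation_km(t_hours):
    """Elevation gained after t hours of climbing at 0.5 km per hour."""
    return 0.5 * t_hours

def temperature_deg(elev_km):
    """Temperature at a given elevation, falling 6 degrees per km."""
    return 20.0 - 6.0 * elev_km

du_dx = 0.5    # rate of change of elevation with respect to time (km/h)
dy_du = -6.0   # rate of change of temperature with respect to elevation (deg/km)
dy_dx = dy_du * du_dx  # chain rule: degrees per hour

# Cross-check by differencing the composite function directly.
step = 1e-6
t = 2.0
numeric = (temperature_deg(elevation_km(t + step))
           - temperature_deg(elevation_km(t - step))) / (2 * step)

print(dy_dx)
print(numeric)
```

The negative sign records that the temperature is falling; its magnitude is the 3° per hour found above.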

The Chain Rule is a formula for the derivative of the composition of two functions. Suppose the real-valued function g is defined on some open subset of the real numbers containing the number x, and the real-valued function h is defined on some open subset of the reals containing g(x). If g is differentiable at x and h is differentiable at g(x), then the composition f = h o g is differentiable at x and its derivative can be computed as

f '(x) = (h o g)'(x) = h '[g(x)] · g '(x)

Example I

Consider f(x) = (x^2 + 1)^3. Here f(x) has the form h[g(x)], where g(x) = x^2 + 1 and h(x) = x^3; thus f '(x) = 3(x^2 + 1)^2 · (2x) = 6x(x^2 + 1)^2.
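
As a sanity check on Example I, the closed-form derivative can be compared with a central-difference approximation. This is an illustrative sketch only: the helper central_difference, its step size, and the sample points are assumptions, not part of the original example.

```python
def f(x):
    """The function of Example I."""
    return (x**2 + 1) ** 3

def f_prime(x):
    """Derivative obtained from the chain rule in Example I."""
    return 6 * x * (x**2 + 1) ** 2

def central_difference(func, x, step=1e-5):
    """Approximate func'(x) with a symmetric difference quotient."""
    return (func(x + step) - func(x - step)) / (2 * step)

# Compare the two at a few arbitrarily chosen sample points.
for x in (-2.0, 0.0, 0.5, 3.0):
    print(x, f_prime(x), central_difference(f, x))
```

The two columns should agree up to the approximation error of the difference quotient.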

Example II

In order to differentiate the trigonometric function
f(x) = sin(x^2)
one can write f(x) = h[g(x)] with h(x) = sin(x) and g(x) = x^2. The chain rule then yields
f '(x) = cos(x^2) · 2x
since h '[g(x)] = cos(x^2) and g '(x) = 2x.
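
The formula f '(x) = h '[g(x)] · g '(x) can also be written as a small higher-order function. The sketch below is hypothetical (the name chain_rule and the sample point 1.5 are assumptions): it builds the derivative of Example II from h ', g and g ', then compares it with a finite-difference estimate.

```python
import math

def chain_rule(h_prime, g, g_prime):
    """Return the derivative of x -> h(g(x)) as a function of x."""
    def derivative(x):
        return h_prime(g(x)) * g_prime(x)
    return derivative

def g(x):
    return x**2

def g_prime(x):
    return 2 * x

# h(x) = sin(x), so h'(x) = cos(x).
f_prime = chain_rule(math.cos, g, g_prime)

def f(x):
    return math.sin(g(x))

# Finite-difference cross-check at an arbitrarily chosen point.
x0 = 1.5
step = 1e-6
numeric = (f(x0 + step) - f(x0 - step)) / (2 * step)

print(f_prime(x0))
print(numeric)
```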

The General Power Rule

The General Power Rule (GPR) can be derived from the Chain Rule: taking h(u) = u^n gives the derivative of [g(x)]^n as n[g(x)]^(n-1) · g '(x).

The Fundamental Chain Rule

The chain rule is a fundamental property of all definitions of derivative and is therefore valid in much more general contexts. For instance, if E, F and G are Banach spaces (which include Euclidean spaces) and f : E -> F and g : F -> G are functions, and if x is an element of E such that f is differentiable at x and g is differentiable at f(x), then the derivative of the composition g o f at the point x is given by
D_x(g o f) = D_{f(x)}(g) o D_x(f)
Note that the derivatives here are linear maps and not numbers. If the linear maps are represented as matrices, the composition on the right hand side turns into a matrix multiplication.
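
In the Euclidean case the matrix form can be illustrated concretely. The sketch below is an assumption-laden example, not taken from the article: the maps f(x, y) = (xy, x + y) and g(u, v) = u^2 + sin(v) are chosen arbitrarily, and the helper names are hypothetical. It multiplies the Jacobian of g at f(x) by the Jacobian of f at x and compares the product with a finite-difference gradient of g o f.

```python
import math

def f(x, y):
    """f : R^2 -> R^2 (hypothetical example map)."""
    return (x * y, x + y)

def g(u, v):
    """g : R^2 -> R (hypothetical example map)."""
    return u**2 + math.sin(v)

def jacobian_f(x, y):
    # Each row holds the partial derivatives of one component of f.
    return [[y, x],
            [1.0, 1.0]]

def jacobian_g(u, v):
    return [[2 * u, math.cos(v)]]

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def jacobian_of_composition(x, y):
    u, v = f(x, y)
    # Derivative of g at f(x), composed with the derivative of f at x.
    return matmul(jacobian_g(u, v), jacobian_f(x, y))

def numeric_gradient(x, y, step=1e-6):
    """Central-difference gradient of g o f, used only as a cross-check."""
    def composed(a, b):
        return g(*f(a, b))
    d_dx = (composed(x + step, y) - composed(x - step, y)) / (2 * step)
    d_dy = (composed(x, y + step) - composed(x, y - step)) / (2 * step)
    return [[d_dx, d_dy]]

print(jacobian_of_composition(1.0, 2.0))
print(numeric_gradient(1.0, 2.0))
```

Note the order of the factors: the Jacobian of the outer map g stands on the left, matching the order of composition in the formula.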

A particularly nice formulation of the chain rule can be achieved in the most general setting: let M, N and P be C^k manifolds (or even Banach manifolds) and let f : M -> N and g : N -> P be differentiable maps. The derivative of f, denoted by df, is then a map from the tangent bundle of M to the tangent bundle of N, and we may write

d(g o f) = dg o df
In this way, the formation of derivatives and tangent bundles is seen as a functor on the category of C^∞ manifolds with C^∞ maps as morphisms.

wikipedia.org dumped 2003-03-17 with terodump