6.1 Derivatives of Most Useful Functions

Rational functions are an important and useful class of functions, but there are others. We actually get most useful functions by starting with two additional functions beyond the identity function, and allowing two more operations in addition to addition, subtraction, multiplication, and division.

What additional starting functions?

The two are the exponential function, which we will write for the moment as \(\exp(x)\), and the sine function, which is generally written as \(\sin(x)\).

And what are these?

We will devote some time and effort to introducing and describing these two functions and their many wonderful properties very soon. For now, all we care about is that they exist, you can find them on spreadsheets and scientific calculators, and we can perform arithmetic operations (addition, subtraction, multiplication and division) on them. If you want just a hint, the sine function is the basic function of the study of angles, which is called trigonometry. The exponential function is defined in terms of derivatives: it is the function whose value at argument \(0\) is \(1\) and whose derivative everywhere is the same as itself. We have

\[\frac{d \, \exp(x)}{dx} = \exp(x) \quad\text{or}\quad (\exp(x))' = \exp(x)\]

This definition may make the function a bit mysterious to you at first, but you have to admit that it makes it easy to differentiate this function.

This exponential function has an important and interesting property: Namely,

\[\exp(x+y) = \exp(x)\exp(y)\]

(Idea of proof: as a function of \(x\), by the statements below about derivatives of substitutions, we can deduce that \(\exp(x+y)\) is its own derivative. Its value at \(x = 0\) is \(\exp(y)\). It therefore differs from \(\exp(x)\) only in having the value \(\exp(y)\) at \(x = 0\) rather than the value \(1\). This means that \(\exp(x+y)\) must be \(\exp(x)\) multiplied by \(\exp(y)\). We have ignored any possible dependence of \(y\) on \(x\). Doing so only means we were computing what is called the "partial derivative with respect to the variable \(x\), keeping the variable \(y\) fixed". Do not worry about this; it is one of the ways we handle calculus when there is more than one variable.)

The defining properties of \(\exp(x)\) allow us to deduce a power series representation of it. \(\exp(x)\) has a constant term \(1\), and being its own derivative must have a linear term whose derivative is that \(1\), namely \(x\). Likewise it must have a quadratic term whose derivative is \(x\), namely, \(\frac{x^2}{2}\). Continuing this deduction forever gives us

\[\exp(x) = 1 + x + \frac{x^2}{2} + ... + \frac{x^k}{k!} + ...\]
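The partial sums of this series converge very quickly, and you can check them on a spreadsheet or in a few lines of code. A minimal sketch (the helper name `exp_series` is ours, not standard) comparing the partial sums with a library exponential:

```python
import math

def exp_series(x, terms=20):
    """Partial sum 1 + x + x^2/2! + ... + x^(terms-1)/(terms-1)! of the series for exp(x)."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= x / (k + 1)   # turns x^k/k! into x^(k+1)/(k+1)!
    return total

print(exp_series(1.0))        # very close to math.exp(1.0) = 2.71828...
```

Twenty terms already agree with the built-in exponential to many decimal places, since the remainder after the \(k\)-th term is smaller than \(\frac{x^k}{k!}\) times a modest factor.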

And what additional operations are there?

The two new operations that we want to use are substitution, and inversion.

And what are these?

If we have two functions, \(f\) and \(g\), with values \(f(x)\) and \(g(x)\) at argument \(x\), we can construct a new function, which we write as \(f(g)\), that is gotten by taking the value of \(g\) at argument \(x\) as the argument of \(f\).

The value of \(f(g)\) at \(x\), which we write as \(f(g(x))\), is the value of \(f\) at argument given by the value of \(g\) at \(x\); it is the value of \(f\) at argument \(g(x)\). We call this new function the substitution of \(g\) into \(f\). We'll get to inversion in Chapter 8.

Substitution is simpler than it sounds. Suppose you have a value for \(x\) on a spreadsheet in box A5, and you put =g(A5) in box B5, and =f(B5) in C5. Then C5 will contain \(f(g(x))\).
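In a programming language the same spreadsheet recipe is just function composition; here is a minimal sketch (the particular \(f\) and \(g\) are the ones from Example 1 below):

```python
def g(x):
    return x**2 + 1      # plays the role of =g(A5) in box B5

def f(s):
    return s**3 - 3      # plays the role of =f(B5) in box C5

x = 2.0
print(f(g(x)))           # f(g(2)) = (2^2 + 1)^3 - 3 = 122.0
```

Each function only sees one number come in and one number go out; the substitution is nothing more than feeding one output in as the next input.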

If you substitute a polynomial into a polynomial, you just get a polynomial, and if you substitute a rational function into a rational function, you still have a rational function. But if you substitute these things into exponentials and sines you get entirely new things, like \(\exp(-cx^2)\), which is the basic function of probability theory.

Just as using the exponential or sine functions presents no problem to a spreadsheet or scientific calculator, substitution presents no real problem. We have seen that you can create g(A10) in B10, and then f(B10) in C10, and you have created the substituted value f(g(A10)) in C10. You can, by repeating this procedure, construct the most horrible-looking combination of substitutions and arithmetic operations imaginable, and even worse than you could imagine, with very little difficulty, and you can find their numerical derivatives as well.
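A numerical derivative of such a horrible combination is equally painless. A sketch, using a symmetric difference quotient (the helper name `numderiv`, the step size `h`, and the sample function are our choices, not anything standard):

```python
import math

def numderiv(func, x, h=1e-6):
    """Symmetric difference quotient: the slope over a tiny interval around x."""
    return (func(x + h) - func(x - h)) / (2 * h)

# a "horrible looking" combination built by repeated substitution and arithmetic
def horrible(x):
    return math.exp(-(x**2) / 2) * (x**3 + 1) / (x**2 + 1)

print(numderiv(horrible, 1.0))   # the numerical derivative at x = 1
```

The same `numderiv` works on any function you can build this way; no new machinery is needed as the expression gets uglier.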

Before we go on to the last operation, we note that there is a great property associated with the operation of substitution. Just as we have found formulae above for finding the derivative of a sum or product or ratio of functions whose derivatives we know, we have a neat formula for the derivative of a substitution function in terms of the derivatives of its constituents. Actually it is about as simple a formula for this as could be.

The result is often called the chain rule:

The derivative of \(f(g(x))\) with respect to \(x\) at some argument \(z\), like any other derivative, is the slope of the straight line tangent to this function at argument \(z\). This slope, like all slopes, is the ratio of the change in the given function to the change in its argument, over any interval very near argument \(z\). The derivative of \(f\) by itself is the tiny change in \(f\) divided by the tiny change in its argument \(g\); after substitution, the denominator we want is instead the tiny change in \(x\).

Suppose then, we make a very small change in the variable \(x\), very near to \(x = z\), a change that is sufficiently small that the linear approximations to \(g\) and to \(f(g)\) are extremely accurate within the interval of change. Let us call that change \(dx\). This will cause a change in \(g(x)\) of \(g'(z)dx\) (because the definition of \(g'(z)\) is the ratio of the change of \(g\) to the change of \(x\) for \(x\) very near to \(z\)).

If \(g'(z)\) is \(0\), then \(g\) will not change and neither will \(f(g(x))\), when \(f\) depends on \(x\) only in that its argument \(g\) depends on \(x\). (If \(f\) has other dependence on \(x\), the contribution to its derivative from that other dependence gets added to the contribution from the change in \(g\), and is irrelevant here.)

If \(g'(z)\) is not \(0\), we can define \(dg\) to be \(\frac{dg}{dx}dx\), and use the fact that the change in \(f\) for arguments near \(g(z)\) is given by \(df = \frac{df}{dg}dg\), which becomes

\[df = \frac{df}{dg}\frac{dg}{dx}dx\]

where \(\frac{df}{dg}\) is evaluated at \(g(z)\) and \(\frac{dg}{dx}\) is evaluated at \(x = z\).

It follows from this remark that the chain rule reads

\[\frac{df(g(x))}{dx} = \frac{df(g)}{dg}\frac{dg(x)}{dx}\]

In words, this means that the derivative of the substituted function with values \(f(g(x))\), with respect to the variable \(x\), is the product of the derivatives of the constituent functions \(f\) and \(g\), taken at the relevant arguments, which are \(x\) itself for \(g\) and \(g(x)\) for \(f\).

How about some examples?

We will give two examples, but you should work out at least a dozen for yourself.

Example 1: Suppose we substitute the function \(g\) which has values given by \(g(x) = x^2 + 1\) into the function \(f\) which takes values \(f(x) = x^3 - 3\).

The substituted function \(f(g)\) has values \(f(g(x)) = (x^2 + 1)^3 - 3\).
Let us compute the derivative of this function. The derivative of \(f(s)\) with respect to \(s\) is \(3s^2\), while the derivative of \(g(x)\) with respect to \(x\) is \(2x\).
If we set \(s = g(x)\) which is \(x^2 + 1\), and take the product of these two we get:

\[\frac{d((x^2 + 1)^3 - 3)}{dx} = (3s^2)(2x) = 3g(x)^2(2x) = 6x(x^2 + 1)^2\]

You could multiply the cube here out and then differentiate to get the same result, but that is much messier, and most people would make at least one mistake in doing it. You have a chance of getting such things right even the first time, if you do them by the chain rule. (Unfortunately, if you do it correctly, you will not get any practice debugging from it.)
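A painless way to catch such a mistake is to compare the chain-rule answer with a numerical derivative. A sketch (the symmetric difference quotient `numderiv` and the test point are our choices):

```python
def numderiv(func, x, h=1e-6):
    """Symmetric difference quotient: the slope over a tiny interval around x."""
    return (func(x + h) - func(x - h)) / (2 * h)

fg = lambda x: (x**2 + 1)**3 - 3          # the substituted function f(g(x))
chain = lambda x: 6 * x * (x**2 + 1)**2    # the chain-rule answer

x = 1.5
print(numderiv(fg, x), chain(x))           # the two should agree closely
```

If you had dropped a factor while multiplying out the cube, the two numbers would disagree at once.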

Example 2: Find the derivative of the function \(\exp(\frac{-x^2}{2})\).

This is the function obtained by substituting the function \(\frac{-x^2}{2}\) into the exponential function.
The derivative of the function \(\frac{-x^2}{2}\) is the function \(-x\); the exponential function is its own derivative.
On applying the chain rule we find: that the derivative of \(\exp(\frac{-x^2}{2})\) is \((-x)\exp(\frac{-x^2}{2})\), the latter factor being the derivative of the exponential function evaluated at \(\frac{-x^2}{2}\).
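This answer, too, can be checked numerically; here is a sketch (again the difference-quotient helper `numderiv` and the test point are our own choices):

```python
import math

def numderiv(func, x, h=1e-6):
    """Symmetric difference quotient: the slope over a tiny interval around x."""
    return (func(x + h) - func(x - h)) / (2 * h)

f = lambda x: math.exp(-x**2 / 2)          # the substituted function
fprime = lambda x: -x * math.exp(-x**2 / 2)  # the chain-rule answer

x = 0.7
print(numderiv(f, x), fprime(x))            # the two should agree closely
```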

Exercises:

6.1 Write an expression for the result of substituting \(g\) into \(f\) to form \(f(g)\) for the following pairs of functions, and find expressions for their derivatives using the chain rule.

a. \(f\) defined by \(f(x) = \frac{x^2+1}{x}\), \(g\) defined by \(g(x)= x^2 - 1\).

b. \(f\) defined by \(f(x) = -x\), \(g\) by \(g(x) = \exp(x)\).

c. \(f\) defined by \(f(x) = \exp(x)\), \(g\) by \(g(x) = -x\).

6.2 Check each of your results using the derivative applet.

6.3

a. Consider the function defined by the formula \(x^4 - 2x + 3\). Use the applet to plot it and see its derivative. Where is its minimum value, and what is it? What is its derivative at the minimum point? Estimate these things from the applet.

b. Find the maximum point for \(f\) and the value of \(f\) at that argument approximately for \(f\) defined by \(f(x) = x^2\exp(-x)\).

c. If a function \(f\) is differentiable in the interval from \(a\) to \(b\) and has a minimum value at a point \(c\) in between \(a\) and \(b\), what is its derivative at \(c\)?

6.4 Use the chain rule to show: \(\exp(x + y) = \exp(x)\exp(y)\).

OK, where am I now?

At this point you have rules that enable you to differentiate all functions that you can make up using arithmetic operations and substitutions starting with the identity function (\(f(x) = x\)) or with the mysterious exponential function, \(f(x) = \exp(x)\).
In the next section we will extend things so you can start with the sine function, \(f(x) = \sin x\), as well, and differentiate anything you can create. Finally, we will extend the rules to differentiating inverse functions as well.

What is this \(\exp(x)\)?

The number \(\exp(1)\) is called \(e\). The property: \(\exp(x + y)= \exp(x)\exp(y)\) implies that \(\exp(n)\) is \(e^n\), because \(\exp(n)\) is the product of \(\exp(1)\) \(n\) times. We have not as yet defined \(e^a\) when \(a\) is not an integer. When we do define it, we will find that \(\exp(z)\) is \(e^z\) for all real or complex numbers \(z\). Actually we will define \(e^x\) explicitly for rational values of \(x\), and show that it is \(\exp(x)\) and then define \(e^x\) for irrational values to be \(\exp(x)\).
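The claim that \(\exp(n)\) is the product of \(n\) copies of \(e\) is easy to check numerically. A sketch using Python's standard math library:

```python
import math

e = math.exp(1)            # the number exp(1), i.e. e
print(math.exp(5))         # exp(5) computed directly
print(e**5)                # e multiplied by itself 5 times; the same number
```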

And what is \(e\)?

One easy way to answer this question is to write \(e^x\) as a sum of powers of \(x\) multiplied by appropriate coefficients, and then set \(x = 1\). We can figure out the coefficients of each power in the sum by requiring that its derivative be the previous term.

Thus, we know by definition that \(\exp(0)\), the constant term in the sum, is 1. For \(\exp(x)\) to be its own derivative, it must contain something whose derivative is this constant term, \(1\). The term whose derivative is \(1\) is \(x\); the term whose derivative is \(x\) is \(\frac{x^2}{2}\); the term whose derivative is \(\frac{x^2}{2}\) is \(\frac{x^3}{3!}\), where \(n!\) is \(n\) multiplied by \((n-1)!\). And the general term in the sum that is \(e^x\) is \(\frac{x^n}{n!}\). (We already proved this but I like it so much I am repeating it.)

This tells us that \(e\) is \(1 + 1 + \frac{1^2}{2!} + ... + \frac{1}{n!} + ...\)

Exercise 6.5 Sum the first 18 terms of this series using a spreadsheet.

I get something like \(2.718281828459...\) for the number \(e\). It turns out that \(e\) is not rational or even a solution to a polynomial equation. Such numbers are called transcendental.

And how is \(e^x\) defined when \(x\) is not an integer?

When \(x\) is rational, say \(\frac{n}{m}\), \(e^x\) is the \(m^{th}\) root of \(e^n\). Otherwise it is defined by the endless power series proven above:

\[e^x = \sum_{n=0}^{\infty}\frac{x^n}{n!}\]