Section 1.2 Maxima and minima
Among the earliest problems which attracted the attention of students of the calculus were those which required the determination of a maximum or a minimum. Fermat had devised as early as 1629 a procedure applicable to such problems, depending upon principles which in essence, though not in notation, were those of the modern differential calculus. Somewhat nearer to the type of reasoning now in common use are the methods which Newton and Leibniz applied to the determination of maxima and minima, methods which are also characteristic of their two conceptions of the fundamental principles of the differential calculus. Newton argued, in a paper written in 1671 but first published in 1736, that a variable is increasing when its rate of change is positive, and decreasing when its rate is negative, so that at a maximum or a minimum the rate must be zero. Leibniz, on the other hand, in a paper which he published in 1684, conceived the problem geometrically. At a maximum or a minimum point of a curve the tangent must be horizontal and the slope of the tangent zero.
At the present time we know well that from a purely analytical standpoint these two methods are identical. The derivative
\[
f'(x) = \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x}\tag{1.2.1}
\]
of a function \(f(x)\) represents both the rate of change of \(f(x)\) with respect to \(x\) and the slope of the tangent at a point on the graph of \(f(x)\text{.}\) For in the first place the fraction in the second member of (1.2.1) is the average rate of change of \(f(x)\) with respect to \(x\) on the interval from \(x\) to \(x + \Delta x\text{,}\) and its limit as the interval is shortened is therefore rightly called the rate of change of \(f(x)\) at the initial value \(x\) of the interval. In the second place this same quotient is the slope of the secant \(PQ\) in Figure 1.2.1, and its limit is the slope of the tangent at \(P\text{.}\) Thus by the reasoning of either Newton or Leibniz we know that the maxima and minima of \(f(x)\) occur at the values of \(x\) where the derivative \(f'(x)\) is zero.
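To see the criterion in action, here is a small worked example of my own (not from the original text): take \(f(x) = x^2 - 2x\text{.}\) Then
\[
f'(x) = 2x - 2,
\]
which vanishes only at \(x = 1\text{,}\) and \(f(1) = -1\) is in fact the minimum value of \(f\text{.}\)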
It was not easy for the seventeenth-century mathematician to deduce this simple criterion that the derivative \(f'(x)\) must vanish at a maximum or a minimum of \(f(x)\text{.}\) He was immersed in the study of special problems rather than general theories, and had no well-established limiting processes or calculus notations to assist him. It was still more difficult for him to advance one step farther to the realization of the significance of the second derivative \(f''(x)\) in distinguishing between maximum and minimum values. Leibniz in his paper of 1684 was the first to give the criterion. In present-day parlance we say that \(f'(a) = 0, \, f''(a) \geq 0\) are necessary conditions for the value \(f(a)\) to be a minimum, while the conditions \(f'(a) = 0, \, f''(a) \gt 0\) are sufficient to insure a minimum. For a maximum the inequality signs must be changed in sense.
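As a quick check on the sufficient conditions (again my illustration, not the book's): for \(f(x) = x^2\) at \(a = 0\) we have
\[
f'(0) = 0, \qquad f''(0) = 2 \gt 0,
\]
so the sufficient conditions hold and \(f(0) = 0\) is indeed a minimum. For \(f(x) = -x^2\) the inequalities reverse in sense and \(f(0) = 0\) is a maximum.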
Remark 1.2.2.
The words "necessary" and "sufficient" are used a lot in mathematics, though they are perhaps a little old-fashioned. The statement "\(P\) is sufficient for \(Q\)" means "if \(P\) then \(Q\)", or \(P \to Q\text{.}\) The statement "\(P\) is necessary for \(Q\)" means "if \(Q\) then \(P\)", or \(Q\to P\text{.}\) If \(P\) is both necessary and sufficient for \(Q\text{,}\) then you're living in if-and-only-if land: that is, \(P\) and \(Q\) are logically equivalent, or \(P\leftrightarrow Q\text{.}\)
Here's an example that will help you think about this: let's consider the relationship between squares and rectangles.
If we knew a shape was a square, we would definitely know it was a rectangle -- that is, "the shape is a square" is enough information to let you conclude that "the shape is a rectangle". So, "square" is sufficient for "rectangle".
If we knew the shape was a rectangle, we couldn't a priori be sure that it was a square, because there are rectangles that aren't squares. So, "rectangle" is not sufficient for "square".
However, if the shape wasn't a rectangle, then it definitely wouldn't be a square. Therefore, "rectangle" is necessary for "square".
But it doesn't work the other way around. If the shape wasn't a square, it might still be a rectangle. (It might also be a triangle or a hexagon or whatever, of course.) So "square" is not necessary for "rectangle".
Don't be worried if you have to think hard about this every time you see these words. I do too; I had to think really carefully while writing this whole remark. :)
Activity 1.2.1.
As the book is about to say, for the problem of determining a minimum of a function, "it will be noted that the conditions just stated as necessary for a minimum are not identical with those which are sufficient."
What's the difference between the necessary condition and the sufficient condition? I claim that one condition is "stronger" in some sense; which is it?
If the conditions \(f'(a) = 0, \, f''(a) \geq 0\) are, as claimed, necessary but not sufficient for \(f(a)\) to be a minimum, then there must be some function where those conditions hold but you don't have a minimum. Give me an example of such a function.
Write a sentence or two explaining why the necessary condition isn't sufficient, and how we have to "strengthen" it a little to get a sufficient condition.
It will be noted that the conditions just stated as necessary for a minimum are not identical with those which are sufficient. We shall see in Chapter V that a similar undesirable and much more baffling discrepancy occurs in the calculus of variations. For the simple problem of minimizing a function \(f(x)\) the doubtful intermediate case when \(f'(a)\) and \(f''(a)\) are both zero was discussed by Maclaurin (1698-1746) who showed how higher derivatives may be used to obtain criteria which are both necessary and sufficient. For the calculus of variations the corresponding problem offers great difficulty and has never been completely solved.
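For reference, here is a modern statement of the higher-derivative criterion that grew out of Maclaurin's work (my paraphrase, not the book's wording, assuming \(f\) has enough derivatives at \(a\)): suppose
\[
f'(a) = f''(a) = \cdots = f^{(n-1)}(a) = 0, \qquad f^{(n)}(a) \neq 0.
\]
If \(n\) is even, then \(f(a)\) is a minimum when \(f^{(n)}(a) \gt 0\) and a maximum when \(f^{(n)}(a) \lt 0\text{;}\) if \(n\) is odd, then \(f(a)\) is neither a maximum nor a minimum. Provided some derivative of \(f\) at \(a\) is nonzero, this test settles every case, which is the sense in which the criteria are both necessary and sufficient.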