
Section 3.3 A first necessary condition

As in our study of the problems of determining shortest distances, let us suppose that a particular admissible arc \(E_{12}\) with the equation

\begin{equation*} y=y(x)\quad (x_1 \leq x \leq x_2) \end{equation*}

actually furnishes a minimum for the integral \(I\text{,}\) and let us then seek to determine its properties. If \(\eta(x)\) is an admissible function having \(\eta(x_1) = \eta(x_2) =0\) then the family of arcs

\begin{equation} y=y(x)+a\eta(x)\quad (x_1 \leq x \leq x_2)\label{eqn-3-6}\tag{3.3.1} \end{equation}

contains \(E_{12}\) for the parameter value \(a=0\text{,}\) and for small values of \(a\) consists entirely of admissible arcs passing through the points 1 and 2. We must not let \(a\) be too large, as otherwise the corresponding curve of the family might lie partly above the line \(y = \alpha\text{.}\)

Among the values

\begin{equation} I(a)=\int_{x_{1}}^{x_{2}} f\left(y+a\eta, y^{\prime}+a \eta^{\prime}\right)\, d x\label{eqn-3-7}\tag{3.3.2} \end{equation}

of the integral \(I\) along the arcs of the family (3.3.1) the particular value \(I(0)\text{,}\) which is the value along \(E_{12}\text{,}\) must be a minimum, and we have therefore the necessary condition \(I'(0)=0\text{.}\)

Activity 3.3.1.

This is about to be a lot like what we did in Section 2.2.

  1. How is equation (3.3.2) similar to equation (2.2.2)? How is it different?

  2. We'll need to use the Leibniz rule like we did in Activity 2.2.1, but this time, we need to use the multivariable chain rule. Draw a tree diagram representing the intermediate relationships between \(f\) and \(a\text{.}\)

  3. Use your tree diagram, and the chain rule ideas from steps 3, 4, and 5 in Activity 2.2.1, to compute \(\frac{\partial f}{\partial a}\text{.}\)

  4. Conclude by writing down a final expression for \(I'(0)\text{.}\)

The value of the derivative \(I'(0)\text{,}\) found by differentiating equation (3.3.2) with respect to \(a\) and then setting \(a=0\text{,}\) is

\begin{equation} I'(0)=\int_{x_1}^{x_2} \left[ f_y \eta + f_{y'} \eta' \right]\,dx,\label{eqn-3-8}\tag{3.3.3} \end{equation}

where \(f_y\) and \(f_{y'}\) are the partial derivatives of \(f(y, y')\) with the arguments \(y, y'\) belonging to the minimizing arc \(E_{12}\text{.}\)
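In a little more detail (this is essentially the computation Activity 3.3.1 walks you through): differentiating under the integral sign in (3.3.2) and applying the multivariable chain rule to the integrand gives

\begin{equation*} I'(a)=\int_{x_{1}}^{x_{2}} \left[ f_{y}\left(y+a\eta,\, y'+a\eta'\right)\eta + f_{y'}\left(y+a\eta,\, y'+a\eta'\right)\eta' \right]\, dx, \end{equation*}

and setting \(a=0\) replaces the arguments \(y+a\eta\) and \(y'+a\eta'\) by the functions belonging to \(E_{12}\text{,}\) which is exactly (3.3.3).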

Activity 3.3.2.

There are two ways to get to the next important result. The book suggests one way, and I'll suggest another. The subscripts here are going to get a little hairy, so I suggest rewriting (3.3.3) in Leibniz notation:

\begin{equation*} I'(0)=\int_{x_1}^{x_2} \left[ \frac{\partial f}{\partial y} \eta + \frac{\partial f}{\partial y'} \eta' \right]\,dx \end{equation*}
  1. Rewrite (3.3.3) as the sum of two integrals. The second one, involving \(\eta'\text{,}\) is a natural candidate for integration by parts. Do that, then combine the two integrals again; you should now be able to factor out an \(\eta\text{.}\) (A sketch of this integration by parts appears just after this activity.)

    Solution
    \begin{equation*} I'(0) = \int_{x_1}^{x_2} \left[\frac{\partial f}{\partial y} - \frac{d}{dx}\frac{\partial f}{\partial y'}\right] \eta \, dx \end{equation*}
  2. We know \(I'(0) =0\text{,}\) no matter which variation \(\eta\) you choose, so that means we can invoke the corollary to Lemma 2.3.1 that we proved in Activity 2.3.2. What can you conclude?

  3. Another way you could rewrite \(I'(0)\) -- and the way the book does it -- is by messing with the first integral. Pretend it's not a definite integral for a hot second, and just use integration by parts to come up with an antiderivative. (Hint: Let \(u = \eta\text{,}\) and write down some reasonable thing for the antiderivative of \(dv\text{,}\) even though you don't know exactly what it is.)

  4. Take the thing you found for the first integral and differentiate it with respect to \(x\text{.}\)
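For reference, here is a sketch of the integration by parts in step 1, under the assumption (which the book's route below avoids) that \(\frac{\partial f}{\partial y'}\) is differentiable with respect to \(x\) along the arc:

\begin{equation*} \int_{x_1}^{x_2} \frac{\partial f}{\partial y'}\,\eta'\, dx = \left[\frac{\partial f}{\partial y'}\,\eta\right]_{x_1}^{x_2} - \int_{x_1}^{x_2} \frac{d}{dx}\left(\frac{\partial f}{\partial y'}\right)\eta\, dx = -\int_{x_1}^{x_2} \frac{d}{dx}\left(\frac{\partial f}{\partial y'}\right)\eta\, dx, \end{equation*}

since \(\eta(x_1)=\eta(x_2)=0\) kills the boundary term. Recombining this with the integral involving \(\frac{\partial f}{\partial y}\,\eta\) gives the expression in the solution above.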

If we make use of the easily derived formula

\begin{equation*} \eta f_{y}=\frac{d}{dx}\left(\eta \int_{x_{1}}^{x} f_{y} \, dx\right)-\eta^{\prime} \int_{x_{1}}^{x} f_{y}\,dx \end{equation*}

and the fact that \(\eta(x_1) = \eta(x_2) = 0\text{,}\) the expression (3.3.3) takes the form

\begin{equation*} I^{\prime}(0)=\int_{x_{1}}^{x_{2}}\left\{f_{y^{\prime}}-\int_{x_{1}}^{x} f_{y} d x\right\} \eta^{\prime}\, dx. \end{equation*}
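The "easily derived formula" above is just the product rule together with the fundamental theorem of calculus: differentiating the product gives

\begin{equation*} \frac{d}{dx}\left(\eta \int_{x_{1}}^{x} f_{y}\, dx\right) = \eta' \int_{x_{1}}^{x} f_{y}\, dx + \eta\, f_{y}, \end{equation*}

and solving for \(\eta f_y\) gives the formula. When it is substituted into (3.3.3), the term \(\frac{d}{dx}\left(\eta \int_{x_1}^{x} f_y\, dx\right)\) integrates to \(\eta \int_{x_1}^{x} f_y\, dx\) evaluated between \(x_1\) and \(x_2\text{,}\) which vanishes because the integral is zero at \(x_1\) and \(\eta(x_2)=0\text{;}\) only the terms involving \(\eta'\) survive, which is the expression just displayed.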

This expression for \(I'(0)\) must vanish for every family of the form (3.3.1), i.e., for every admissible function \(\eta(x)\) having \(\eta(x_1) = \eta(x_2) = 0\text{,}\) and we find ourselves again in a position to make use of the fundamental lemma of Section 2.3. From that lemma it follows that along the minimizing arc \(E_{12}\) there is a constant \(c\) such that the equation

\begin{equation} f_{y'} = \int_{x_{1}}^{x} f_{y}\, dx + c\label{eqn-3-9}\tag{3.3.4} \end{equation}

holds at every point, and consequently that on every sub-arc of \(E_{12}\) along which the tangent turns continuously the Euler differential equation

\begin{equation} \frac{d}{dx} f_{y'} = f_{y}\label{eqn-3-10}\tag{3.3.5} \end{equation}

must also be satisfied.

The last equation may be readily deduced from equation (3.3.4) by differentiation, since on a sub-arc of \(E_{12}\) where the tangent turns continuously the function \(f_y\) is continuous and the integral in (3.3.4) has the derivative \(f_y\text{.}\)

The equation (3.3.5) is the famous differential equation deduced by Euler in 1744 and called after him Euler's differential equation. Its solutions have been named extremals because they are the only curves which can give the integral \(I\) a maximum or a minimum, i.e., an extreme value. We shall in the following pages apply the term extremal only to those solutions \(y(x)\) which have continuously turning tangents and continuous second derivatives \(y''(x)\text{.}\)

Remark 3.3.2.

Most authors write Euler's equation, also sometimes called the Euler-Lagrange equation, in a slightly different form. Assuming that \(y \in C^2\) (that is, \(y''(x)\) is continuous):

\begin{equation*} \frac{\partial f}{\partial y} - \frac{d}{dx} \left(\frac{\partial f}{\partial y'}\right)=0. \end{equation*}

Note well that the function \(f\) in this differential equation is in fact the integrand of whatever functional we're currently looking at (see the very general form (1.6.2) from Section 1.6). For instance, in the arc length problem, the functional is

\begin{equation*} I = \int_{x_1}^{x_2} \sqrt{1+(y')^2}\,dx \end{equation*}

so that \(f(x,y,y') = \sqrt{1+(y')^2}\text{.}\)
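Likewise, for the brachistochrone problem the time-of-descent functional has (up to a constant factor, which doesn't affect which curve minimizes it) the integrand

\begin{equation*} f(y,y') = \sqrt{\frac{1+(y')^2}{y-\alpha}}, \end{equation*}

which is the \(f\) we will feed into these equations in Activity 3.3.5.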

Activity 3.3.3. UPDATED DIRECTIONS.

The Euler-Lagrange equation involves differentiating \(f_{y'}\) with respect to \(x\text{.}\) Let's assume that we can take as many derivatives as we need -- in particular, let's assume that \(y \in C^2\) -- and compute this derivative to see if something simpler happens in the particular case where the integrand is of the form \(f(y, y')\text{.}\)

  1. Suppose you have some function \(g(y, y'),\) where \(y\) and \(y'\) are functions of \(x\text{.}\) Draw a tree diagram representing the relationships between \(g\text{,}\) \(y\text{,}\) \(y'\text{,}\) and \(x\text{.}\)

  2. Use the multivariable chain rule to write down an expression for \(\dfrac{dg}{dx}\text{.}\)

  3. In our case, \(f_{y'}\) -- or, in Leibniz notation, \(\dfrac{\partial f}{\partial y'}\) -- is really just a function of \(y\) and \(y'\text{,}\) so it can play the role of \(g\) in the previous step. To help you understand what the book says in the next paragraph, use your result to write down an expression for \(\dfrac{d}{dx} \dfrac{\partial f}{\partial y'}\text{.}\)

  4. Compute \(\dfrac{d}{dx}\left(f-y'\dfrac{\partial f}{\partial y'}\right)\text{.}\) Use the chain rule to write out \(\dfrac{df}{dx}\text{.}\) Contrary to how the book is going to do it in the next paragraph, don't write out \(\dfrac{d}{dx} \dfrac{\partial f}{\partial y'}\text{!}\) Simplify a little; you should in particular be able to factor out \(y'\text{.}\)

  5. (New step) Apply the Euler-Lagrange equation to your result. What can you conclude about \(f-y'\dfrac{\partial f}{\partial y'}\text{?}\)


We have not so far made any assumption concerning the existence of a second derivative \(y''(x)\) along our minimizing arc. When there is one, however, we can carry out the differentiations indicated in equation (3.3.5) and obtain

\begin{equation*} \frac{d}{d x} f_{y^{\prime}}-f_{y}=f_{y^{\prime} y} y^{\prime}+f_{y^{\prime} y^{\prime}} y^{\prime \prime}-f_{y}=0 \end{equation*}

from which it follows that along a minimizing arc with a second derivative \(y''(x)\) we have

\begin{equation*} \frac{d}{d x}\left(f-y^{\prime} f_{y^{\prime}}\right)=y^{\prime}\left(f_{y}-f_{y^{\prime} y} y^{\prime}-f_{y^{\prime} y^{\prime}} y^{\prime \prime}\right)=0 \end{equation*}

and hence also

\begin{equation} f - y'f_{y'} = \text{constant.}\label{eqn-3-11}\tag{3.3.6} \end{equation}
Remark 3.3.3.

For me, these equations are hard to read in the subscript notation. Here they are in Leibniz notation instead, with some more parentheses added for clarity.

\begin{equation*} \frac{d}{dx} \left( \frac{\partial f}{\partial y'}\right) - \frac{\partial f}{\partial y} = \left[\frac{\partial}{\partial y } \left(\frac{\partial f}{\partial y'}\right) y' + \frac{\partial}{\partial y'} \left(\frac{\partial f}{\partial y'}\right) y'' \right] - \frac{\partial f}{\partial y} = 0 \end{equation*}

Here's the second one:

\begin{equation*} \frac{d}{dx} \left( f - y' \frac{\partial f}{\partial y'}\right) = y'\left[\frac{\partial f}{\partial y} - \frac{\partial}{\partial y }\left(\frac{\partial f}{\partial y'}\right) y' - \frac{\partial}{\partial y'}\left(\frac{\partial f}{\partial y'}\right) y'' \right] = 0 \end{equation*}

And since the only functions whose derivative is identically zero are the constant functions,

\begin{equation*} f - y' \frac{\partial f}{\partial y'} = \text{constant.} \end{equation*}

The last equation here is called the Beltrami identity; it isn't a new condition so much as a consequence (a first integral) of the Euler-Lagrange equation. Note well that it only holds when \(x\) doesn't appear in the integrand of the functional -- that is, when the integrand is of the form \(f(y, y')\text{.}\)
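For completeness, here is the route Activity 3.3.3 outlines, written out. By the product rule and the chain rule (remembering that \(f = f(y, y')\) with \(y\) and \(y'\) functions of \(x\)),

\begin{equation*} \frac{d}{dx}\left( f - y' \frac{\partial f}{\partial y'}\right) = \left(\frac{\partial f}{\partial y}\, y' + \frac{\partial f}{\partial y'}\, y''\right) - y''\,\frac{\partial f}{\partial y'} - y'\,\frac{d}{dx}\left(\frac{\partial f}{\partial y'}\right) = y'\left(\frac{\partial f}{\partial y} - \frac{d}{dx}\frac{\partial f}{\partial y'}\right), \end{equation*}

and the factor in parentheses vanishes by the Euler-Lagrange equation, so \(f - y' \frac{\partial f}{\partial y'}\) has derivative zero and is therefore constant.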

The reasoning by means of which equation (3.3.6) has been derived is valid not only for the particular integrand function

\begin{equation*} f(y, y') = \sqrt{\frac{1+(y')^2}{y-\alpha}} \end{equation*}

of the brachistochrone problem but also for an arbitrary function \(f(y, y')\) of the two variables \(y\) and \(y'\) which with its derivatives has suitable continuity properties. One can further verify readily that the proofs of equations (3.3.4) and (3.3.5) hold without alteration not only for this case but also for the still more general integral \(I\) to be studied in Chapter V for which the integrand is assumed to be a function \(f(x, y, y')\) of the three variables \(x, y, y'\text{.}\) It is evident, therefore, that the results of this section have great generality and that they may be applied to a wide variety of problems in the calculus of variations. One should note that equation (3.3.6) cannot be expected to hold when \(x\) occurs in the integrand function, since in the differentiations made to obtain it the function \(f\) was supposed to contain only the variables \(y\) and \(y'\text{.}\)

Activity 3.3.4.

The fun thing about special cases is that things are generally way simpler. Let's see how this works in the case of minimizing arc length, which we considered using other methods in Section 2.4.

  1. Apply Euler's equation \(\displaystyle \frac{\partial f}{\partial y} - \frac{d}{dx} \left(\frac{\partial f}{\partial y'}\right)=0\) to the arc length integrand \(f(y, y') = \sqrt{1+(y')^2}\text{:}\)

    1. Compute \(\dfrac{\partial f}{\partial y}\text{.}\)

    2. Compute \(\dfrac{\partial f}{\partial y'}\text{.}\)

    3. Compute \(\dfrac{d}{dx}\left(\dfrac{\partial f}{\partial y'}\right)\text{.}\) (Yikes!)

    4. Combine the pieces appropriately.

  2. Now try the Beltrami identity \(f-y'\frac{\partial f}{\partial y'} = c\text{.}\) You've already computed all the pieces you need.
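If you want to check your work on step 2, here is roughly where the Beltrami identity should land:

\begin{equation*} f - y'\frac{\partial f}{\partial y'} = \sqrt{1+(y')^2} - \frac{(y')^2}{\sqrt{1+(y')^2}} = \frac{1}{\sqrt{1+(y')^2}} = c, \end{equation*}

so \(y'\) is constant along an extremal and the extremals are straight lines, consistent with Section 2.4.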

Activity 3.3.5.

Assuming \(y\in C^2\) (which is true for reasons we'll get to in the next section), we can apply the Beltrami identity (3.3.6) to the brachistochrone problem, since \(x\) doesn't appear in the integrand.

  1. Write down \(f(y, y')\text{.}\)

  2. Compute \(\dfrac{\partial f}{\partial y'}\text{.}\)

  3. Assemble the pieces to produce a differential equation that the brachistochrone curve must satisfy. (If you get a common denominator, you can simplify a little.)

    Solution
    \begin{equation*} \sqrt{\frac{1+(y')^2}{y-\alpha}} - y'\cdot \left[\frac{y'}{\sqrt{y-\alpha}\sqrt{1+(y')^2}}\right] = c \end{equation*}

    After getting a common denominator, this can be simplified to

    \begin{equation*} \frac{1}{\sqrt{y-\alpha}\sqrt{1+(y')^2}} = c. \end{equation*}
We could also apply Euler's equation directly, but it gets real messy, so we may as well use our nicer version. Try it out if you don't believe me; the issue is that \(\frac{d}{dx}\left(\frac{\partial f}{\partial y'}\right)\) has like three different terms in it.
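If it helps to see where this is headed: squaring the simplified equation in the solution above and rearranging gives

\begin{equation*} (y-\alpha)\left(1+(y')^2\right) = \frac{1}{c^2}, \end{equation*}

a first-order differential equation that the brachistochrone curve must satisfy, with the constant determined by the endpoints of the problem.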