Lagrangian Mechanics

Introduction

Lagrangian mechanics is a formulation of classical physics that is an alternative to Newtonian mechanics. In Lagrangian mechanics, we start from a single function, the Lagrangian, L. Of all the trajectories a particle could follow on a plot of position versus time, the one it actually follows is the one that makes the time integral of L (a quantity called the action) stationary. The equation that expresses this condition can be used to derive Newton’s equations of motion. One of the advantages of the Lagrangian system is that the equations of classical physics all stem from a single expression: the Lagrangian. Another important feature of Lagrangian mechanics is that its methodology can be adapted to advanced theories in physics such as relativity, quantum mechanics and quantum field theory.

In the following section, we will derive the Euler-Lagrange equation, the key equation in this formulation.

Derivation of Euler-Lagrange equation

Adapted from Khan Faculty

Particles in nature follow a specific trajectory through space, which translates to a specific curve on a plot of position versus time. If we specify two points on this plot, a=(t_1, x(t_1)) and b=(t_2,x(t_2)), there are an infinite number of possible paths that a particle can take in traveling from a to b.

Yet, given a specific set of initial conditions (position and velocity) and an environment of potential energy (which may vary with time and creates forces), the particle takes only one of these paths. The Euler-Lagrange equation expresses the law that determines this path. Here is its derivation:

The problem we want to solve can be stated as follows:

Find x=x(t) such that the functional

S=\int\limits_{t_1}^{t_2}L(t,x,x^\prime)dt

is stationary given the boundary conditions x(t_1)=x_1 and x(t_2)=x_2.
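To make the idea of a functional concrete, here is a minimal numerical sketch (not part of the original derivation). It approximates S by a sum over small time steps. The free-particle Lagrangian, the endpoints x(0)=0 and x(1)=1, and the names action, free_particle, straight and wiggly are all assumptions chosen for illustration.

```python
import math

def action(path, t1, t2, lagrangian):
    """Approximate S = integral of L(t, x, x') dt with the midpoint rule."""
    n = len(path) - 1                      # number of time steps
    dt = (t2 - t1) / n
    total = 0.0
    for i in range(n):
        t_mid = t1 + (i + 0.5) * dt
        x_mid = 0.5 * (path[i] + path[i + 1])
        v = (path[i + 1] - path[i]) / dt   # finite-difference velocity
        total += lagrangian(t_mid, x_mid, v) * dt
    return total

def free_particle(t, x, v, m=1.0):
    # Assumed example Lagrangian: kinetic energy only, no potential.
    return 0.5 * m * v * v

n = 1000
times = [i / n for i in range(n + 1)]
straight = [t for t in times]                              # x(t) = t
wiggly = [t + 0.1 * math.sin(math.pi * t) for t in times]  # same endpoints

s_straight = action(straight, 0.0, 1.0, free_particle)
s_wiggly = action(wiggly, 0.0, 1.0, free_particle)
print(s_straight, s_wiggly)
```

Each path goes in and a single number comes out, which is what makes S a functional. With these assumed choices, the wiggly path, which shares its endpoints with the straight one, yields the larger value of S.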

Here are two important points to clarify before we really get into it:

First, what is a functional? A functional is simply a function of a function: it takes an entire function as its input and returns a single number. In the case we are developing, the functional with which we will be concerned is S. Its input is the path x(t), and it is built from the Lagrangian, L, which depends on time, on position, x(t), and on velocity, \frac{dx}{dt}=x^\prime.

Second, what does it mean to be stationary?

Well, in single or multivariable calculus, it means that the function is at 1) a local maximum (i.e., its value rises to a local maximum value then decreases again), 2) a local minimum (i.e., its value decreases to a local minimum then increases again) or 3) a saddle point (i.e., for a function of two or more variables, a point that is a minimum in one direction and a maximum in another direction). Why is it called stationary? Because at every one of these 3 types of points, the slope of the function (which is its derivative) is zero. A small step away from such a point therefore leaves the value of the function unchanged to first order.

It has a closely related meaning as it applies to the calculus of variations (which is what we’ll be doing in this derivation). In a calculus of variations problem, the goal is to find the function (here, the path) that makes a functional stationary; finding the shortest pathway between two points is a classic example. As we shall see, the way that this is done is to deviate slightly from a candidate path and ask how much the functional changes. If the first-order change is zero for every small deviation that keeps the endpoints fixed, the functional is stationary on that path.

Now, on to the meat of the derivation.

Let’s assume that a function x(t) makes S stationary and satisfies the boundary conditions stated above.

Introduce a function, \eta(t), which, as we’ll see in a minute, creates a deviation from x(t). Since the deviated paths have to pass through the same endpoints as x(t), \eta(t_1)=0 and \eta(t_2)=0 (i.e., the deviation at the endpoints is zero). Apart from that, \eta(t) is arbitrary.

Define a family of functions (which describe an infinite number of curves), \bar{x}(t)=x(t)+\epsilon\eta(t), where \epsilon is a real number. These functions satisfy the same boundary conditions as x(t). Note also that all of the functions we’ve defined here are continuous functions that have a second derivative that’s defined.

What we want is for S(\epsilon)=\int\limits_{t_1}^{t_2}L(t,\bar{x},\bar{x}^\prime)dt to be stationary at \epsilon=0, where \bar{x}(t) coincides with x(t). We deal with S(\epsilon) because, once x(t) and \eta(t) are chosen, \epsilon is the only thing that changes the value of S. Why?

  • First, t doesn’t change S because it’s an independent variable which gets “integrated out” of the expression for S.
  • Second, \bar{x}(t)=x(t)+\epsilon\eta(t). Both x(t) and \eta(t) are “fixed” functions of t.
  • Therefore, the only thing that will change the value of \bar{x}(t), and thus 1) create different curves that we can check to see if that curve is the stationary curve, x(t) and 2) in turn, change the value of S, is \epsilon.
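The role of \epsilon can be checked numerically. The sketch below is only an illustration under assumed choices that are not in the derivation: a free particle with L=\frac12 m (x^\prime)^2, the stationary path x(t)=t on 0\le t\le 1, and the deviation \eta(t)=\sin(\pi t). The function name action_of_eps is hypothetical.

```python
import math

def action_of_eps(eps, n=2000, m=1.0):
    """S(eps) for x_bar(t) = x(t) + eps*eta(t) on 0 <= t <= 1."""
    dt = 1.0 / n

    def x_bar(t):
        # Assumed example: x(t) = t and eta(t) = sin(pi t), so eta(0) = eta(1) = 0.
        return t + eps * math.sin(math.pi * t)

    total = 0.0
    for i in range(n):
        v = (x_bar((i + 1) * dt) - x_bar(i * dt)) / dt  # finite-difference velocity
        total += 0.5 * m * v * v * dt                   # free-particle Lagrangian
    return total

h = 1e-3
s0 = action_of_eps(0.0)
# Central-difference estimates of dS/d(eps) at eps = 0 and at eps = 0.5.
slope_at_zero = (action_of_eps(h) - action_of_eps(-h)) / (2 * h)
slope_at_half = (action_of_eps(0.5 + h) - action_of_eps(0.5 - h)) / (2 * h)
print(s0, slope_at_zero, slope_at_half)
```

For this assumed example the slope of S(\epsilon) vanishes (up to rounding) at \epsilon=0 and is clearly nonzero at \epsilon=0.5, which is the behavior the derivation relies on.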

To make S(\epsilon) stationary, what we have to do is set its derivative equal to zero: \frac{d S(\epsilon)}{d\epsilon}=0

If \frac{d S(\epsilon)}{d\epsilon}=0 that means that

\begin{array}{rcl}  \frac{d S(\epsilon)}{d\epsilon}&=&\frac{d}{d\epsilon}\int\limits_{t_1}^{t_2}L(t,\bar{x},\bar{x}^\prime)dt=0\\  &=&\int\limits_{t_1}^{t_2}\frac{\partial}{\partial \epsilon}L(t,\bar{x},\bar{x}^\prime)dt=0  \end{array}

We apply the chain rule to the derivative inside the integral sign:

\int\limits_{t_1}^{t_2}\left[  \frac{\partial L}{\partial \bar{x}}\frac{\partial\bar{x}}{\partial \epsilon} + \frac{\partial L}{\partial\bar{x}^\prime}\frac{\partial\bar{x}^\prime}{\partial\epsilon}   \right]dt = 0

Next we need to find the derivatives of \bar{x} and \bar{x}^\prime:

\begin{array}{rcl}  \bar{x}(t)&=&x(t) + \epsilon\eta(t)\\ \bar{x}^\prime(t)&=&x^\prime(t) + \epsilon\eta^\prime(t)\\ \frac{\partial \bar{x}}{\partial \epsilon} &=& \eta\\ \frac{\partial \bar{x}^\prime}{\partial \epsilon} &=& \eta^\prime  \end{array}

Substituting the values for \frac{\partial \bar{x}}{\partial \epsilon} and \frac{\partial \bar{x}^\prime}{\partial \epsilon} we just calculated:

\int\limits_{t_1}^{t_2}\left[  \frac{\partial L}{\partial \bar{x}}\eta + \frac{\partial L}{\partial\bar{x}^\prime}\eta^\prime   \right]dt = 0

We can apply integration by parts to the term \frac{\partial L}{\partial\bar{x}^\prime}\eta^\prime. In integration by parts:

\int u dv = uv - \int v du \tag{3.1}

Let u=\frac{\partial L}{\partial\bar{x}^\prime} and dv=\eta^\prime dt, so that du=\frac{d}{dt}\frac{\partial L}{\partial \bar{x}^\prime}dt and v=\eta. Then

\int\limits_{t_1}^{t_2}\frac{\partial L}{\partial\bar{x}^\prime}\eta^\prime dt = \left[\frac{\partial L}{\partial \bar{x}^\prime}\eta\right]_{t_1}^{t_2} - \int\limits_{t_1}^{t_2} \eta \frac{d}{dt}\frac{\partial L}{\partial \bar{x}^\prime}dt

The first term on the right is the product uv evaluated at the two endpoints. From our definition of \eta(t) above, \eta(t_1)=\eta(t_2)=0. Therefore, this boundary term is zero. That leaves us with

\begin{array}{rcl}  \int\limits_{t_1}^{t_2}\frac{\partial L}{\partial\bar{x}^\prime}\eta^\prime dt &=& \left[\frac{\partial L}{\partial \bar{x}^\prime}\eta\right]_{t_1}^{t_2} - \int\limits_{t_1}^{t_2} \eta \frac{d}{dt}\frac{\partial L}{\partial \bar{x}^\prime}dt\\  &=& 0 - \int\limits_{t_1}^{t_2} \eta \frac{d}{dt}\frac{\partial L}{\partial \bar{x}^\prime}dt\\  &=& - \int\limits_{t_1}^{t_2} \eta \frac{d}{dt}\frac{\partial L}{\partial \bar{x}^\prime}dt  \end{array}
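As a quick sanity check of this integration-by-parts step, the sketch below compares both sides numerically. The functions are hypothetical stand-ins, not taken from the derivation: u(t)=t^2 plays the role of \frac{\partial L}{\partial\bar{x}^\prime}, and \eta(t)=\sin(\pi t) is a deviation that vanishes at t=0 and t=1.

```python
import math

def integrate(f, a, b, n=20000):
    """Midpoint-rule approximation of the integral of f from a to b."""
    dt = (b - a) / n
    return sum(f(a + (i + 0.5) * dt) for i in range(n)) * dt

# Hypothetical stand-ins: u(t) plays the role of dL/dx', eta(t) is the deviation.
u = lambda t: t * t
du_dt = lambda t: 2.0 * t
eta = lambda t: math.sin(math.pi * t)                # eta(0) = eta(1) = 0
deta_dt = lambda t: math.pi * math.cos(math.pi * t)

lhs = integrate(lambda t: u(t) * deta_dt(t), 0.0, 1.0)   # integral of u * eta'
rhs = -integrate(lambda t: eta(t) * du_dt(t), 0.0, 1.0)  # minus integral of eta * u'
boundary = u(1.0) * eta(1.0) - u(0.0) * eta(0.0)         # [u * eta] at the endpoints
print(lhs, rhs, boundary)
```

The two integrals agree and the boundary term is zero up to rounding, because \eta vanishes at both endpoints.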

Substituting in the last expression, we get:

\int\limits_{t_1}^{t_2}\left[  \frac{\partial L}{\partial \bar{x}}\eta - \eta \frac{d}{dt}\frac{\partial L}{\partial \bar{x}^\prime}   \right]dt = 0

Factor out \eta:

\int\limits_{t_1}^{t_2}\left[  \frac{\partial L}{\partial \bar{x}} - \frac{d}{dt}\frac{\partial L}{\partial \bar{x}^\prime}   \right]\eta dt = 0

When \epsilon=0

\bar{x}(t)=x(t) + \epsilon\eta(t)=x(t) + 0\cdot\eta(t)=x(t)

and

\bar{x}^\prime(t)=x^\prime(t) + \epsilon\eta^\prime(t)=x^\prime(t) + 0\cdot\eta^\prime(t)=x^\prime(t).

Therefore, the above integral becomes:

\int\limits_{t_1}^{t_2}\left[  \frac{\partial L}{\partial x} - \frac{d}{dt}\frac{\partial L}{\partial x^\prime}   \right]\eta dt = 0

So we’ve met our criteria for making S stationary:

\frac{d S(\epsilon)}{d \epsilon}=0     and
\epsilon=0

There’s just one thing left to do.

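In order for the above integral to equal zero for every allowed choice of \eta, the factor multiplying \eta has to equal zero. The integrand is the product of two entities, [\cdots] and \eta. Because \eta is an arbitrary function (apart from vanishing at the endpoints), we could always choose it to be nonzero, with the same sign as [\cdots], wherever [\cdots] is nonzero, and the integral would then be positive rather than zero. Therefore, the stuff inside the brackets has to equal zero at every t: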

\frac{\partial L}{\partial x} - \frac{d}{dt}\frac{\partial L}{\partial x^\prime}=0 \tag{1}

This equation is the Euler-Lagrange equation.
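To see the equation at work, here is a rough numerical sketch that evaluates its left-hand side along a trial path using finite differences. Everything specific in it is an assumption for illustration: the spring-like Lagrangian L=\frac12 m (x^\prime)^2-\frac12 k x^2, the values of M and K, and the names el_residual, true_path and wrong_path.

```python
import math

M, K = 1.0, 4.0                 # assumed mass and spring constant
OMEGA = math.sqrt(K / M)

def lagrangian(t, x, v):
    # Assumed example: kinetic energy minus spring potential energy.
    return 0.5 * M * v * v - 0.5 * K * x * x

def dL_dx(t, x, v, h=1e-5):
    return (lagrangian(t, x + h, v) - lagrangian(t, x - h, v)) / (2 * h)

def dL_dv(t, x, v, h=1e-5):
    return (lagrangian(t, x, v + h) - lagrangian(t, x, v - h)) / (2 * h)

def el_residual(x_of_t, t, h=1e-4):
    """dL/dx - d/dt(dL/dx') evaluated along the path x_of_t at time t."""
    def velocity(s):
        return (x_of_t(s + h) - x_of_t(s - h)) / (2 * h)

    def momentum(s):
        return dL_dv(s, x_of_t(s), velocity(s))

    dp_dt = (momentum(t + h) - momentum(t - h)) / (2 * h)
    return dL_dx(t, x_of_t(t), velocity(t)) - dp_dt

true_path = lambda t: math.cos(OMEGA * t)         # solves the equation of motion
wrong_path = lambda t: math.cos(1.5 * OMEGA * t)  # same start, wrong frequency

res_true = el_residual(true_path, 0.7)
res_wrong = el_residual(wrong_path, 0.7)
print(res_true, res_wrong)
```

Along the path that actually solves the equation of motion the residual is zero up to finite-difference error, while the path with the wrong frequency leaves a residual of order one.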

Application of the Euler-Lagrange equation to physics

We derived the following equation and identified it as the Euler-Lagrange equation:

\frac{\partial L}{\partial x} - \frac{d}{dt}\frac{\partial L}{\partial x^\prime}=0

Furthermore, we’ve said that this equation follows from requiring that an entity, S=\int\limits_{t_1}^{t_2}L(t,x,x^\prime)dt, be stationary, i.e., that it not change to first order when the path is varied slightly.

As it applies to physics:

  • S is called the action.
  • L(t,x,x^\prime) is called the Lagrangian of a system.
  • \delta S=\delta\int\limits_{t_1}^{t_2}L(t,x,x^\prime)dt=0 is referred to as the principle of least action, or more appropriately, the principle of stationary action.

As stated at the outset of this article, Newtonian mechanics can be derived from this principle. The specific example we’ll use to illustrate this is derivation of Newton’s second law of motion from the Euler-Lagrange equation.

In classical mechanics, for a particle moving through a potential energy field, the Lagrangian, L, equals the kinetic energy of the particle minus the potential energy affecting that particle:

L=\underbrace{\frac12 m \dot{x}^2}_{\text{kinetic energy}} - \underbrace{V(x)}_{\text{potential energy}}     where

m = mass of the particle
\dot{x}  = \frac{dx}{dt} = x^\prime = velocity of the particle
V = potential energy, which is a function of position, x

In these equations, we’re using \dot{x} to represent the time derivative of x instead of x^\prime or \frac{dx}{dt} because this is the convention used for the Euler-Lagrange equation in most textbooks and papers. Also, if we were considering a particle moving in 3-dimensional space, we would need one Euler-Lagrange equation for each coordinate (x, y and z). However, in this example, we’ll consider a particle moving in only one direction of space: the x direction. Thus, a single equation suffices. We still have to take partial derivatives when differentiating L, though, because L is a function of more than one variable (x and \dot{x}).

\frac{\partial L}{\partial x} - \frac{d}{dt}\frac{\partial L}{\partial \dot{x}}=0\quad\Rightarrow\quad\frac{\partial L}{\partial x}=\frac{d}{dt}\frac{\partial L}{\partial \dot{x}}

The only term that depends on x is the potential energy term, V(x). Therefore,

\frac{\partial L}{\partial x}=-\frac{d V(x)}{d x}=\text{force}

The only term that depends on \dot{x} is the kinetic energy term, {\frac12 m \dot{x}^2}. Therefore,

\frac{\partial L}{\partial \dot{x}} = \frac12 m \frac{d\, \dot{x}^2}{d \dot{x}}=2\cdot\frac12 m \dot{x} = m \dot{x} = \text{momentum}

That means that

\begin{array}{rcl}  \frac{d}{dt}\frac{\partial L}{\partial \dot{x}}=\frac{\partial L}{\partial x}&\Rightarrow& m \ddot{x} = -\frac{d V(x)}{d x}\\&\Rightarrow& \text{mass} \times \text{acceleration}=\text{force}  \end{array}

which, of course, is Newton’s second law of motion.
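The resulting equation of motion, m \ddot{x} = -\frac{d V(x)}{d x}, can be integrated numerically. The sketch below is one possible way to do it, under assumptions that are not in the text: a spring potential V(x)=\frac12 k x^2 with k=4, unit mass, and the velocity Verlet stepping scheme. The names potential, force and simulate are hypothetical.

```python
import math

M = 1.0  # assumed mass

def potential(x, k=4.0):
    # Assumed example potential: a spring, V(x) = k x^2 / 2.
    return 0.5 * k * x * x

def force(x, h=1e-6):
    # F = -dV/dx, estimated with a central finite difference.
    return -(potential(x + h) - potential(x - h)) / (2 * h)

def simulate(x0, v0, t_end, dt=1e-3):
    """Integrate m x'' = -dV/dx with the velocity Verlet scheme."""
    x, v = x0, v0
    a = force(x) / M
    for _ in range(round(t_end / dt)):
        x += v * dt + 0.5 * a * dt * dt
        a_new = force(x) / M
        v += 0.5 * (a + a_new) * dt
        a = a_new
    return x, v

x_final, v_final = simulate(x0=1.0, v0=0.0, t_end=1.0)
x_exact = math.cos(2.0 * 1.0)  # cos(omega t) with omega = sqrt(k/m) = 2
print(x_final, x_exact)
```

For this assumed potential the exact solution is x(t)=\cos(\omega t) with \omega=\sqrt{k/m}, so the simulated position can be compared against it directly.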

The Euler-Lagrange equation is frequently written:

\frac{\partial L}{\partial q_i} - \frac{d}{dt}\frac{\partial L}{\partial \dot{q_i}}=0

In this equation, the q_i are called generalized coordinates and the \dot{q_i} generalized velocities. They could be based on any kind of coordinates (e.g., polar, cylindrical, spherical, etc.). q_i represents spatial position and \dot{q_i} represents velocity based on such coordinates. The i subscripts label the individual coordinates (degrees of freedom) of the system, which may belong to one particle or to many. To describe an entire ensemble of particles, we would need one equation for each coordinate of each particle.
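As an illustration of a generalized coordinate, consider a simple pendulum. This example is an assumption added here, not part of the original discussion: a mass m hangs on a rigid rod of length \ell, the angle \theta is measured from the downward vertical, g is the gravitational acceleration, and the potential energy is taken to be zero at the height of the pivot. With q=\theta, the same steps as above give:

```latex
\begin{array}{rcl}
L &=& \frac12 m \ell^2 \dot{\theta}^2 + m g \ell \cos\theta\\
\frac{\partial L}{\partial \theta} &=& -m g \ell \sin\theta\\
\frac{\partial L}{\partial \dot{\theta}} &=& m \ell^2 \dot{\theta}\\
\frac{\partial L}{\partial \theta} - \frac{d}{dt}\frac{\partial L}{\partial \dot{\theta}}=0 &\Rightarrow& m \ell^2 \ddot{\theta} = -m g \ell \sin\theta \quad\Rightarrow\quad \ddot{\theta} = -\frac{g}{\ell}\sin\theta
\end{array}
```

Here \frac{\partial L}{\partial \dot{\theta}} is the angular momentum and \frac{\partial L}{\partial \theta} is the torque, so the angular form of Newton’s second law comes out without ever resolving forces along x and y.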

Note that, although the equations of motion for some physical systems may be quite complicated, the principle of stationary action can be applied to all of them, including systems that require such advanced descriptions as general relativity, quantum mechanics, quantum field theory and string theory. For such varied systems, the Lagrangian may take many different forms. How are these Lagrangians determined? Sometimes from some previous knowledge; sometimes from an educated guess based on some physical theory; sometimes from results of experiments – in short, by whatever method works.

The simple Lagrangian that describes particles moving through a potential energy field that we worked with above is a case in point. As far as I can tell, there is no specific intuition to explain the form of this expression except that it works.

There is some intuition, however, to explain why the stationary action principle works. This intuition stems from quantum mechanics and is well-explained in an article from UVA. The argument goes something like this:

According to quantum mechanics, particles aren’t really particles. They’re waves. At least, the probability of measuring them in a given location is described as a wave. Every possible path contributes a wave whose phase is set by the action along that path. For paths near the stationary-action path, the action hardly changes from one path to the next, so their waves are nearly in phase and interfere with each other constructively. On the other hand, for paths farther away, the action changes rapidly from path to path, so their waves tend to be out of phase and interfere with each other destructively. The locations where the waves interfere constructively are the locations where the light (or other “particle”) is most likely to be found. These locations constitute the path described by the stationary action principle.
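The interference argument can be mimicked with a toy calculation. The sketch below is a cartoon under stated assumptions, not a real quantum computation: paths are labeled by a single number eps (their deviation from the stationary path), the phase of each path’s wave is taken to be k*eps**2 with an arbitrarily chosen k=50, and the names mean_phasor and window are hypothetical.

```python
import cmath

def mean_phasor(eps_values, k=50.0):
    """Magnitude of the average of exp(i*phase), with toy phase k*eps**2."""
    total = sum(cmath.exp(1j * k * e * e) for e in eps_values)
    return abs(total) / len(eps_values)

def window(center, half_width=0.1, n=201):
    """Evenly spaced eps values in [center - half_width, center + half_width]."""
    step = 2 * half_width / (n - 1)
    return [center - half_width + i * step for i in range(n)]

near = mean_phasor(window(0.0))  # paths close to the stationary one
far = mean_phasor(window(1.0))   # paths far from it
print(near, far)
```

A value close to 1 means the waves add up almost perfectly in phase, and a value close to 0 means they largely cancel. In this toy model the window around the stationary path gives the former and the window far from it gives the latter.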