# Sturm–Liouville theory

In mathematics and its applications, classical Sturm–Liouville theory, named after Jacques Charles François Sturm (18031855) and Joseph Liouville (18091882), is the theory of real second-order linear differential equations of the form:

${\displaystyle (1)\qquad \qquad {\frac {\mathrm {d} }{\mathrm {d} x}}\left[p(x){\frac {\mathrm {d} y}{\mathrm {d} x}}\right]+q(x)y=-\lambda w(x)y,}$

for given coefficient functions p(x), q(x), and w(x) > 0 and an unknown function y of the free variable x. The function w(x), sometimes denoted r(x), is called the weight or density function.

In the simplest case where all coefficients are continuous on the finite closed interval [a,b] and p has continuous derivative, a function y is called a solution if it is continuously differentiable on (a,b) and satisfies the equation (1) at every point in (a,b). (In the case of more general p(x), q(x), w(x), the solutions must be understood in a weak sense.) In addition, y is typically required to satisfy some boundary conditions at a and b. Each such equation (1) together with its boundary condtions constitutes a Sturm-Liouville (S-L) problem.

The value of λ is not specified in the equation: finding the λ for which there exists a non-trivial solution is part of the given S-L problem. Such values of λ, when they exist, are called the eigenvalues of the problem, and the corresponding solutions are the eigenfunctions associated to each λ. This terminology is because the solutions correspond to the eigenvalues and eigenfunctions of a Hermitian differential operator in an appropriate function space. Sturm-Liouville theory studies the existence and asymptotic behavior of the eigenvalues, the corresponding qualitative theory of the eigenfunctions and their completeness in the function space.

This theory is important in applied mathematics, where S-L problems occur very commonly, particularly when dealing with separable linear partial differential equations. For example, in quantum mechanics, the one-dimensional time-independent Schrödinger equation is a S-L problem.

A Sturm-Liouville problem is said to be regular if p(x), w(x) > 0, and p(x), p′(x), q(x), w(x) are continuous functions over the finite interval [a,b], and the problem has separated boundary conditions of the form:

${\displaystyle (2)\qquad \qquad \alpha _{1}y(a)+\alpha _{2}y'(a)=0\qquad \qquad \alpha _{1}^{2}+\alpha _{2}^{2}>0,}$

${\displaystyle (3)\qquad \qquad \beta _{1}y(b)\,+\,\beta _{2}y'(b)=0\qquad \qquad \beta _{1}^{2}+\beta _{2}^{2}>0.}$

The main result of Sturm–Liouville theory states that, for the regular Sturm–Liouville problem (1),(2),(3):

• The eigenvalues λ1, λ2, λ3, ... are real and can be numbered so that
${\displaystyle \lambda _{1}<\lambda _{2}<\lambda _{3}<\cdots <\lambda _{n}<\cdots \to \infty ;}$
• Corresponding to each eigenvalue λn is a unique (up to constant multiple) eigenfunction yn(x) with exactly n−1 zeros in (a,b), called the nth fundamental solution.
• The normalized eigenfunctions form an orthonormal basis under the w-weighted inner product in the Hilbert space ${\displaystyle L^{2}([a,b],w(x)\,dx)}$. That is:
${\displaystyle \langle y_{n},y_{m}\rangle =\int _{a}^{b}y_{n}(x)y_{m}(x)w(x)\,\mathrm {d} x=\delta _{mn},\ \ }$ where δmn is the Kronecker delta.

## Sturm–Liouville form

The differential equation (1) is said to be in Sturm–Liouville form or self-adjoint form. All second-order linear ordinary differential equations can be recast in the form on the left-hand side of (1) by multiplying both sides of the equation by an appropriate integrating factor (although the same is not true of second-order partial differential equations, or if y is a vector). Some examples are below.

### Bessel equation

${\displaystyle x^{2}y''+xy'+\left(x^{2}-\nu ^{2}\right)y=0}$

which can be written in Sturm–Liouville form (first by dividing through by x, then by collapsing the two terms on the left into one term) as

${\displaystyle \left(xy'\right)'+\left(x-{\frac {\nu ^{2}}{x}}\right)y=0.}$

### Legendre equation

${\displaystyle \left(1-x^{2}\right)y''-2xy'+\nu (\nu +1)y=0}$

which can easily be put into Sturm–Liouville form, since d/dx(1 − x2) = −2x, so the Legendre equation is equivalent to

${\displaystyle \left(\left(1-x^{2}\right)y'\right)'+\nu (\nu +1)y=0}$

### Example using an integrating factor

${\displaystyle x^{3}y''-xy'+2y=0}$

Divide throughout by x3:

${\displaystyle y''-{\frac {1}{x^{2}}}y'+{\frac {2}{x^{3}}}y=0}$

Multiplying throughout by an integrating factor of

${\displaystyle \mu (x)=\exp \left(\int -{\frac {1}{x^{2}}}\,\mathrm {d} x\right)=e^{\frac {1}{x}},}$

gives

${\displaystyle e^{\frac {1}{x}}y''-{\frac {e^{\frac {1}{x}}}{x^{2}}}y'+{\frac {2e^{\frac {1}{x}}}{x^{3}}}y=0}$

which can be easily put into Sturm–Liouville form since

${\displaystyle {\frac {\mathrm {d} }{\mathrm {d} x}}e^{\frac {1}{x}}=-{\frac {e^{\frac {1}{x}}}{x^{2}}}}$

so the differential equation is equivalent to

${\displaystyle \left(e^{\frac {1}{x}}y'\right)'+{\frac {2e^{\frac {1}{x}}}{x^{3}}}y=0.}$

### Integrating factor for general second-order equation

${\displaystyle P(x)y''+Q(x)y'+R(x)y=0}$

Multiplying through by the integrating factor

${\displaystyle \mu (x)={\frac {1}{P(x)}}\exp \left(\int {\frac {Q(x)}{P(x)}}\,\mathrm {d} x\right),}$

and then collecting gives the Sturm–Liouville form:

${\displaystyle {\frac {\mathrm {d} }{\mathrm {d} x}}{\bigl (}\mu (x)P(x)y'{\bigr )}+\mu (x)R(x)y=0,}$

or, explicitly:

${\displaystyle {\frac {\mathrm {d} }{\mathrm {d} x}}\left(\exp \left(\int {\frac {Q(x)}{P(x)}}\,\mathrm {d} x\right)y'\right)+{\frac {R(x)}{P(x)}}\exp \left(\int {\frac {Q(x)}{P(x)}}\,\mathrm {d} x\right)y=0.}$

## Sturm–Liouville equations as self-adjoint differential operators

The mapping defined by:

${\displaystyle Lu=-{\frac {1}{w(x)}}\left({\frac {\mathrm {d} }{\mathrm {d} x}}\left[p(x)\,{\frac {\mathrm {d} u}{\mathrm {d} x}}\right]+q(x)u\right)}$

can be viewed as a linear operator L mapping a function u to another function Lu, and it can be studied in the context of functional analysis. In fact, equation (1) can be written as

${\displaystyle Lu=\lambda u.}$

This is precisely the eigenvalue problem; that is, one seeks eigenvalues λ1, λ2, λ3,... and the corresponding eigenvectors u1, u2, u3,... of the L operator. The proper setting for this problem is the Hilbert space ${\displaystyle L^{2}([a,b],w(x)\,dx)}$ with scalar product

${\displaystyle \langle f,g\rangle =\int _{a}^{b}{\overline {f(x)}}g(x)w(x)\,\mathrm {d} x.}$

In this space L is defined on sufficiently smooth functions which satisfy the above regular boundary conditions. Moreover, L is a self-adjoint operator:

${\displaystyle \langle Lf,g\rangle =\langle f,Lg\rangle .}$

This can be seen formally by using integration by parts twice, where the boundary terms vanish by virtue of the boundary conditions. It then follows that the eigenvalues of a Sturm–Liouville operator are real and that eigenfunctions of L corresponding to different eigenvalues are orthogonal. However, this operator is unbounded and hence existence of an orthonormal basis of eigenfunctions is not evident. To overcome this problem, one looks at the resolvent

${\displaystyle \left(L-z\right)^{-1},\qquad z\in \mathbb {C} ,}$

where z is chosen to be some real number which is not an eigenvalue. Then, computing the resolvent amounts to solving the inhomogeneous equation, which can be done using the variation of parameters formula. This shows that the resolvent is an integral operator with a continuous symmetric kernel (the Green's function of the problem). As a consequence of the Arzelà–Ascoli theorem, this integral operator is compact and existence of a sequence of eigenvalues αn which converge to 0 and eigenfunctions which form an orthonormal basis follows from the spectral theorem for compact operators. Finally, note that

${\displaystyle \left(L-z\right)^{-1}u=\alpha u,\qquad Lu=\left(z+\alpha ^{-1}\right)u,}$

are equivalent, so we may take ${\displaystyle \lambda =z+\alpha ^{-1}}$ with the same eigenfunctions.

If the interval is unbounded, or if the coefficients have singularities at the boundary points, one calls L singular. In this case, the spectrum no longer consists of eigenvalues alone and can contain a continuous component. There is still an associated eigenfunction expansion (similar to Fourier series versus Fourier transform). This is important in quantum mechanics, since the one-dimensional time-independent Schrödinger equation is a special case of a S-L equation.

## Second order differential equation with boundary conditions

The operator L in the previous section can be written

${\displaystyle Lu={\frac {p}{w(x)}}u''+{\frac {p'}{w(x)}}u'+{\frac {q}{w(x)}}u.}$

If Ay″ + By′ + Cy = f(x) is an ordinary second order differential equation, and assuming for the sake of simplicity that A(x) ≠ 0, the first member can be transformed into a Sturm–Liouville operator L by solving the following system of equations:

${\displaystyle p=Aw,\quad p'=Bw,\quad q=Cw.}$

It suffices to solve the first two equations, which in turn amounts to solve (Aw)′ = Bw, or

${\displaystyle w'={\frac {B-A'}{A}}w:=\alpha w.}$

A solution is

${\displaystyle w=\exp \left(\int \alpha \,\mathrm {d} x\right),\quad p=A\exp \left(\int \alpha \,\mathrm {d} x\right),\quad q=C\exp \left(\int \alpha \,\mathrm {d} x\right).}$

Thus, the first member of the proposed equation can be transformed in a Sturm–Liouville operator L as in the previous section, and the equation can now be written as

${\displaystyle Ly=f.}$

In general, if initial conditions at some point are specified, for example y(a) = 0 and y′(a) = 0, a second order differential equation can be solved using ordinary methods and the Picard–Lindelöf theorem ensures that the differential equation has a unique solution in a neighbourhood of the point where the initial conditions have been specified.

But if in place of specifying initial values at a single point, it is desired to specify values at two different points (so-called boundary values), e.g. y(a) = 0 and y(b) = 1, the problem turns out to be much more difficult. Notice that by adding a suitable known differentiable function to y, whose values at a and b satisfy the desired boundary conditions, and injecting inside the proposed differential equation, it can be assumed without loss of generality that the boundary conditions are of the form y(a) = 0 and y(b) = 0.

From this point, the Sturm–Liouville theory comes in play; indeed, for a large class of functions f, f can be reconstructed from a set of orthonormal eigenfunction ui of the associated Liouville problem (it may be helpful to think about Fourier decomposition), with corresponding eigenvalue λi:

${\displaystyle f(x)=\sum _{i}\alpha _{i}u_{i}(x),\quad \alpha _{i}\in {\mathbb {R} }.}$

Then a solution to the proposed equation is simply:

${\displaystyle y=\sum _{i}{\frac {\alpha _{i}}{\lambda _{i}}}u_{i},}$

as can be checked immediately. Note, however, that the solution will be valid in the open interval (a,b) = {x | a < x < b} since it may fail in the boundaries. An example of this procedure is given in the next section.

## Example

We wish to find a function u(x) which solves the following Sturm–Liouville problem:

${\displaystyle (4)\qquad \qquad Lu=-{\frac {\mathrm {d} ^{2}u}{\mathrm {d} x^{2}}}=\lambda u}$

where the unknowns are λ and u(x). As above, we must add boundary conditions, we take for example

${\displaystyle u(0)=u(\pi )=0.}$

Observe that if k is any integer, then the function

${\displaystyle u(x)=\sin kx}$

is a solution with eigenvalue λ = k2. We know that the solutions of a S-L problem form an orthogonal basis, and we know from Fourier series that this set of sinusoidal functions is an orthogonal basis. Since orthogonal bases are always maximal (by definition) we conclude that the S-L problem in this case has no other eigenvectors.

Given the preceding, let us now solve the inhomogeneous problem

${\displaystyle Lu=x,\qquad x\in (0,\pi )}$

with the same boundary conditions. In this case, we must write f(x) = x in a Fourier series. The reader may check, either by integrating eikxx dx or by consulting a table of Fourier transforms, that we thus obtain

${\displaystyle Lu=\sum _{k=1}^{\infty }-2{\frac {\left(-1\right)^{k}}{k}}\sin kx.}$

This particular Fourier series is troublesome because of its poor convergence properties. It is not clear a priori whether the series converges pointwise. Because of Fourier analysis, since the Fourier coefficients are "square-summable", the Fourier series converges in L2 which is all we need for this particular theory to function. We mention for the interested reader that in this case we may rely on a result which says that Fourier series converge at every point of differentiability, and at jump points (the function x, considered as a periodic function, has a jump at π) converges to the average of the left and right limits (see convergence of Fourier series).

Therefore, by using formula (4), we obtain that the solution is

${\displaystyle u=\sum _{k=1}^{\infty }2{\frac {(-1)^{k}}{k^{3}}}\sin kx.}$

In this case, we could have found the answer using antidifferentiation. This technique yields

${\displaystyle u={\tfrac {1}{6}}(x^{3}-\pi ^{2}x),}$

whose Fourier series agrees with the solution we found. The antidifferentiation technique is no longer useful in most cases when the differential equation is in many variables.

## Application to normal modes

Certain partial differential equations can be solved with the help of S-L theory. Suppose we are interested in the vibrational modes of a thin membrane, held in a rectangular frame, 0 ≤ xL1, 0 ≤ yL2. The equation of motion for the vertical membrane's displacement, W(x,y,t) is given by the wave equation:

${\displaystyle {\frac {\partial ^{2}W}{\partial x^{2}}}+{\frac {\partial ^{2}W}{\partial y^{2}}}={\frac {1}{c^{2}}}{\frac {\partial ^{2}W}{\partial t^{2}}}.}$

The method of separation of variables suggests looking first for solutions of the simple form W = X(x) × Y(y) × T(t). For such a function W the partial differential equation becomes X/X + Y/Y = 1/c2 T/T. Since the three terms of this equation are functions of x, y, t separately, they must be constants. For example, the first term gives X″ = λX for a constant λ. The boundary conditions ("held in a rectangular frame") are W = 0 when x = 0, L1 or y = 0, L2 and define the simplest possible S-L eigenvalue problems as in the example, yielding the "normal mode solutions" for W with harmonic time dependence,

${\displaystyle W_{mn}(x,y,t)=A_{mn}\sin \left({\frac {m\pi x}{L_{1}}}\right)\sin \left({\frac {n\pi y}{L_{2}}}\right)\cos \left(\omega _{mn}t\right)}$

where m and n are non-zero integers, Amn are arbitrary constants, and

${\displaystyle \omega _{mn}^{2}=c^{2}\left({\frac {m^{2}\pi ^{2}}{L_{1}^{2}}}+{\frac {n^{2}\pi ^{2}}{L_{2}^{2}}}\right).}$

The functions Wmn form a basis for the Hilbert space of (generalized) solutions of the wave equation; that is, an arbitrary solution W can be decomposed into a sum of these modes, which vibrate at their individual frequencies ωmn. This representation may require a convergent infinite sum.

## Representation of solutions and numerical calculation

The Sturm–Liouville differential equation (1) with boundary conditions may be solved analytically, which can be exact or provide an approximation, by the Rayleigh–Ritz method, or by the matrix-variational method of Gerck et al.[1][2][3]

Numerically, a variety of methods are also available. In difficult cases, one may need to carry out the intermediate calculations to several hundred decimal places of accuracy in order to obtain the eigenvalues correctly to a few decimal places.

1. Shooting methods.[4][5] These methods proceed by guessing a value of λ, solving an initial value problem defined by the boundary conditions at one endpoint, say, a, of the interval [a,b], comparing the value this solution takes at the other endpoint b with the other desired boundary condition, and finally increasing or decreasing λ as necessary to correct the original value. This strategy is not applicable for locating complex eigenvalues.
2. Finite difference method.
3. The spectral parameter power series (SPPS) method[6] makes use of a generalization of the following fact about second-order ordinary differential equations: if y is a solution that does not vanish at any point of [a,b], then the function
${\displaystyle y(x)\int _{a}^{x}{\frac {\mathrm {d} t}{p(t)y(t)^{2}}}}$
is a solution of the same equation and is linearly independent from y. Further, all solutions are linear combinations of these two solutions. In the SPPS algorithm, one must begin with an arbitrary value λ
0
(often λ
0
= 0
; it does not need to be an eigenvalue) and any solution y0 of (1) with λ = λ
0
which does not vanish on [a,b]. (Discussion below of ways to find appropriate y0 and λ
0
.) Two sequences of functions X(n)(t), (n)(t) on [a,b], referred to as iterated integrals, are defined recursively as follows. First when n = 0, they are taken to be identically equal to 1 on [a,b]. To obtain the next functions they are multiplied alternately by 1/py2
0
and wy2
0
and integrated, specifically, for n > 0: ${\displaystyle (5)\qquad \qquad X^{(n)}(t)={\begin{cases}\displaystyle -\int _{a}^{x}X^{(n-1)}(t)p(t)^{-1}y_{0}(t)^{-2}\,\mathrm {d} t&n{\text{ odd}},\\[6pt]\displaystyle \quad \int _{a}^{x}X^{(n-1)}(t)y_{0}(t)^{2}w(t)\,\mathrm {d} t&n{\text{ even}}\end{cases}}}$
${\displaystyle (6)\qquad \qquad {\tilde {X}}^{(n)}(t)={\begin{cases}\displaystyle \quad \int _{a}^{x}{\tilde {X}}^{(n-1)}(t)y_{0}(t)^{2}w(t)\,\mathrm {d} t&n{\text{ odd}},\\[6pt]\displaystyle -\int _{a}^{x}{\tilde {X}}^{(n-1)}(t)p(t)^{-1}y_{0}(t)^{-2}\,\mathrm {d} t&n{\text{ even.}}\end{cases}}}$
The resulting iterated integrals are now applied as coefficients in the following two power series in λ:
${\displaystyle u_{0}=y_{0}\sum _{k=0}^{\infty }\left(\lambda -\lambda _{0}^{*}\right)^{k}{\tilde {X}}^{(2k)},}$
${\displaystyle u_{1}=y_{0}\sum _{k=0}^{\infty }\left(\lambda -\lambda _{0}^{*}\right)^{k}X^{(2k+1)}.}$
Then for any λ (real or complex), u0 and u1 are linearly independent solutions of the corresponding equation (1). (The functions p(x) and q(x) take part in this construction through their influence on the choice of y0.)
Next one chooses coefficients c0 and c1 so that the combination y = c0u0 + c1u1 satisfies the first boundary condition (2). This is simple to do since X(n)(a) = 0 and (n)(a) = 0, for n > 0. The values of X(n)(b) and (n)(b) provide the values of u0(b) and u1(b) and the derivatives u0(b) and u0(b), so the second boundary condition (3) becomes an equation in a power series in λ. For numerical work one may truncate this series to a finite number of terms, producing a calculable polynomial in λ whose roots are approximations of the sought-after eigenvalues.
When λ = λ0, this reduces to the original construction described above for a solution linearly independent to a given one. The representations (5) and (6) also have theoretical applications in Sturm–Liouville theory.[6]

### Construction of a nonvanishing solution

The SPPS method can, itself, be used to find a starting solution y0. Consider the equation (py′)′ = μqy; i.e., q, w, and λ are replaced in (1) by 0, q, and μ respectively. Then the constant function 1 is a nonvanishing solution corresponding to the eigenvalue μ0 = 0. While there is no guarantee that u0 or u1 will not vanish, the complex function y0 = u0 + iu1 will never vanish because two linearly-independent solutions of a regular S-L equation cannot vanish simultaneously as a consequence of the Sturm separation theorem. This trick gives a solution y0 of (1) for the value λ0 = 0. In practice if (1) has real coefficients, the solutions based on y0 will have very small imaginary parts which must be discarded.

## Application to partial differential equations

For a linear second-order in one spatial dimension and first-order in time of the form:

${\displaystyle f(x){\frac {\partial ^{2}u}{\partial x^{2}}}+g(x){\frac {\partial u}{\partial x}}+h(x)u={\frac {\partial u}{\partial t}}+k(t)u,}$
${\displaystyle u(a,t)=u(b,t)=0,\qquad u(x,0)=s(x).}$

Separting variables, we assume that

${\displaystyle u(x,t)=X(x)T(t).}$

Then our above partial differential equation may be written as:

${\displaystyle {\frac {{\hat {L}}X(x)}{X(x)}}={\frac {{\hat {M}}T(t)}{T(t)}}}$

where

${\displaystyle {\hat {L}}=f(x){\frac {\mathrm {d} ^{2}}{\mathrm {d} x^{2}}}+g(x){\frac {\mathrm {d} }{\mathrm {d} x}}+h(x),\qquad {\hat {M}}={\frac {\mathrm {d} }{\mathrm {d} t}}+k(t).}$

Since, by definition, and X(x) are independent of time t and and T(t) are independent of position x, then both sides of the above equation must be equal to a constant:

${\displaystyle {\hat {L}}X(x)=\lambda X(x),\qquad X(a)=X(b)=0,\qquad {\hat {M}}T(t)=\lambda T(t).}$

The first of these equations must be solved as a Sturm–Liouville problem in terms of the eigenfunctions Xn(x) and eigenvalues λn. The second of these equations can be analytically solved once the eigenvalues are known.

${\displaystyle {\frac {\mathrm {d} }{\mathrm {d} t}}T_{n}(t)={\bigl (}\lambda _{n}-k(t){\bigr )}T_{n}(t)}$
${\displaystyle T_{n}(t)=a_{n}\exp \left(\lambda _{n}t-\int _{0}^{t}k(\tau )\,\mathrm {d} \tau \right)}$
${\displaystyle u(x,t)=\sum _{n}a_{n}X_{n}(x)\exp \left(\lambda _{n}t-\int _{0}^{t}k(\tau )\,\mathrm {d} \tau \right)}$
${\displaystyle a_{n}={\frac {{\bigl \langle }X_{n}(x),s(x){\bigr \rangle }}{{\bigl \langle }X_{n}(x),X_{n}(x){\bigr \rangle }}}}$

where

${\displaystyle {\bigl \langle }y(x),z(x){\bigr \rangle }=\int _{a}^{b}y(x)z(x)w(x)\,\mathrm {d} x,}$
${\displaystyle w(x)={\frac {\exp \left(\int {\frac {g(x)}{f(x)}}\,\mathrm {d} x\right)}{f(x)}}.}$