# Galerkin method

In mathematics, in the area of numerical analysis, Galerkin methods are a class of methods for converting a continuous operator problem (such as a differential equation) to a discrete problem. In principle, this is the equivalent of applying the method of variation of parameters to a function space, by converting the equation to a weak formulation. Typically one then applies some constraints on the function space to characterize the space with a finite set of basis functions.

The approach is usually credited to Boris Galerkin, but the method was discovered by Walther Ritz,[1] to whom Galerkin refers. A Galerkin method is often named together with the typical approximation methods used, as in the Bubnov–Galerkin method (after Ivan Bubnov), the Petrov–Galerkin method (after Georgii I. Petrov[2][3]) or the Ritz–Galerkin method[4] (after Walther Ritz).

## Introduction with an abstract problem

### A problem in weak formulation

Let us introduce Galerkin's method with an abstract problem posed as a weak formulation on a Hilbert space ${\displaystyle V}$, namely,

Find ${\displaystyle u\in V}$ such that for all ${\displaystyle v\in V}$, ${\displaystyle a(u,v)=f(v)}$.

Here, ${\displaystyle a(\cdot ,\cdot )}$ is a bilinear form (the exact requirements on ${\displaystyle a(\cdot ,\cdot )}$ will be specified later) and ${\displaystyle f}$ is a bounded linear functional on ${\displaystyle V}$.

### Galerkin dimension reduction

Choose a subspace ${\displaystyle V_{n}\subset V}$ of dimension ${\displaystyle n}$ and solve the projected problem:

Find ${\displaystyle u_{n}\in V_{n}}$ such that for all ${\displaystyle v_{n}\in V_{n}}$, ${\displaystyle a(u_{n},v_{n})=f(v_{n})}$.

We call this the Galerkin equation. Notice that the equation has remained unchanged and only the spaces have changed. Reducing the problem to a finite-dimensional vector subspace allows us to numerically compute ${\displaystyle u_{n}}$ as a finite linear combination of the basis vectors in ${\displaystyle V_{n}}$.

### Galerkin orthogonality

The key property of the Galerkin approach is that the error is orthogonal to the chosen subspace in the sense of the bilinear form ${\displaystyle a(\cdot ,\cdot )}$. Since ${\displaystyle V_{n}\subset V}$, we can use ${\displaystyle v_{n}}$ as a test vector in the original equation. Subtracting the Galerkin equation from the original one, we get the Galerkin orthogonality relation for the error ${\displaystyle \epsilon _{n}=u-u_{n}}$ between the solution ${\displaystyle u}$ of the original problem and the solution ${\displaystyle u_{n}}$ of the Galerkin equation:

${\displaystyle a(\epsilon _{n},v_{n})=a(u,v_{n})-a(u_{n},v_{n})=f(v_{n})-f(v_{n})=0.}$
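The orthogonality relation can be checked exactly on a small example. The sketch below assumes a model problem that is not part of the text above: ${\displaystyle a(u,v)=\int _{0}^{1}u'v'\,dx}$ and ${\displaystyle f(v)=\int _{0}^{1}x\,v\,dx}$ (the weak form of ${\displaystyle -u''=x}$ with ${\displaystyle u(0)=u(1)=0}$, whose solution is ${\displaystyle u=(x-x^{3})/6}$), and a one-dimensional subspace spanned by ${\displaystyle e_{1}=x-x^{2}}$. The helper names `pmul`, `pder` and `pint01` are illustrative only.

```python
from fractions import Fraction

# Polynomials on (0, 1) are stored as coefficient lists [c0, c1, c2, ...].

def pmul(p, q):
    """Product of two polynomials."""
    r = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            r[i + j] += pi * qj
    return r

def pder(p):
    """Derivative of a polynomial."""
    return [k * c for k, c in enumerate(p)][1:] or [Fraction(0)]

def pint01(p):
    """Exact integral of a polynomial over (0, 1)."""
    return sum(Fraction(c) / (k + 1) for k, c in enumerate(p))

def a(u, v):
    """Assumed model bilinear form a(u, v) = int_0^1 u'(x) v'(x) dx."""
    return pint01(pmul(pder(u), pder(v)))

def f(v):
    """Assumed functional f(v) = int_0^1 x v(x) dx."""
    return pint01(pmul([0, 1], v))

# Exact solution u = (x - x^3)/6 and the single basis function e1 = x - x^2
# (padded with a zero so both lists have the same length).
u_exact = [Fraction(0), Fraction(1, 6), Fraction(0), Fraction(-1, 6)]
e1 = [Fraction(0), Fraction(1), Fraction(-1), Fraction(0)]

# With dim V_n = 1 the Galerkin equation is a single scalar equation.
coeff = f(e1) / a(e1, e1)
error = [uc - coeff * ec for uc, ec in zip(u_exact, e1)]

# Galerkin orthogonality: a(u - u_n, e1) vanishes exactly.
residual = a(error, e1)
print(coeff, residual)
```

Exact rational arithmetic is used so that the residual is exactly zero rather than zero up to rounding.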

### Matrix form

Since the aim of Galerkin's method is the production of a linear system of equations, we build its matrix form, which can be used to compute the solution algorithmically.

Let ${\displaystyle e_{1},e_{2},\ldots ,e_{n}}$ be a basis for ${\displaystyle V_{n}}$. Then, it is sufficient to use these in turn for testing the Galerkin equation, i.e.: find ${\displaystyle u_{n}\in V_{n}}$ such that

${\displaystyle a(u_{n},e_{i})=f(e_{i})\quad i=1,\ldots ,n.}$

We expand ${\displaystyle u_{n}}$ with respect to this basis, ${\displaystyle u_{n}=\sum _{j=1}^{n}u_{j}e_{j}}$ and insert it into the equation above, to obtain

${\displaystyle a\left(\sum _{j=1}^{n}u_{j}e_{j},e_{i}\right)=\sum _{j=1}^{n}u_{j}a(e_{j},e_{i})=f(e_{i})\quad i=1,\ldots ,n.}$

This is a linear system of equations ${\displaystyle Au=f}$ for the vector of coefficients ${\displaystyle u=(u_{1},\ldots ,u_{n})^{T}}$, where

${\displaystyle A_{ij}=a(e_{j},e_{i}),\quad f_{i}=f(e_{i}).}$
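As a minimal sketch of how this system might be assembled and solved, the following example again assumes the model problem ${\displaystyle a(u,v)=\int _{0}^{1}u'v'\,dx}$, ${\displaystyle f(v)=\int _{0}^{1}x\,v\,dx}$, this time with the polynomial basis ${\displaystyle e_{1}=x-x^{2}}$, ${\displaystyle e_{2}=x^{2}-x^{3}}$. The problem, the basis and all function names (`assemble`, `solve`, and the polynomial helpers) are assumptions made for illustration.

```python
from fractions import Fraction

# Polynomials on (0, 1) are stored as coefficient lists [c0, c1, c2, ...].

def pmul(p, q):
    """Product of two polynomials."""
    r = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            r[i + j] += pi * qj
    return r

def pder(p):
    """Derivative of a polynomial."""
    return [k * c for k, c in enumerate(p)][1:] or [Fraction(0)]

def pint01(p):
    """Exact integral of a polynomial over (0, 1)."""
    return sum(Fraction(c) / (k + 1) for k, c in enumerate(p))

def a(u, v):
    """Assumed model bilinear form a(u, v) = int_0^1 u'(x) v'(x) dx."""
    return pint01(pmul(pder(u), pder(v)))

def f(v):
    """Assumed functional f(v) = int_0^1 x v(x) dx."""
    return pint01(pmul([0, 1], v))

def assemble(basis):
    """Build A[i][j] = a(e_j, e_i) and rhs[i] = f(e_i)."""
    n = len(basis)
    A = [[a(basis[j], basis[i]) for j in range(n)] for i in range(n)]
    rhs = [f(basis[i]) for i in range(n)]
    return A, rhs

def solve(A, b):
    """Gauss-Jordan elimination in exact arithmetic (A assumed nonsingular)."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[piv] = M[piv], M[col]
        M[col] = [x / M[col][col] for x in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]

basis = [[0, 1, -1], [0, 0, 1, -1]]  # e1 = x - x^2, e2 = x^2 - x^3
A, rhs = assemble(basis)
coeffs = solve(A, rhs)
print(A, rhs, coeffs)  # coeffs are [1/6, 1/6], i.e. u_2 = (x - x^3)/6
```

In this particular example the exact solution ${\displaystyle (x-x^{3})/6}$ happens to lie in the chosen subspace, so the Galerkin solution reproduces it; in general only an approximation is obtained.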

#### Symmetry of the matrix

Due to the definition of the matrix entries, ${\displaystyle A_{ij}=a(e_{j},e_{i})}$, the matrix of the Galerkin equation is symmetric if and only if the bilinear form ${\displaystyle a(\cdot ,\cdot )}$ is symmetric on ${\displaystyle V_{n}}$.

## Analysis of Galerkin methods

Here, we will restrict ourselves to symmetric bilinear forms, that is

${\displaystyle a(u,v)=a(v,u).}$

While this is not really a restriction of Galerkin methods, the application of the standard theory becomes much simpler. Furthermore, a Petrov–Galerkin method may be required in the nonsymmetric case.

The analysis of these methods proceeds in two steps. First, we will show that the Galerkin equation is a well-posed problem in the sense of Hadamard and therefore admits a unique solution. In the second step, we study the quality of approximation of the Galerkin solution ${\displaystyle u_{n}}$.

The analysis will mostly rest on two properties of the bilinear form, namely

• Boundedness: there is a constant ${\displaystyle C>0}$ such that for all ${\displaystyle u,v\in V}$,
${\displaystyle a(u,v)\leq C\|u\|\,\|v\|;}$
• Ellipticity: there is a constant ${\displaystyle c>0}$ such that for all ${\displaystyle u\in V}$,
${\displaystyle a(u,u)\geq c\|u\|^{2}.}$

By the Lax–Milgram theorem (see weak formulation), these two conditions imply well-posedness of the original problem in weak formulation. All norms in the following sections will be norms for which the above inequalities hold (such a norm is often called an energy norm).
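
As an illustration (an assumed model problem, not part of the general theory), take ${\displaystyle V=H_{0}^{1}(0,1)}$ with the norm ${\displaystyle \|u\|=\left(\int _{0}^{1}u'(x)^{2}\,dx\right)^{1/2}}$ and the bilinear form ${\displaystyle a(u,v)=\int _{0}^{1}u'(x)v'(x)\,dx}$, which is the weak form of ${\displaystyle -u''=g}$ with homogeneous Dirichlet boundary conditions. The Cauchy–Schwarz inequality gives

${\displaystyle a(u,v)\leq \|u\|\,\|v\|,\qquad a(u,u)=\|u\|^{2},}$

so both conditions hold with ${\displaystyle C=c=1}$. If instead ${\displaystyle a(u,v)=\int _{0}^{1}k(x)u'(x)v'(x)\,dx}$ with a coefficient satisfying ${\displaystyle 0<k_{\min }\leq k(x)\leq k_{\max }}$, the same argument gives ${\displaystyle C=k_{\max }}$ and ${\displaystyle c=k_{\min }}$.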

### Well-posedness of the Galerkin equation

Since ${\displaystyle V_{n}\subset V}$, boundedness and ellipticity of the bilinear form carry over to ${\displaystyle V_{n}}$ with the same constants ${\displaystyle C}$ and ${\displaystyle c}$. Therefore, the well-posedness of the Galerkin problem is inherited from the well-posedness of the original problem.

### Quasi-best approximation (Céa's lemma)

The error ${\displaystyle u-u_{n}}$ between the original and the Galerkin solution admits the estimate

${\displaystyle \|u-u_{n}\|\leq {\frac {C}{c}}\inf _{v_{n}\in V_{n}}\|u-v_{n}\|.}$

This means that, up to the constant ${\displaystyle C/c}$, the Galerkin solution ${\displaystyle u_{n}}$ is as close to the original solution ${\displaystyle u}$ as any other vector in ${\displaystyle V_{n}}$. In particular, it is sufficient to study approximation by the spaces ${\displaystyle V_{n}}$, completely forgetting about the equation being solved.
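
The estimate can be observed on a small example in which the constant ${\displaystyle C/c}$ is larger than one. The sketch below assumes the bilinear form ${\displaystyle a(u,v)=\int _{0}^{1}(1+x)u'v'\,dx}$ with the norm ${\displaystyle \|u\|=\left(\int _{0}^{1}u'^{2}\,dx\right)^{1/2}}$, for which ${\displaystyle C=2}$ and ${\displaystyle c=1}$, a manufactured solution ${\displaystyle u=x-x^{3}}$ (so that ${\displaystyle f(v)}$ is defined as ${\displaystyle a(u,v)}$), and the subspace spanned by ${\displaystyle e_{1}=x-x^{2}}$. All of these choices and all function names are assumptions for illustration.

```python
from fractions import Fraction

# Polynomials on (0, 1) are stored as coefficient lists [c0, c1, c2, ...].

def pmul(p, q):
    """Product of two polynomials."""
    r = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            r[i + j] += pi * qj
    return r

def pder(p):
    """Derivative of a polynomial."""
    return [k * c for k, c in enumerate(p)][1:] or [Fraction(0)]

def pint01(p):
    """Exact integral of a polynomial over (0, 1)."""
    return sum(Fraction(c) / (k + 1) for k, c in enumerate(p))

K = [1, 1]  # assumed coefficient k(x) = 1 + x, so 1 <= k(x) <= 2 on (0, 1)

def inner(u, v):
    """Inner product int_0^1 u' v' dx; it induces the norm used below."""
    return pint01(pmul(pder(u), pder(v)))

def a(u, v):
    """Assumed bilinear form a(u, v) = int_0^1 k(x) u' v' dx (C = 2, c = 1)."""
    return pint01(pmul(K, pmul(pder(u), pder(v))))

def dist_sq(u, t, e):
    """Squared norm of u - t*e, expanded using bilinearity of the inner product."""
    return inner(u, u) - 2 * t * inner(u, e) + t * t * inner(e, e)

u_exact = [0, 1, 0, -1]  # manufactured solution u = x - x^3
e1 = [0, 1, -1]          # basis function e1 = x - x^2

# Galerkin solution t_galerkin * e1, using f(e1) := a(u, e1).
t_galerkin = a(u_exact, e1) / a(e1, e1)
# Best approximation of u in span{e1} with respect to the norm.
t_best = inner(u_exact, e1) / inner(e1, e1)

err_galerkin_sq = dist_sq(u_exact, t_galerkin, e1)
err_best_sq = dist_sq(u_exact, t_best, e1)
C, c = 2, 1

# Squared form of the estimate: ||u - u_n||^2 <= (C/c)^2 * inf ||u - v_n||^2.
print(err_best_sq, err_galerkin_sq, Fraction(C, c) ** 2 * err_best_sq)
```

Here the Galerkin solution is not the best approximation in the chosen norm, but its error stays within the factor ${\displaystyle C/c}$ of the best possible one, as the lemma predicts. Squared norms are compared to keep the arithmetic exact.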

#### Proof

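Since the proof is very simple and illustrates the basic principle behind all Galerkin methods, we include it here. Let ${\displaystyle v_{n}\in V_{n}}$ be arbitrary. Because ${\displaystyle v_{n}-u_{n}\in V_{n}}$, Galerkin orthogonality gives ${\displaystyle a(u-u_{n},v_{n}-u_{n})=0}$, so ${\displaystyle u-u_{n}}$ may be replaced by ${\displaystyle u-v_{n}}$ in the second argument (equals sign in the middle). Together with ellipticity and boundedness of the bilinear form (the two inequalities), we have: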

${\displaystyle c\|u-u_{n}\|^{2}\leq a(u-u_{n},u-u_{n})=a(u-u_{n},u-v_{n})\leq C\|u-u_{n}\|\,\|u-v_{n}\|.}$

If ${\displaystyle u=u_{n}}$, the estimate holds trivially. Otherwise, dividing by ${\displaystyle c\|u-u_{n}\|}$ and taking the infimum over all possible ${\displaystyle v_{n}}$ yields the lemma.