# Dual norm

In functional analysis, the dual norm is a measure of the "size" of each continuous linear functional defined on a normed vector space.

## Definition

Let ${\displaystyle X}$ be a normed vector space with norm ${\displaystyle |\cdot |}$ and let ${\displaystyle X^{*}}$ be the dual space. The dual norm of a continuous linear functional ${\displaystyle f}$ belonging to ${\displaystyle X^{*}}$ is defined to be the real number

${\displaystyle \|f\|:=\sup\{|f(x)|:x\in X,|x|\leq 1\}}$

where ${\displaystyle \sup }$ denotes the supremum.[1]

The map ${\displaystyle f\mapsto \|f\|}$ defines a norm on ${\displaystyle X^{*}}$. (See Theorems 1 and 2 below.)

The dual norm is a special case of the operator norm defined for each (bounded) linear map between normed vector spaces.

The topology on ${\displaystyle X^{*}}$ induced by the dual norm ${\displaystyle \|\cdot \|}$ is at least as strong as (that is, finer than or equal to) the weak-* topology on ${\displaystyle X^{*}}$.

If the ground field of ${\displaystyle X}$ is complete then ${\displaystyle X^{*}}$ is a Banach space.

## The double dual of a normed linear space

The double dual (or second dual) ${\displaystyle X^{**}}$ of ${\displaystyle X}$ is the dual of the normed vector space ${\displaystyle X^{*}}$. There is a natural map ${\displaystyle \varphi :X\to X^{**}}$. Indeed, for each ${\displaystyle v}$ in ${\displaystyle X}$ and each ${\displaystyle w^{*}}$ in ${\displaystyle X^{*}}$ define

${\displaystyle \varphi (v)(w^{*}):=w^{*}(v).}$

The map ${\displaystyle \varphi }$ is linear, injective, and distance preserving.[2] In particular, if ${\displaystyle X}$ is complete (i.e. a Banach space), then ${\displaystyle \varphi }$ is an isometry onto a closed subspace of ${\displaystyle X^{**}}$.[3]

In general, the map ${\displaystyle \varphi }$ is not surjective. For example, this is the case if ${\displaystyle X}$ is the Banach space ${\displaystyle L^{\infty }}$ consisting of bounded functions on the real line with the supremum norm. (See ${\displaystyle L^{p}}$ space.) If ${\displaystyle \varphi }$ is surjective, then ${\displaystyle X}$ is said to be a reflexive Banach space. If ${\displaystyle 1<p<\infty }$, then the space ${\displaystyle L^{p}}$ is a reflexive Banach space.

## Mathematical Optimization

Let ${\displaystyle \|\cdot \|}$ be a norm on ${\displaystyle \mathbb {R} ^{n}.}$ The associated dual norm, denoted ${\displaystyle \|\cdot \|_{*},}$ is defined as

${\displaystyle \|z\|_{*}=\sup\{z^{\intercal }x\;|\;\|x\|\leq 1\}.}$

(This can be shown to be a norm.) The dual norm can be interpreted as the operator norm of ${\displaystyle z^{\intercal }}$, viewed as a ${\displaystyle 1\times n}$ matrix, with the norm ${\displaystyle \|\cdot \|}$ on ${\displaystyle \mathbb {R} ^{n}}$ and the absolute value on ${\displaystyle \mathbb {R} }$:

${\displaystyle \|z\|_{*}=\sup\{|z^{\intercal }x|\;|\;\|x\|\leq 1\}.}$

From the definition of dual norm we have the inequality

${\displaystyle z^{\intercal }x=\|x\|\left(z^{\intercal }{\frac {x}{\|x\|}}\right)\leq \|x\|\|z\|_{*}}$

which holds for all x and z.[4] The dual of the dual norm is the original norm: we have ${\displaystyle \|x\|_{**}=\|x\|}$ for all x. (This need not hold in infinite-dimensional vector spaces.)
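The inequality and its tightness can be checked numerically. The following is a minimal sketch, assuming the pairing of the ${\displaystyle \ell _{1}}$-norm with the ${\displaystyle \ell _{\infty }}$-norm as its dual; the helper names (`dot`, `norm_1`, `norm_inf`, `sign`, `inequality_holds`) are illustrative and not taken from any source.

```python
import random


def dot(z, x):
    """Inner product z^T x of two equal-length sequences."""
    return sum(zi * xi for zi, xi in zip(z, x))


def norm_1(x):
    """l1-norm: sum of absolute values."""
    return sum(abs(xi) for xi in x)


def norm_inf(x):
    """l-infinity norm: largest absolute value."""
    return max(abs(xi) for xi in x)


def sign(t):
    """Sign of a real number: -1, 0, or 1."""
    return (t > 0) - (t < 0)


def inequality_holds(trials=1000, n=5, seed=0):
    """Check z^T x <= ||x||_1 * ||z||_inf on random vectors."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = [rng.uniform(-1, 1) for _ in range(n)]
        z = [rng.uniform(-1, 1) for _ in range(n)]
        # Small slack absorbs floating-point rounding.
        if dot(z, x) > norm_1(x) * norm_inf(z) + 1e-12:
            return False
    return True


# Tightness: for a given x, the choice z = sign(x) gives equality.
x = [0.5, -2.0, 0.0, 1.5]
z_tight = [sign(xi) for xi in x]
print(inequality_holds())                    # True
print(dot(z_tight, x))                       # 4.0
print(norm_1(x) * norm_inf(z_tight))         # 4.0
```

A random search of this kind only illustrates the inequality; it does not prove it.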

The dual of the Euclidean norm is the Euclidean norm, since

${\displaystyle \sup\{z^{\intercal }x\;|\;\|x\|_{2}\leq 1\}=\|z\|_{2}.}$

(This follows from the Cauchy–Schwarz inequality; for nonzero z, the value of x that maximises ${\displaystyle z^{\intercal }x}$ over ${\displaystyle \|x\|_{2}\leq 1}$ is ${\displaystyle {\tfrac {z}{\|z\|_{2}}}}$.)

The dual of the ${\displaystyle \ell _{\infty }}$-norm is the ${\displaystyle \ell _{1}}$-norm:

${\displaystyle \sup\{z^{\intercal }x\;|\;\|x\|_{\infty }\leq 1\}=\sum _{i=1}^{n}|z_{i}|=\|z\|_{1},}$

and the dual of the ${\displaystyle \ell _{1}}$-norm is the ${\displaystyle \ell _{\infty }}$-norm.
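Both dualities can be verified by brute force in small dimensions. The sketch below relies on a standard fact not stated above: a linear function on a polytope attains its supremum at a vertex. The unit ball of the ${\displaystyle \ell _{\infty }}$-norm has the ${\displaystyle 2^{n}}$ sign vectors as vertices, and the unit ball of the ${\displaystyle \ell _{1}}$-norm has the ${\displaystyle 2n}$ signed coordinate vectors. The function names are hypothetical.

```python
from itertools import product


def dot(z, x):
    """Inner product z^T x."""
    return sum(zi * xi for zi, xi in zip(z, x))


def dual_of_linf(z):
    """sup of z^T x over ||x||_inf <= 1, taken over the 2^n cube vertices."""
    return max(dot(z, v) for v in product((-1.0, 1.0), repeat=len(z)))


def dual_of_l1(z):
    """sup of z^T x over ||x||_1 <= 1, taken over the 2n vertices +/- e_i."""
    return max(s * zi for zi in z for s in (-1.0, 1.0))


z = [3.0, -1.0, 2.0]
print(dual_of_linf(z))  # 6.0, the l1-norm of z
print(dual_of_l1(z))    # 3.0, the l-infinity norm of z
```

Enumerating the cube's vertices costs ${\displaystyle 2^{n}}$ evaluations, so this is only practical as a check for small ${\displaystyle n}$; the closed forms above are what one would use in practice.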

More generally, Hölder's inequality shows that the dual of the ${\displaystyle \ell _{p}}$-norm is the ${\displaystyle \ell _{q}}$-norm, where q satisfies ${\displaystyle {\tfrac {1}{p}}+{\tfrac {1}{q}}=1}$, i.e., ${\displaystyle q={\tfrac {p}{p-1}}.}$
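For ${\displaystyle 1<p<\infty }$ the supremum is attained, and a maximiser can be written down explicitly. The sketch below assumes the standard equality case of Hölder's inequality, ${\displaystyle x_{i}=\operatorname {sign} (z_{i})\,(|z_{i}|/\|z\|_{q})^{q-1}}$, and checks numerically that ${\displaystyle \|x\|_{p}=1}$ and ${\displaystyle z^{\intercal }x=\|z\|_{q}}$. The function names are illustrative.

```python
def lp_norm(x, p):
    """l_p norm of a sequence, for finite p >= 1."""
    return sum(abs(xi) ** p for xi in x) ** (1.0 / p)


def conjugate_exponent(p):
    """The q with 1/p + 1/q = 1, for 1 < p < infinity."""
    return p / (p - 1.0)


def sign(t):
    """Sign of a real number: -1, 0, or 1."""
    return (t > 0) - (t < 0)


def holder_maximizer(z, p):
    """Vector x with ||x||_p = 1 and z^T x = ||z||_q (z nonzero, 1 < p < inf)."""
    q = conjugate_exponent(p)
    zq = lp_norm(z, q)
    return [sign(zi) * (abs(zi) / zq) ** (q - 1.0) for zi in z]


z = [1.0, -2.0, 3.0]
p = 3.0
q = conjugate_exponent(p)          # 1.5
x = holder_maximizer(z, p)

# Up to floating-point rounding, x lies on the l_p unit sphere
# and z^T x equals the l_q norm of z.
print(abs(lp_norm(x, p) - 1.0) < 1e-9)
print(abs(sum(zi * xi for zi, xi in zip(z, x)) - lp_norm(z, q)) < 1e-9)
```

For ${\displaystyle p=2}$ the formula reduces to ${\displaystyle x=z/\|z\|_{2}}$, the maximiser given for the Euclidean norm.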

As another example, consider the ${\displaystyle \ell _{2}}$- or spectral norm on ${\displaystyle \mathbb {R} ^{m\times n}}$. The associated dual norm is

${\displaystyle \|Z\|_{2*}=\sup\{\mathrm {\bf {tr}} (Z^{\intercal }X)\;|\;\|X\|_{2}\leq 1\},}$

which turns out to be the sum of the singular values,

${\displaystyle \|Z\|_{2*}=\sigma _{1}(Z)+\cdots +\sigma _{r}(Z)=\mathrm {\bf {tr}} ({\sqrt {Z^{\intercal }Z}}),}$

where ${\displaystyle r=\mathrm {\bf {rank}} Z.}$ This norm is sometimes called the nuclear norm.[5]

## Examples

### Dual norm for matrices

The Frobenius norm defined by

${\displaystyle \|A\|_{\text{F}}={\sqrt {\sum _{i=1}^{m}\sum _{j=1}^{n}\left|a_{ij}\right|^{2}}}={\sqrt {\operatorname {trace} (A^{*}A)}}={\sqrt {\sum _{i=1}^{\min\{m,n\}}\sigma _{i}^{2}}}}$

is self-dual, i.e., its dual norm is ${\displaystyle \|\cdot \|'_{\text{F}}=\|\cdot \|_{\text{F}}.}$

The spectral norm, a special case of the induced norm when ${\displaystyle p=2}$, is defined by the maximum singular value of a matrix, that is,

${\displaystyle \|A\|_{2}=\sigma _{\max }(A).}$

Its dual norm is the nuclear norm, which is defined by

${\displaystyle \|B\|'_{2}=\sum _{i}\sigma _{i}(B),}$

for any matrix ${\displaystyle B}$ where ${\displaystyle \sigma _{i}(B)}$ denote the singular values.
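These matrix dualities can be illustrated for real ${\displaystyle 2\times 2}$ matrices without a linear algebra library. The sketch below assumes the identities ${\displaystyle \sigma _{1}^{2}+\sigma _{2}^{2}=\|A\|_{\text{F}}^{2}}$ and ${\displaystyle \sigma _{1}\sigma _{2}=|\det A|}$, which hold in the ${\displaystyle 2\times 2}$ case only; all function names are hypothetical.

```python
import math


def frobenius_sq(A):
    """Squared Frobenius norm: sum of squared entries."""
    return sum(a * a for row in A for a in row)


def det2(A):
    """Determinant of a 2x2 matrix."""
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]


def singular_values_2x2(A):
    """Singular values (s1 >= s2 >= 0) of a real 2x2 matrix.

    Uses s1^2 + s2^2 = ||A||_F^2 and s1 * s2 = |det A|.
    """
    f2 = frobenius_sq(A)
    d = abs(det2(A))
    total = math.sqrt(f2 + 2.0 * d)          # s1 + s2
    gap = math.sqrt(max(f2 - 2.0 * d, 0.0))  # s1 - s2, clamped against rounding
    return (total + gap) / 2.0, (total - gap) / 2.0


def spectral_norm(A):
    """Largest singular value."""
    return singular_values_2x2(A)[0]


def nuclear_norm(A):
    """Sum of the singular values."""
    return sum(singular_values_2x2(A))


def trace_inner(Z, X):
    """tr(Z^T X), which equals the sum of entrywise products."""
    return sum(Z[i][j] * X[i][j] for i in range(2) for j in range(2))


Z = [[3.0, 0.0], [0.0, -4.0]]
X = [[1.0, 0.0], [0.0, -1.0]]  # a matrix with spectral norm 1

print(nuclear_norm(Z))    # 7.0
print(spectral_norm(X))   # 1.0
print(trace_inner(Z, X))  # 7.0, so this X attains the supremum
```

Here ${\displaystyle X}$ is chosen by hand for a diagonal ${\displaystyle Z}$; for general sizes one would compute singular values with a numerical library instead.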

## Some basic results about the operator norm

More generally, let ${\displaystyle X}$ and ${\displaystyle Y}$ be topological vector spaces, and ${\displaystyle L(X,Y)}$[6] be the collection of all bounded linear mappings (or operators) of ${\displaystyle X}$ into ${\displaystyle Y}$. In the case where ${\displaystyle X}$ and ${\displaystyle Y}$ are normed vector spaces, ${\displaystyle L(X,Y)}$ can be normed in a natural way.

Theorem 1. Let ${\displaystyle X}$ and ${\displaystyle Y}$ be normed spaces, and associate to each ${\displaystyle f\in L(X,Y)}$ the number:
${\displaystyle \|f\|=\sup\{\|f(x)\|:x\in X,\|x\|\leq 1\}.}$
This turns ${\displaystyle L(X,Y)}$ into a normed space. Moreover if ${\displaystyle Y}$ is a Banach space, so is ${\displaystyle L(X,Y)}$.[7]

Proof. A subset of a normed space is bounded if and only if it lies in some multiple of the unit ball; thus ${\displaystyle \|f\|<\infty }$ for every ${\displaystyle f\in L(X,Y)}$. If ${\displaystyle \alpha }$ is a scalar, then ${\displaystyle (\alpha f)(x)=\alpha \cdot fx}$, so that

${\displaystyle \|\alpha f\|=|\alpha |\|f\|.}$

The triangle inequality in ${\displaystyle Y}$ shows that

${\displaystyle {\begin{aligned}\|(f_{1}+f_{2})x\|&=\|f_{1}x+f_{2}x\|\\&\leq \|f_{1}x\|+\|f_{2}x\|\\&\leq (\|f_{1}\|+\|f_{2}\|)\|x\|\\&\leq \|f_{1}\|+\|f_{2}\|\end{aligned}}}$

for every ${\displaystyle x\in X}$ with ${\displaystyle \|x\|\leq 1}$. Thus

${\displaystyle \|f_{1}+f_{2}\|\leq \|f_{1}\|+\|f_{2}\|.}$

If ${\displaystyle f\neq 0}$, then ${\displaystyle fx\neq 0}$ for some ${\displaystyle x\in X}$; hence ${\displaystyle \|f\|>0}$. Thus, ${\displaystyle L(X,Y)}$ is a normed space.[8]

Assume now that ${\displaystyle Y}$ is complete, and that ${\displaystyle \{f_{n}\}}$ is a Cauchy sequence in ${\displaystyle L(X,Y)}$. Since

${\displaystyle \|f_{n}x-f_{m}x\|\leq \|f_{n}-f_{m}\|\|x\|}$

and it is assumed that ${\displaystyle \|f_{n}-f_{m}\|\to 0}$ as ${\displaystyle n,m\to \infty }$, ${\displaystyle \{f_{n}x\}}$ is a Cauchy sequence in ${\displaystyle Y}$ for every ${\displaystyle x\in X}$. Hence

${\displaystyle fx=\lim _{n\to \infty }f_{n}x}$

exists. It is clear that ${\displaystyle f:X\to Y}$ is linear. If ${\displaystyle \varepsilon >0}$, then ${\displaystyle \|f_{n}-f_{m}\|\|x\|\leq \varepsilon \|x\|}$ for sufficiently large n and m. Letting ${\displaystyle n\to \infty }$, it follows that

${\displaystyle \|fx-f_{m}x\|\leq \varepsilon \|x\|}$

for sufficiently large m. Hence ${\displaystyle \|fx\|\leq (\|f_{m}\|+\varepsilon )\|x\|}$, so that ${\displaystyle f\in L(X,Y)}$ and ${\displaystyle \|f-f_{m}\|\leq \varepsilon }$. Thus ${\displaystyle f_{m}\to f}$ in the norm of ${\displaystyle L(X,Y)}$. This establishes the completeness of ${\displaystyle L(X,Y).}$[9]

When ${\displaystyle Y}$ is a scalar field (i.e. ${\displaystyle Y=\mathbb {C} }$ or ${\displaystyle Y=\mathbb {R} }$), ${\displaystyle L(X,Y)}$ is the dual space ${\displaystyle X^{*}}$ of ${\displaystyle X}$.

Theorem 2. Suppose ${\displaystyle B}$ is the closed unit ball of a normed space ${\displaystyle X}$. For every ${\displaystyle x^{*}\in X^{*}}$ define:
${\displaystyle \|x^{*}\|=\sup\{|\langle {x,x^{*}}\rangle |:x\in B\}}$
Then
(a) This norm makes ${\displaystyle X^{*}}$ into a Banach space.[10]
(b) Let ${\displaystyle B^{*}}$ be the closed unit ball of ${\displaystyle X^{*}}$. For every ${\displaystyle x\in X}$,
${\displaystyle \|x\|=\sup\{|\langle {x,x^{*}}\rangle |:x^{*}\in B^{*}\}.}$
Consequently, ${\displaystyle x^{*}\mapsto \langle {x,x^{*}}\rangle }$ is a bounded linear functional on ${\displaystyle X^{*}}$ of norm ${\displaystyle \|x\|}$.
(c) ${\displaystyle B^{*}}$ is weak*-compact.

Proof. Since ${\displaystyle L(X,Y)=X^{*}}$, when ${\displaystyle Y}$ is the scalar field, (a) is a corollary of Theorem 1. Fix ${\displaystyle x\in X}$. There exists[11] ${\displaystyle y^{*}\in B^{*}}$ such that

${\displaystyle \langle {x,y^{*}}\rangle =\|x\|.}$

On the other hand,

${\displaystyle |\langle {x,x^{*}}\rangle |\leq \|x\|\|x^{*}\|\leq \|x\|}$

for every ${\displaystyle x^{*}\in B^{*}}$. (b) follows from the above. Since the open unit ball ${\displaystyle U}$ of ${\displaystyle X}$ is dense in ${\displaystyle B}$, the definition of ${\displaystyle \|x^{*}\|}$ shows that ${\displaystyle x^{*}\in B^{*}}$ if and only if ${\displaystyle |\langle {x,x^{*}}\rangle |\leq 1}$ for every ${\displaystyle x\in U}$. The proof for (c)[12] now follows directly.[13]

## Notes

1. Rudin 1991, p. 87
2. Rudin 1991, section 4.5, p. 95
3. Rudin 1991, p. 95
4. This inequality is tight, in the following sense: for any x there is a z for which the inequality holds with equality. (Similarly, for any z there is an x that gives equality.)
5. Each ${\displaystyle L(X,Y)}$ is a vector space, with the usual definitions of addition and scalar multiplication of functions; this only depends on the vector space structure of ${\displaystyle Y}$, not ${\displaystyle X}$.
6. Rudin 1991, p. 92
7. Rudin 1991, p. 93
8. Rudin 1991, p. 93
9. Aliprantis 2005, p. 230
10. Rudin 1991, Theorem 3.3 Corollary, p. 59
11. Rudin 1991, Theorem 3.15 The Banach–Alaoglu theorem, p. 68
12. Rudin 1991, p. 94

## References

• Aliprantis, Charalambos D.; Border, Kim C. (2007). Infinite Dimensional Analysis: A Hitchhiker's Guide (3rd ed.). Springer. ISBN 9783540326960.
• Boyd, Stephen; Vandenberghe, Lieven (2004). Convex Optimization. Cambridge University Press. ISBN 9780521833783.
• Kolmogorov, A.N.; Fomin, S.V. (1957). Elements of the Theory of Functions and Functional Analysis, Volume 1: Metric and Normed Spaces. Rochester: Graylock Press.
• Rudin, Walter (1991), Functional analysis, McGraw-Hill Science, ISBN 978-0-07-054236-5.