# Evidence lower bound

In statistics, the evidence lower bound (ELBO, also called the variational lower bound or negative variational free energy) is the quantity optimized in variational Bayesian methods. These methods handle cases where a distribution ${\displaystyle Q}$ over unobserved variables ${\displaystyle \mathbf {Z} }$ is optimized as an approximation to the true posterior ${\displaystyle P(\mathbf {Z} |\mathbf {X} )}$, given observed data ${\displaystyle \mathbf {X} }$. The evidence lower bound is then defined as [1]:

${\displaystyle L=H(Q)-H(Q;P(\mathbf {X} ,\mathbf {Z} ))=\sum _{\mathbf {Z} }Q(\mathbf {Z} )\log P(\mathbf {X} ,\mathbf {Z} )-\sum _{\mathbf {Z} }Q(\mathbf {Z} )\log Q(\mathbf {Z} )}$

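where ${\displaystyle H(Q)}$ is the entropy of ${\displaystyle Q}$ and ${\displaystyle H(Q;P(\mathbf {X} ,\mathbf {Z} ))}$ is the cross entropy. Maximizing the evidence lower bound minimizes ${\displaystyle D_{\mathrm {KL} }(Q\parallel P)}$, the Kullback–Leibler divergence, a measure of the dissimilarity of ${\displaystyle Q}$ from the true posterior. This follows from the identity ${\displaystyle \log P(\mathbf {X} )=L+D_{\mathrm {KL} }(Q\parallel P)}$: the log evidence ${\displaystyle \log P(\mathbf {X} )}$ does not depend on ${\displaystyle Q}$, so any increase in ${\displaystyle L}$ is matched by an equal decrease in the divergence, and because the divergence is non-negative, ${\displaystyle L}$ is a lower bound on the log evidence. The primary reason this quantity is preferred for optimization is that it requires only the joint distribution ${\displaystyle P(\mathbf {X} ,\mathbf {Z} )}$ and not the posterior itself, provided ${\displaystyle Q}$ is chosen so that the expectations under ${\displaystyle Q}$ can be evaluated.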
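The relationship between the evidence lower bound, the Kullback–Leibler divergence, and the log evidence can be checked numerically on a small discrete example. The following is a minimal sketch, not a reference implementation: the values in `joint` and `q` are made up for illustration, and the helper names `elbo` and `kl_to_posterior` are hypothetical. It assumes a latent variable with three states and a single fixed observation, so that `joint[z]` stands for ${\displaystyle P(\mathbf {X} ,\mathbf {Z} =z)}$.

```python
import math

# Hypothetical toy model: Z has three states and X is fixed at one observed value.
# joint[z] plays the role of P(X = x_obs, Z = z); the numbers are illustrative only.
joint = [0.10, 0.25, 0.05]

# An arbitrary approximating distribution Q(Z); it must sum to 1.
q = [0.2, 0.5, 0.3]


def elbo(q, joint):
    """Evidence lower bound: sum_z Q(z) log P(x, z) - sum_z Q(z) log Q(z)."""
    return sum(qz * (math.log(pz) - math.log(qz))
               for qz, pz in zip(q, joint) if qz > 0)


def kl_to_posterior(q, joint):
    """KL(Q || P(Z | X)), with the posterior obtained by normalizing the joint."""
    evidence = sum(joint)
    return sum(qz * math.log(qz / (pz / evidence))
               for qz, pz in zip(q, joint) if qz > 0)


log_evidence = math.log(sum(joint))   # log P(X), independent of Q
bound = elbo(q, joint)                # uses only the joint, never the posterior
gap = kl_to_posterior(q, joint)       # needs the posterior; shown only for the check

# The exact posterior, used to show that the bound becomes tight when Q matches it.
posterior = [pz / sum(joint) for pz in joint]

print("log evidence:", log_evidence)
print("ELBO:        ", bound)
print("KL gap:      ", gap)
print("ELBO at Q = posterior:", elbo(posterior, joint))
```

In this sketch `elbo` never touches the posterior, which mirrors why the bound is convenient to optimize. The sum `bound + gap` should agree with `log_evidence` up to floating-point error, and evaluating `elbo` at the exact posterior should recover the log evidence.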

For other measures of dissimilarity that can be optimized to fit ${\displaystyle Q}$, see Divergence (statistics)[2].

## References

1. Yang, Xitong. "Understanding the Variational Lower Bound" (PDF). Institute for Advanced Computer Studies. University of Maryland. Retrieved 20 March 2018.
2. Minka, Thomas (2005). "Divergence measures and message passing" (PDF).
3. Bishop, Christopher M. (2006). "10.1 Variational Inference". Pattern Recognition and Machine Learning (PDF).