# Poisson binomial distribution

In probability theory and statistics, the Poisson binomial distribution is the discrete probability distribution of a sum of independent Bernoulli trials that are not necessarily identically distributed. The concept is named after Siméon Denis Poisson.

- Parameters: ${\displaystyle \mathbf {p} \in [0,1]^{n}}$ — success probabilities for each of the n trials
- Support: k ∈ { 0, 1, …, n }
- PMF: ${\displaystyle \sum \limits _{A\in F_{k}}\prod \limits _{i\in A}p_{i}\prod \limits _{j\in A^{c}}(1-p_{j})}$
- CDF: ${\displaystyle \sum \limits _{l=0}^{k}\sum \limits _{A\in F_{l}}\prod \limits _{i\in A}p_{i}\prod \limits _{j\in A^{c}}(1-p_{j})}$
- Mean: ${\displaystyle \mu =\sum \limits _{i=1}^{n}p_{i}}$
- Variance: ${\displaystyle \sigma ^{2}=\sum \limits _{i=1}^{n}(1-p_{i})p_{i}}$
- Skewness: ${\displaystyle {\frac {1}{\sigma ^{3}}}\sum \limits _{i=1}^{n}(1-2p_{i})(1-p_{i})p_{i}}$
- Excess kurtosis: ${\displaystyle {\frac {1}{\sigma ^{4}}}\sum \limits _{i=1}^{n}(1-6(1-p_{i})p_{i})(1-p_{i})p_{i}}$
- MGF: ${\displaystyle \prod \limits _{j=1}^{n}(1-p_{j}+p_{j}e^{t})}$
- CF: ${\displaystyle \prod \limits _{j=1}^{n}(1-p_{j}+p_{j}e^{it})}$

In other words, it is the probability distribution of the number of successes in a sequence of n independent yes/no experiments with success probabilities ${\displaystyle p_{1},p_{2},\dots ,p_{n}}$. The ordinary binomial distribution is a special case of the Poisson binomial distribution, when all success probabilities are the same, that is ${\displaystyle p_{1}=p_{2}=\cdots =p_{n}}$.

## Mean and variance

Since a Poisson binomial distributed variable is a sum of n independent Bernoulli distributed variables, its mean and variance are simply the sums of the means and variances of those n Bernoulli distributions:

${\displaystyle \mu =\sum \limits _{i=1}^{n}p_{i}}$
${\displaystyle \sigma ^{2}=\sum \limits _{i=1}^{n}(1-p_{i})p_{i}}$

For fixed values of the mean (${\displaystyle \mu }$) and size (n), the variance is maximal when all success probabilities are equal, in which case the distribution is binomial. When the mean is fixed, the variance is bounded from above by the variance of the Poisson distribution with the same mean, which is attained asymptotically as n tends to infinity.
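These moments follow directly from the success probabilities; a minimal sketch in Python (the probabilities below are illustrative, not from the text):

```python
# Success probabilities for four hypothetical Bernoulli trials.
p = [0.1, 0.4, 0.5, 0.8]

# Mean: sum of the p_i; variance: sum of p_i * (1 - p_i).
mu = sum(p)
var = sum(pi * (1 - pi) for pi in p)

print(mu, var)
```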

## Probability mass function

The probability of having k successful trials out of a total of n can be written as the sum [1]

${\displaystyle \Pr(K=k)=\sum \limits _{A\in F_{k}}\prod \limits _{i\in A}p_{i}\prod \limits _{j\in A^{c}}(1-p_{j})}$

where ${\displaystyle F_{k}}$ is the set of all subsets of k integers that can be selected from {1,2,3,...,n}. For example, if n = 3, then ${\displaystyle F_{2}=\left\{\{1,2\},\{1,3\},\{2,3\}\right\}}$. ${\displaystyle A^{c}}$ is the complement of ${\displaystyle A}$, i.e. ${\displaystyle A^{c}=\{1,2,3,\dots ,n\}\setminus A}$.

${\displaystyle F_{k}}$ will contain ${\displaystyle n!/((n-k)!\,k!)}$ elements, so the sum is infeasible to compute in practice unless the number of trials n is small (e.g. if n = 30, ${\displaystyle F_{15}}$ contains over ${\displaystyle 10^{8}}$ elements). However, there are other, more efficient ways to calculate ${\displaystyle \Pr(K=k)}$.
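The enumeration over ${\displaystyle F_{k}}$ can be sketched directly in Python for small n (function name and example values are illustrative):

```python
from itertools import combinations
from math import prod

def pmf_enumeration(k, p):
    """Pr(K = k) by summing over every k-subset A of trial indices:
    each term is prod_{i in A} p_i * prod_{j not in A} (1 - p_j).
    Exponential in n, so practical only for small n."""
    n = len(p)
    return sum(prod(p[i] if i in A else 1 - p[i] for i in range(n))
               for A in combinations(range(n), k))

# Example: two trials with success probabilities 0.3 and 0.6.
print([pmf_enumeration(k, [0.3, 0.6]) for k in range(3)])
```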

As long as none of the success probabilities are equal to one, one can calculate the probability of k successes using the recursive formula [2] [3]

${\displaystyle \Pr(K=k)={\begin{cases}\prod \limits _{i=1}^{n}(1-p_{i})&k=0\\{\frac {1}{k}}\sum \limits _{i=1}^{k}(-1)^{i-1}\Pr(K=k-i)T(i)&k>0\\\end{cases}}}$

where

${\displaystyle T(i)=\sum \limits _{j=1}^{n}\left({\frac {p_{j}}{1-p_{j}}}\right)^{i}.}$
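A direct sketch of this recursion in Python (the function name is illustrative; as the formula requires, no ${\displaystyle p_{i}}$ may equal one):

```python
from math import prod

def pmf_recursive(p):
    """Full PMF [Pr(K=0), ..., Pr(K=n)] via the recursion above.
    Assumes no p_i equals 1; numerically unstable for large n."""
    n = len(p)
    # T(i) = sum_j (p_j / (1 - p_j))^i, for i = 1..n.
    t = [sum((pj / (1 - pj)) ** i for pj in p) for i in range(1, n + 1)]
    pr = [prod(1 - pj for pj in p)]  # Pr(K = 0)
    for k in range(1, n + 1):
        pr.append(sum((-1) ** (i - 1) * pr[k - i] * t[i - 1]
                      for i in range(1, k + 1)) / k)
    return pr
```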

The recursive formula is not numerically stable, and should be avoided if ${\displaystyle n}$ is greater than approximately 20. Another possibility is using the discrete Fourier transform.[4]

${\displaystyle \Pr(K=k)={\frac {1}{n+1}}\sum \limits _{l=0}^{n}C^{-lk}\prod \limits _{m=1}^{n}\left(1+(C^{l}-1)p_{m}\right)}$

where ${\displaystyle C=\exp \left({\frac {2i\pi }{n+1}}\right)}$ and ${\displaystyle i={\sqrt {-1}}}$.
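The Fourier formula can be sketched as follows in Python (function name illustrative; written as a plain O(n²) double loop rather than an FFT):

```python
import cmath

def pmf_dft(p):
    """Full PMF of the Poisson binomial via the DFT formula above."""
    n = len(p)
    c = cmath.exp(2j * cmath.pi / (n + 1))
    # Product over m of (1 + (C^l - 1) p_m), for each l = 0..n.
    prods = []
    for l in range(n + 1):
        z = 1 + 0j
        for pm in p:
            z *= 1 + (c ** l - 1) * pm
        prods.append(z)
    # Pr(K = k) = (1/(n+1)) * sum_l C^{-lk} * prods[l].
    return [(sum(c ** (-l * k) * prods[l] for l in range(n + 1))
             / (n + 1)).real  # imaginary parts cancel
            for k in range(n + 1)]
```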

Still other methods are described in [5].

## Entropy

There is no simple formula for the entropy of a Poisson binomial distribution, but the entropy is bounded above by the entropy of a binomial distribution with the same number parameter and the same mean. Therefore, the entropy is also bounded above by the entropy of a Poisson distribution with the same mean.[6]
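This entropy bound can be checked numerically for a small example; the sketch below (illustrative probabilities, natural-log entropy) compares the Poisson binomial entropy with that of a binomial distribution having the same n and mean:

```python
from itertools import combinations
from math import comb, log, prod

def entropy(dist):
    """Shannon entropy in nats."""
    return -sum(q * log(q) for q in dist if q > 0)

# Poisson binomial PMF by direct enumeration (fine for small n).
p = [0.2, 0.5, 0.9]
n = len(p)
pb = [sum(prod(p[i] if i in A else 1 - p[i] for i in range(n))
          for A in combinations(range(n), k))
      for k in range(n + 1)]

# Binomial distribution with the same n and the same mean (q = mean of p).
q = sum(p) / n
binom = [comb(n, k) * q**k * (1 - q)**(n - k) for k in range(n + 1)]

# The Poisson binomial entropy is bounded above by the binomial entropy.
assert entropy(pb) <= entropy(binom)
```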

The Shepp–Olkin concavity conjecture, due to Lawrence Shepp and Ingram Olkin in 1981, states that the entropy of a Poisson binomial distribution is a concave function of the success probabilities ${\displaystyle p_{1},p_{2},\dots ,p_{n}}$.[7] This conjecture was proved by Erwan Hillion and Oliver Johnson in 2015.[8] The Shepp–Olkin monotonicity conjecture, from the same 1981 paper, states that the entropy is monotone increasing in ${\displaystyle p_{i}}$ if all ${\displaystyle p_{i}\leq 1/2}$. This conjecture was also proved by Hillion and Johnson, in 2019.[9]

## Chernoff bound

The probability that a Poisson binomial distributed variable ${\textstyle S=\sum _{i}X_{i}}$ is large can be bounded using its moment generating function:

${\displaystyle {\begin{aligned}\Pr[S\geq s]&\leq \exp(-st)\operatorname {E} \left[\exp \left[t\sum _{i}X_{i}\right]\right]\\&=\exp(-st)\prod _{i}(1-p_{i}+e^{t}p_{i})\\&=\exp \left(-st+\sum _{i}\log \left(p_{i}(e^{t}-1)+1\right)\right)\\&\leq \exp \left(-st+\sum _{i}\log \left(\exp(p_{i}(e^{t}-1))\right)\right)\\&=\exp \left(-st+\sum _{i}p_{i}(e^{t}-1)\right)\\&=\exp \left(s-\mu -s\log {\frac {s}{\mu }}\right),\end{aligned}}}$

where we took ${\textstyle t=\log \left(s\left/\sum _{i}p_{i}\right.\right)}$, which is nonnegative provided ${\displaystyle s\geq \mu }$, as the first inequality requires. This is similar to the tail bounds of a binomial distribution.
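The bound can be sanity-checked against exact tail probabilities for small n; a sketch with illustrative values (valid for ${\displaystyle s\geq \mu }$):

```python
from itertools import combinations
from math import exp, log, prod

p = [0.1, 0.2, 0.3, 0.4]   # illustrative success probabilities
n, mu = len(p), sum(p)

def tail(s):
    """Exact Pr(S >= s) by enumerating all outcomes (small n only)."""
    return sum(prod(p[i] if i in A else 1 - p[i] for i in range(n))
               for k in range(s, n + 1)
               for A in combinations(range(n), k))

def chernoff(s):
    """The bound exp(s - mu - s*log(s/mu)) derived above, for s >= mu."""
    return exp(s - mu - s * log(s / mu))

# The exact tail never exceeds the bound.
for s in (2, 3, 4):
    assert tail(s) <= chernoff(s)
```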

## References

1. Wang, Y. H. (1993). "On the number of successes in independent trials" (PDF). Statistica Sinica. 3 (2): 295–312.
2. Shah, B. K. (1973). "On the distribution of the sum of independent integer valued random variables". American Statistician. 27 (3): 123–124. JSTOR 2683639.
3. Chen, X. H.; A. P. Dempster; J. S. Liu (1994). "Weighted finite population sampling to maximize entropy" (PDF). Biometrika. 81 (3): 457. doi:10.1093/biomet/81.3.457.
4. Fernandez, M.; S. Williams (2010). "Closed-Form Expression for the Poisson-Binomial Probability Density Function". IEEE Transactions on Aerospace and Electronic Systems. 46 (2): 803–817. Bibcode:2010ITAES..46..803F. doi:10.1109/TAES.2010.5461658.
5. Chen, S. X.; J. S. Liu (1997). "Statistical Applications of the Poisson-Binomial and conditional Bernoulli distributions". Statistica Sinica. 7: 875–892.
6. Harremoës, P. (2001). "Binomial and Poisson distributions as maximum entropy distributions" (PDF). IEEE Transactions on Information Theory. 47 (5): 2039–2041. doi:10.1109/18.930936.
7. Shepp, Lawrence; Olkin, Ingram (1981). "Entropy of the sum of independent Bernoulli random variables and of the multinomial distribution". In Gani, J.; Rohatgi, V.K. (eds.). Contributions to probability: A collection of papers dedicated to Eugene Lukacs. New York: Academic Press. pp. 201–206. ISBN 0-12-274460-8. MR 0618689.
8. Hillion, Erwan; Johnson, Oliver (2015-03-05). "A proof of the Shepp-Olkin entropy concavity conjecture". Bernoulli. 23: 3638–3649. arXiv:1503.01570. doi:10.3150/16-BEJ860.
9. Hillion, Erwan; Johnson, Oliver (2019-11-09). "A proof of the Shepp-Olkin entropy monotonicity conjecture". Electronic Journal of Probability. 24 (126): 1–14. doi:10.1214/19-EJP380.