Inclusion–exclusion principle
In combinatorics (combinatorial mathematics), the inclusion–exclusion principle is a counting technique which generalizes the familiar method of obtaining the number of elements in the union of two finite sets; symbolically expressed as
 $$|A \cup B| = |A| + |B| - |A \cap B|,$$
where A and B are two finite sets and |S| indicates the cardinality of a set S (which may be considered as the number of elements of the set, if the set is finite). The formula expresses the fact that the sum of the sizes of the two sets may be too large since some elements may be counted twice. The double-counted elements are those in the intersection of the two sets and the count is corrected by subtracting the size of the intersection.
The principle is more clearly seen in the case of three sets, which for the sets A, B and C is given by
 $$|A \cup B \cup C| = |A| + |B| + |C| - |A \cap B| - |A \cap C| - |B \cap C| + |A \cap B \cap C|.$$
This formula can be verified by counting how many times each region in the Venn diagram figure is included in the right-hand side of the formula. In this case, when removing the contributions of over-counted elements, the number of elements in the mutual intersection of the three sets has been subtracted too often, so must be added back in to get the correct total.
Generalizing the results of these examples gives the principle of inclusion–exclusion. To find the cardinality of the union of n sets:
 Include the cardinalities of the sets.
 Exclude the cardinalities of the pairwise intersections.
 Include the cardinalities of the triple-wise intersections.
 Exclude the cardinalities of the quadruple-wise intersections.
 Include the cardinalities of the quintuple-wise intersections.
 Continue, until the cardinality of the n-tuple-wise intersection is included (if n is odd) or excluded (n even).
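The include/exclude steps above can be sketched in Python. This is a minimal illustration rather than a reference implementation; the function name `union_size` and the sample sets are assumptions made for the example.

```python
from itertools import combinations

def union_size(sets):
    """Size of the union of finite sets, computed by inclusion-exclusion."""
    total = 0
    for k in range(1, len(sets) + 1):
        # Include the k-wise intersections when k is odd, exclude them when k is even.
        sign = 1 if k % 2 == 1 else -1
        for group in combinations(sets, k):
            common = set(group[0]).intersection(*group[1:])
            total += sign * len(common)
    return total

# Hypothetical sample sets, chosen only for illustration.
A = {1, 2, 3, 4}
B = {3, 4, 5}
C = {4, 5, 6, 7}
print(union_size([A, B, C]), len(A | B | C))  # both values agree
```

The loop visits every non-empty subfamily of the sets, so the running time grows as 2^n; the sketch is meant to mirror the procedure, not to be an efficient way of computing a union.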
The name comes from the idea that the principle is based on over-generous inclusion, followed by compensating exclusion. This concept is attributed to Abraham de Moivre (1718);[1] but it first appears in a paper of Daniel da Silva (1854),[2] and later in a paper by J. J. Sylvester (1883).[3] Because of these publications, the principle is sometimes referred to as the formula of Da Silva or of Sylvester. The principle is an example of the sieve method extensively used in number theory and is sometimes referred to as the sieve formula,[4] though Legendre already used a similar device in a sieve context in 1808.
As finite probabilities are computed as counts relative to the cardinality of the probability space, the formulas for the principle of inclusion–exclusion remain valid when the cardinalities of the sets are replaced by finite probabilities. More generally, both versions of the principle can be put under the common umbrella of measure theory.
In a very abstract setting, the principle of inclusion–exclusion can be expressed as the calculation of the inverse of a certain matrix.[5] This inverse has a special structure, making the principle an extremely valuable technique in combinatorics and related areas of mathematics. As Gian-Carlo Rota put it:[6]
"One of the most useful principles of enumeration in discrete probability and combinatorial theory is the celebrated principle of inclusion–exclusion. When skillfully applied, this principle has yielded the solution to many a combinatorial problem."
Statement
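In its general form, the principle of inclusion–exclusion states that for finite sets A_{1}, ..., A_{n}, one has the identity
 $$\left|\bigcup_{i=1}^n A_i\right| = \sum_{i=1}^n |A_i| - \sum_{1 \le i < j \le n} |A_i \cap A_j| + \sum_{1 \le i < j < k \le n} |A_i \cap A_j \cap A_k| - \cdots + (-1)^{n+1} |A_1 \cap \cdots \cap A_n|. \qquad (1)$$
This can be compactly written as
 $$\left|\bigcup_{i=1}^n A_i\right| = \sum_{k=1}^n (-1)^{k+1} \left( \sum_{1 \le i_1 < \cdots < i_k \le n} |A_{i_1} \cap \cdots \cap A_{i_k}| \right)$$
or
 $$\left|\bigcup_{i=1}^n A_i\right| = \sum_{\emptyset \ne J \subseteq \{1, \ldots, n\}} (-1)^{|J|+1} \left|\bigcap_{j \in J} A_j\right|.$$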
In words, to count the number of elements in a finite union of finite sets, first sum the cardinalities of the individual sets, then subtract the cardinalities of all pairwise intersections (correcting for elements that appear in at least two sets), then add back the cardinalities of all triple-wise intersections (elements that appear in at least three sets), then subtract the cardinalities of all quadruple-wise intersections, and so on. This process always ends since no element can appear in more sets than there are in the union. (For example, if n = 4, there can be no elements that appear in more than 4 sets; equivalently, there can be no elements that appear in at least 5 sets.)
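In applications it is common to see the principle expressed in its complementary form. That is, letting S be a finite universal set containing all of the A_{i} and letting $\overline{A_i}$ denote the complement of A_{i} in S, by De Morgan's laws we have
 $$\left|\bigcap_{i=1}^n \overline{A_i}\right| = \left|S \setminus \bigcup_{i=1}^n A_i\right| = |S| - \left|\bigcup_{i=1}^n A_i\right|.$$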
As another variant of the statement, let P_{1}, ..., P_{n} be a list of properties that elements of a set S may or may not have, then the principle of inclusion–exclusion provides a way to calculate the number of elements of S which have none of the properties. Just let A_{i} be the subset of elements of S which have the property P_{i} and use the principle in its complementary form. This variant is due to J. J. Sylvester.[1]
Notice that if you take into account only the first m<n sums on the right (in the general form of the principle), then you will get an overestimate if m is odd and an underestimate if m is even.
Examples
Counting integers
As a simple example of the use of the principle of inclusion–exclusion, consider the question:[7]
 How many integers in {1,...,100} are not divisible by 2, 3 or 5?
Let S = {1,...,100} and P_{1} the property that an integer is divisible by 2, P_{2} the property that an integer is divisible by 3 and P_{3} the property that an integer is divisible by 5. Letting A_{i} be the subset of S whose elements have property P_{i} we have by elementary counting: |A_{1}| = 50, |A_{2}| = 33, and |A_{3}| = 20. There are 16 of these integers divisible by 6, 10 divisible by 10, and 6 divisible by 15. Finally, there are just 3 integers divisible by 30, so the number of integers not divisible by any of 2, 3 or 5 is given by:
 100 − (50 + 33 + 20) + (16 + 10 + 6) − 3 = 26.
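The same count can be reproduced with a short Python sketch of the complementary form. The function name `count_not_divisible` is an assumption for illustration, and the shortcut of dividing by the product of the divisors is only valid because 2, 3 and 5 are pairwise coprime.

```python
from itertools import combinations
from math import prod

def count_not_divisible(limit, divisors):
    """Count integers in 1..limit divisible by none of the pairwise coprime divisors."""
    total = 0
    for k in range(len(divisors) + 1):
        for group in combinations(divisors, k):
            # For pairwise coprime divisors, the common multiples are the
            # multiples of the product; the empty product is 1 (the whole set S).
            total += (-1) ** k * (limit // prod(group))
    return total

# Direct enumeration, used only as a cross-check.
brute = sum(1 for x in range(1, 101) if all(x % d for d in (2, 3, 5)))
print(count_not_divisible(100, (2, 3, 5)), brute)  # prints "26 26"
```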
Counting derangements
A more complex example is the following.
Suppose there is a deck of n cards numbered from 1 to n. Suppose a card numbered m is in the correct position if it is the mth card in the deck. How many ways, W, can the cards be shuffled with at least 1 card being in the correct position?
Begin by defining set A_{m}, which is all of the orderings of cards with the mth card correct. Then the number of orders, W, with at least one card being in the correct position, m, is
 $$W = \left|\bigcup_{m=1}^n A_m\right|.$$
Apply the principle of inclusion–exclusion,
 $$W = \sum_{m_1 = 1}^n |A_{m_1}| - \sum_{1 \le m_1 < m_2 \le n} |A_{m_1} \cap A_{m_2}| + \cdots + (-1)^{p-1} \sum_{1 \le m_1 < \cdots < m_p \le n} |A_{m_1} \cap \cdots \cap A_{m_p}| + \cdots$$
Each value $A_{m_1} \cap \cdots \cap A_{m_p}$ represents the set of shuffles having at least p values m_{1}, ..., m_{p} in the correct position. Note that the number of shuffles with at least p values correct only depends on p, not on the particular values of m. For example, the number of shuffles having the 1st, 3rd, and 17th cards in the correct position is the same as the number of shuffles having the 2nd, 5th, and 13th cards in the correct positions. It only matters that of the n cards, 3 were chosen to be in the correct position. Thus there are $\binom{n}{p}$ equal terms in the pth summation (see combination).
 $$W = \binom{n}{1} |A_1| - \binom{n}{2} |A_1 \cap A_2| + \cdots + (-1)^{p-1} \binom{n}{p} |A_1 \cap \cdots \cap A_p| + \cdots$$
$|A_1 \cap \cdots \cap A_p|$ is the number of orderings having p elements in the correct position, which is equal to the number of ways of ordering the remaining n − p elements, or (n − p)!. Thus we finally get:
 $$W = \sum_{p=1}^n (-1)^{p-1} \binom{n}{p} (n-p)! = \sum_{p=1}^n (-1)^{p-1} \frac{n!}{p!}.$$
A permutation where no card is in the correct position is called a derangement. Taking n! to be the total number of permutations, the probability Q that a random shuffle produces a derangement is given by
 $$Q = 1 - \frac{W}{n!} = \sum_{p=0}^n \frac{(-1)^p}{p!},$$
a truncation to n + 1 terms of the Taylor expansion of e^{−1}. Thus the probability of guessing an order for a shuffled deck of cards and being incorrect about every card is approximately e^{−1} or 37%.
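As a sanity check on this derivation, the following Python sketch evaluates W from the alternating sum, derives the number of derangements as n! − W, and compares it with a brute-force enumeration for small n. The function names are illustrative assumptions, not part of the source.

```python
from itertools import permutations
from math import comb, exp, factorial

def at_least_one_fixed(n):
    """W: shuffles of n cards with at least one card in its correct position."""
    return sum((-1) ** (p - 1) * comb(n, p) * factorial(n - p) for p in range(1, n + 1))

def derangements(n):
    """Shuffles with no card in its correct position, i.e. n! - W."""
    return factorial(n) - at_least_one_fixed(n)

def derangements_brute(n):
    """Direct enumeration, feasible only for small n."""
    return sum(1 for perm in permutations(range(n)) if all(perm[i] != i for i in range(n)))

for n in range(1, 7):
    q = derangements(n) / factorial(n)  # probability Q of a derangement
    print(n, derangements(n), derangements_brute(n), round(q, 5))
print(round(exp(-1), 5))  # the limiting value of Q
```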
A special case
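The situation that appears in the derangement example above occurs often enough to merit special attention.[8] Namely, when the size of the intersection sets appearing in the formulas for the principle of inclusion–exclusion depends only on the number of sets in the intersections and not on which sets appear. More formally, if the intersection
 $$A_J := \bigcap_{j \in J} A_j$$
has the same cardinality, say α_{k} = |A_{J}|, for every k-element subset J of {1, ..., n}, then
 $$\left|\bigcup_{i=1}^n A_i\right| = \sum_{k=1}^n (-1)^{k-1} \binom{n}{k} \alpha_k.$$
Or, in the complementary form, where the universal set S has cardinality α_{0},
 $$\left|S \setminus \bigcup_{i=1}^n A_i\right| = \alpha_0 - \sum_{k=1}^n (-1)^{k-1} \binom{n}{k} \alpha_k = \sum_{k=0}^n (-1)^k \binom{n}{k} \alpha_k.$$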
A generalization
Given a family (repeats allowed) of subsets A_{1}, A_{2}, ..., A_{n} of a universal set S, the principle of inclusion–exclusion calculates the number of elements of S in none of these subsets. A generalization of this concept would calculate the number of elements of S which appear in exactly some fixed m of these sets.
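Let N = [n] = {1,2,...,n}. If we define $A_{\emptyset} = S$, then the principle of inclusion–exclusion can be written, using the notation of the previous section, as follows: the number of elements of S contained in none of the A_{i} is
 $$\sum_{J \subseteq [n]} (-1)^{|J|} |A_J|.$$
If I is a fixed subset of the index set N, then the number of elements which belong to A_{i} for all i in I and for no other values is:[9]
 $$\sum_{I \subseteq J \subseteq N} (-1)^{|J| - |I|} |A_J|.$$
Define the sets
 $$B_k = A_{I \cup \{k\}} \quad \text{for } k \in N \setminus I.$$
We seek the number of elements in none of the B_{k} which, by the principle of inclusion–exclusion (with $B_{\emptyset} = A_I$), is
 $$\sum_{K \subseteq N \setminus I} (-1)^{|K|} |B_K|.$$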
The correspondence K ↔ J = I ∪ K between subsets of N \ I and subsets of N containing I is a bijection and if J and K correspond under this map then B_{K} = A_{J}, showing that the result is valid.
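The generalized count can be sketched in Python as follows. The function name `count_exactly`, the use of 0-based indices for the sets, and the sample data are assumptions made for this illustration; the sum runs over the subsets K of N \ I with sign (−1)^{|K|}, and the empty intersection is taken to be the universal set S.

```python
from itertools import combinations

def count_exactly(universe, sets, I):
    """Number of elements lying in sets[i] for every i in I and in no other set.

    Indices are 0-based here (an assumption of this sketch).
    """
    I = frozenset(I)
    others = [k for k in range(len(sets)) if k not in I]
    total = 0
    for size in range(len(others) + 1):
        for K in combinations(others, size):
            # A_J for J = I ∪ K; the empty intersection is the universe S.
            A_J = set(universe)
            for j in I | set(K):
                A_J &= sets[j]
            total += (-1) ** size * len(A_J)
    return total

# Hypothetical sample data.
S = set(range(1, 11))
family = [{1, 2, 3, 4}, {3, 4, 5}, {4, 5, 6, 7}]
print(count_exactly(S, family, {0}))    # elements in the first set only
print(count_exactly(S, family, set()))  # elements in none of the sets
```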
In probability
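In probability, for events A_{1}, ..., A_{n} in a probability space $(\Omega, \mathcal{F}, \mathbb{P})$, the inclusion–exclusion principle becomes for n = 2
 $$\mathbb{P}(A_1 \cup A_2) = \mathbb{P}(A_1) + \mathbb{P}(A_2) - \mathbb{P}(A_1 \cap A_2),$$
for n = 3
 $$\mathbb{P}(A_1 \cup A_2 \cup A_3) = \mathbb{P}(A_1) + \mathbb{P}(A_2) + \mathbb{P}(A_3) - \mathbb{P}(A_1 \cap A_2) - \mathbb{P}(A_1 \cap A_3) - \mathbb{P}(A_2 \cap A_3) + \mathbb{P}(A_1 \cap A_2 \cap A_3),$$
and in general
 $$\mathbb{P}\left(\bigcup_{i=1}^n A_i\right) = \sum_{i=1}^n \mathbb{P}(A_i) - \sum_{i<j} \mathbb{P}(A_i \cap A_j) + \sum_{i<j<k} \mathbb{P}(A_i \cap A_j \cap A_k) - \cdots + (-1)^{n-1} \mathbb{P}\left(\bigcap_{i=1}^n A_i\right),$$
which can be written in closed form as
 $$\mathbb{P}\left(\bigcup_{i=1}^n A_i\right) = \sum_{k=1}^n \left( (-1)^{k-1} \sum_{I \subseteq \{1, \ldots, n\},\ |I| = k} \mathbb{P}(A_I) \right),$$
where the last sum runs over all subsets I of the indices 1, ..., n which contain exactly k elements, and
 $$A_I := \bigcap_{i \in I} A_i$$
denotes the intersection of all those A_{i} with index in I.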
According to the Bonferroni inequalities, the sum of the first m terms in the formula is alternately an upper bound (m odd) and a lower bound (m even) for the left-hand side. This can be used in cases where the full formula is too cumbersome.
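The alternating bounds can be observed numerically. The sketch below assumes a finite sample space with the uniform probability measure, so that probabilities are ratios of counts; the function name `partial_sums` and the sample events are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def partial_sums(events, omega_size):
    """Successive partial sums of the inclusion-exclusion formula.

    Assumes the uniform measure on a finite sample space of size omega_size.
    """
    sums, running = [], Fraction(0)
    for k in range(1, len(events) + 1):
        term = sum(
            Fraction(len(set(group[0]).intersection(*group[1:])), omega_size)
            for group in combinations(events, k)
        )
        running += (-1) ** (k - 1) * term
        sums.append(running)
    return sums

# Hypothetical events in a sample space of 12 equally likely outcomes.
events = [{0, 1, 2, 3, 4, 5}, {3, 4, 5, 6, 7}, {5, 6, 7, 8, 9}, {0, 5, 9, 10}]
exact = Fraction(len(set().union(*events)), 12)
for m, s in enumerate(partial_sums(events, 12), start=1):
    print(m, s, "upper bound" if m % 2 else "lower bound", exact)
```

Exact rational arithmetic is used so that the comparison with the true probability is not blurred by rounding.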
For a general measure space (S,Σ,μ) and measurable subsets A_{1}, ..., A_{n} of finite measure, the above identities also hold when the probability measure is replaced by the measure μ.
Special case
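If, in the probabilistic version of the inclusion–exclusion principle, the probability of the intersection A_{I} only depends on the cardinality of I, meaning that for every k in {1, ..., n} there is an a_{k} such that
 $$a_k = \mathbb{P}(A_I) \quad \text{for every } I \subseteq \{1, \ldots, n\} \text{ with } |I| = k,$$
then the above formula simplifies to
 $$\mathbb{P}\left(\bigcup_{i=1}^n A_i\right) = \sum_{k=1}^n (-1)^{k-1} \binom{n}{k} a_k$$
due to the combinatorial interpretation of the binomial coefficient $\binom{n}{k}$. For example, if the events A_{i} are independent and identically distributed, then $\mathbb{P}(A_i) = p$ for all i, and we have $a_k = p^k$, in which case the expression above simplifies to
 $$\mathbb{P}\left(\bigcup_{i=1}^n A_i\right) = 1 - (1 - p)^n.$$
(This result can also be derived more simply by considering the intersection of the complements of the events A_{i}.)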
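For independent events that all have the same probability p, the agreement between the alternating sum with a_k = p^k and the closed form 1 − (1 − p)^n can be checked with exact rational arithmetic. This is a small sketch; the function name `union_probability` and the values of n and p are assumptions for illustration.

```python
from fractions import Fraction
from math import comb

def union_probability(n, p):
    """Alternating sum with a_k = p**k (independent events of equal probability p)."""
    return sum((-1) ** (k - 1) * comb(n, k) * p ** k for k in range(1, n + 1))

p = Fraction(3, 10)
print(union_probability(5, p), 1 - (1 - p) ** 5)  # the two values coincide
```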
An analogous simplification is possible in the case of a general measure space (S,Σ,μ) and measurable subsets A_{1}, ..., A_{n} of finite measure.
Other forms
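The principle is sometimes stated in the form[10] that says that if
 $$g(A) = \sum_{S \subseteq A} f(S)$$
then
 $$f(A) = \sum_{S \subseteq A} (-1)^{|A| - |S|} g(S). \qquad (**)$$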
The combinatorial and the probabilistic version of the inclusion–exclusion principle are instances of (**).
Proof 

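Take $\underline{m} = \{1, 2, \ldots, m\}$, $f(\underline{m}) = 0$, and
 $$f(S) = \left|\bigcap_{i \in \underline{m} \setminus S} A_i \setminus \bigcup_{i \in S} A_i\right| \quad \text{and} \quad f(S) = \mathbb{P}\left(\bigcap_{i \in \underline{m} \setminus S} A_i \setminus \bigcup_{i \in S} A_i\right)$$
respectively for all sets S with $S \subsetneq \underline{m}$. Then we obtain
 $$g(A) = \left|\bigcap_{i \in \underline{m} \setminus A} A_i\right|, \quad g(\underline{m}) = \left|\bigcup_{i \in \underline{m}} A_i\right| \quad \text{and} \quad g(A) = \mathbb{P}\left(\bigcap_{i \in \underline{m} \setminus A} A_i\right), \quad g(\underline{m}) = \mathbb{P}\left(\bigcup_{i \in \underline{m}} A_i\right)$$
respectively for all sets A with $A \subsetneq \underline{m}$. This is because elements a of $\bigcap_{i \in \underline{m} \setminus A} A_i$ can be contained in other A_{i} (A_{i} with $i \in A$) as well, and the $\cap\setminus\cup$ formula runs exactly through all possible extensions of the sets $\{A_i \mid i \in \underline{m} \setminus A\}$ with other A_{i}, counting a only for the set that matches the membership behavior of a, if S runs through all subsets of A (as in the definition of g(A)). Since $f(\underline{m}) = 0$, we obtain from (**) with $A = \underline{m}$ that
 $$\sum_{\underline{m} \supseteq T \supsetneq \emptyset} (-1)^{|T| - 1} g(\underline{m} \setminus T) = \sum_{\emptyset \subseteq S \subsetneq \underline{m}} (-1)^{m - |S| - 1} g(S) = g(\underline{m})$$
and by interchanging sides, the combinatorial and the probabilistic version of the inclusion–exclusion principle follow.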
If one sees a number as a set of its prime factors, then (**) is a generalization of the Möbius inversion formula for square-free natural numbers. Therefore, (**) is seen as the Möbius inversion formula for the incidence algebra of the partially ordered set of all subsets of A.
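The subset form of the inversion, with g(A) the sum of f(S) over all subsets S of A and f(A) recovered as the signed sum of g(S), can be checked numerically for an arbitrary function f. The following Python sketch does so; the helper names and the test function `f` are assumptions chosen only for illustration.

```python
from itertools import combinations

def subsets(A):
    """Yield every subset of the finite set A as a frozenset."""
    items = sorted(A)
    for k in range(len(items) + 1):
        for chosen in combinations(items, k):
            yield frozenset(chosen)

def summed(f, A):
    """g(A): the sum of f(S) over all subsets S of A."""
    return sum(f(S) for S in subsets(A))

def inverted(g, A):
    """The sum of (-1)^(|A| - |S|) g(S) over all subsets S of A."""
    return sum((-1) ** (len(A) - len(S)) * g(S) for S in subsets(A))

def f(S):
    # Arbitrary hypothetical function on finite sets of integers.
    return sum(S) ** 2 + 1

def g(A):
    return summed(f, A)

A = frozenset({1, 2, 3, 5})
print(inverted(g, A), f(A))  # the inversion recovers f(A)
```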
For a generalization of the full version of the Möbius inversion formula, (**) must be generalized to multisets. For multisets instead of sets, (**) becomes
 $$f(A) = \sum_{S \subseteq A} \mu(A - S) g(S) \qquad (***)$$
where $A - S$ is the multiset for which $(A - S) \uplus S = A$, and
 μ(S) = 1 if S is a set (i.e. a multiset without double elements) of even cardinality.
 μ(S) = −1 if S is a set (i.e. a multiset without double elements) of odd cardinality.
 μ(S) = 0 if S is a proper multiset (i.e. S has double elements).
Notice that $\mu(A - S)$ is just the $(-1)^{|A| - |S|}$ of (**) in case $A - S$ is a set.
Proof of (***) 

Substitute $g(S) = \sum_{T \subseteq S} f(T)$ on the right-hand side of (***). Notice that f(A) appears once on both sides of (***). So we must show that for all T with $T \subsetneq A$, the terms f(T) cancel out on the right-hand side of (***). For that purpose, take a fixed T such that $T \subsetneq A$ and take an arbitrary fixed $a \in A$ such that $a \notin T$. Notice that $A - S$ must be a set for each positive or negative appearance of f(T) on the right-hand side of (***) that is obtained by way of the multiset S such that $T \subseteq S \subseteq A$. Now each appearance of f(T) on the right-hand side of (***) that is obtained by way of S such that $A - S$ is a set that contains a cancels out with the one that is obtained by way of the corresponding S such that $A - S$ is a set that does not contain a. This gives the desired result.
Applications
The inclusion–exclusion principle is widely used and only a few of its applications can be mentioned here.
Counting derangements
A well-known application of the inclusion–exclusion principle is to the combinatorial problem of counting all derangements of a finite set. A derangement of a set A is a bijection from A into itself that has no fixed points. Via the inclusion–exclusion principle one can show that if the cardinality of A is n ≥ 1, then the number of derangements is [n! / e] where [x] denotes the nearest integer to x; see also the examples section above.
The first occurrence of the problem of counting the number of derangements is in an early book on games of chance: Essai d'analyse sur les jeux de hazard by P. R. de Montmort (1678–1719) and was known as either "Montmort's problem" or by the name he gave it, "problème des rencontres."[11] The problem is also known as the hat-check problem.
The number of derangements is also known as the subfactorial of n, written !n. It follows that if all bijections are assigned the same probability then the probability that a random bijection is a derangement quickly approaches 1/e as n grows.
Counting intersections
The principle of inclusion–exclusion, combined with De Morgan's law, can be used to count the cardinality of the intersection of sets as well. Let $\overline{A_k}$ represent the complement of A_{k} with respect to some universal set A such that $A_k \subseteq A$ for each k. Then we have
 $$\bigcap_{i=1}^n A_i = \overline{\bigcup_{i=1}^n \overline{A_i}},$$
thereby turning the problem of finding an intersection into the problem of finding a union.
Graph coloring
The inclusion–exclusion principle forms the basis of algorithms for a number of NP-hard graph partitioning problems, such as graph coloring.[12]
A well-known application of the principle is the construction of the chromatic polynomial of a graph.[13]
Bipartite graph perfect matchings
The number of perfect matchings of a bipartite graph can be calculated using the principle.[14]
Number of onto functions
Given finite sets A and B, how many surjective functions (onto functions) are there from A to B? Without any loss of generality we may take A = {1, ..., k} and B = {1, ..., n}, since only the cardinalities of the sets matter. By using S as the set of all functions from A to B, and defining, for each i in B, the property P_{i} as "the function misses the element i in B" (i is not in the image of the function), the principle of inclusion–exclusion gives the number of onto functions between A and B as:[15]
 $$\sum_{j=0}^{n} (-1)^j \binom{n}{j} (n - j)^k.$$
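A minimal Python sketch of this count follows. It assumes the standard alternating sum over j of (−1)^j C(n, j)(n − j)^k, where (n − j)^k counts the functions that miss a given set of j elements of B, and compares it with direct enumeration for small k and n. The function names are illustrative.

```python
from itertools import product
from math import comb

def count_onto(k, n):
    """Number of surjections from a k-element set onto an n-element set."""
    # (n - j) ** k counts the functions that miss a chosen set of j elements of B.
    return sum((-1) ** j * comb(n, j) * (n - j) ** k for j in range(n + 1))

def count_onto_brute(k, n):
    """Enumerate all n**k functions and keep those whose image is all of B."""
    return sum(1 for images in product(range(n), repeat=k) if len(set(images)) == n)

print(count_onto(4, 3), count_onto_brute(4, 3))  # prints "36 36"
```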
Permutations with forbidden positions
A permutation of the set S = {1, ..., n} where each element of S is restricted to not being in certain positions (here the permutation is considered as an ordering of the elements of S) is called a permutation with forbidden positions. For example, with S = {1,2,3,4}, the permutations with the restriction that the element 1 cannot be in positions 1 or 3, and the element 2 cannot be in position 4 are: 2134, 2143, 3124, 4123, 2341, 2431, 3241, 3421, 4231 and 4321. By letting A_{i} be the set of positions that the element i is not allowed to be in, and the property P_{i} to be the property that a permutation puts element i into a position in A_{i}, the principle of inclusion–exclusion can be used to count the number of permutations which satisfy all the restrictions.[16]
In the given example, there are 12 = 2(3!) permutations with property P_{1}, 6 = 3! permutations with property P_{2} and no permutations have properties P_{3} or P_{4} as there are no restrictions for these two elements. The number of permutations satisfying the restrictions is thus:
 4! − (12 + 6 + 0 + 0) + (4) = 24 − 18 + 4 = 10.
The final 4 in this computation is the number of permutations having both properties P_{1} and P_{2}. There are no other nonzero contributions to the formula.
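The computation for this example can be sketched in Python. For each set J of elements, the sketch counts the ways to put every element of J into one of its own forbidden positions (all positions distinct) and multiplies by the (n − |J|)! arrangements of the remaining elements. The function names and the dictionary encoding of the restrictions are assumptions made for illustration.

```python
from itertools import combinations, permutations, product
from math import factorial

def count_allowed(n, forbidden):
    """Permutations of 1..n in which no element sits in one of its forbidden positions.

    `forbidden` maps an element to the set of positions it may not occupy.
    """
    total = 0
    for k in range(n + 1):
        for J in combinations(range(1, n + 1), k):
            # Ways to place each element of J in one of its forbidden positions,
            # using distinct positions; elements without restrictions contribute none.
            placements = sum(
                1
                for choice in product(*(sorted(forbidden.get(e, ())) for e in J))
                if len(set(choice)) == len(choice)
            )
            total += (-1) ** k * placements * factorial(n - k)
    return total

def count_allowed_brute(n, forbidden):
    """Direct enumeration; perm[pos - 1] is the element placed in position pos."""
    return sum(
        1
        for perm in permutations(range(1, n + 1))
        if all(pos not in forbidden.get(e, ()) for pos, e in enumerate(perm, start=1))
    )

restrictions = {1: {1, 3}, 2: {4}}  # the example from the text
print(count_allowed(4, restrictions), count_allowed_brute(4, restrictions))  # prints "10 10"
```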
Stirling numbers of the second kind
The Stirling numbers of the second kind, S(n,k) count the number of partitions of a set of n elements into k non-empty subsets (indistinguishable boxes). An explicit formula for them can be obtained by applying the principle of inclusion–exclusion to a very closely related problem, namely, counting the number of partitions of an n-set into k non-empty but distinguishable boxes (ordered non-empty subsets). Using the universal set consisting of all partitions of the n-set into k (possibly empty) distinguishable boxes, A_{1}, A_{2}, ..., A_{k}, and the properties P_{i} meaning that the partition has box A_{i} empty, the principle of inclusion–exclusion gives an answer for the related result. Dividing by k! to remove the artificial ordering gives the Stirling number of the second kind:[17]
 $$S(n, k) = \frac{1}{k!} \sum_{t=0}^{k} (-1)^t \binom{k}{t} (k - t)^n.$$
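The explicit formula, the alternating sum over t of (−1)^t C(k, t)(k − t)^n divided by k!, can be cross-checked against the well-known recurrence S(n, k) = k·S(n − 1, k) + S(n − 1, k − 1). The Python sketch below does this for small values; the function names are illustrative assumptions.

```python
from functools import lru_cache
from math import comb, factorial

def stirling2(n, k):
    """S(n, k) from the inclusion-exclusion count of ordered partitions, divided by k!."""
    ordered = sum((-1) ** t * comb(k, t) * (k - t) ** n for t in range(k + 1))
    return ordered // factorial(k)

@lru_cache(maxsize=None)
def stirling2_rec(n, k):
    """S(n, k) from the recurrence S(n, k) = k*S(n-1, k) + S(n-1, k-1)."""
    if n == 0 and k == 0:
        return 1
    if n == 0 or k == 0:
        return 0
    return k * stirling2_rec(n - 1, k) + stirling2_rec(n - 1, k - 1)

print(stirling2(4, 2), stirling2(5, 3))  # prints "7 25"
```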
Rook polynomials
A rook polynomial is the generating function of the number of ways to place non-attacking rooks on a board B that looks like a subset of the squares of a checkerboard; that is, no two rooks may be in the same row or column. The board B is any subset of the squares of a rectangular board with n rows and m columns; we think of it as the squares in which one is allowed to put a rook. The coefficient, r_{k}(B) of x^{k} in the rook polynomial R_{B}(x) is the number of ways k rooks, none of which attacks another, can be arranged in the squares of B. For any board B, there is a complementary board B′ consisting of the squares of the rectangular board that are not in B. This complementary board also has a rook polynomial R_{B′}(x) with coefficients r_{k}(B′).
It is sometimes convenient to be able to calculate the highest coefficient of a rook polynomial in terms of the coefficients of the rook polynomial of the complementary board. Without loss of generality we can assume that n ≤ m, so this coefficient is r_{n}(B). The number of ways to place n non-attacking rooks on the complete n × m "checkerboard" (without regard as to whether the rooks are placed in the squares of the board B) is given by the falling factorial:
 $$(m)_n = m(m-1)(m-2) \cdots (m-n+1).$$
Letting P_{i} be the property that an assignment of n non-attacking rooks on the complete board has a rook in column i which is not in a square of the board B, then by the principle of inclusion–exclusion we have:[18]
 $$r_n(B) = \sum_{t=0}^{n} (-1)^t (m - t)_{n-t} \, r_t(B').$$
Euler's phi function
Euler's totient or phi function, φ(n) is an arithmetic function that counts the number of positive integers less than or equal to n that are relatively prime to n. That is, if n is a positive integer, then φ(n) is the number of integers k in the range 1 ≤ k ≤ n which have no common factor with n other than 1. The principle of inclusion–exclusion is used to obtain a formula for φ(n). Let S be the set {1, ..., n} and define the property P_{i} to be that a number in S is divisible by the prime number p_{i}, for 1 ≤ i ≤ r, where the prime factorization of n is
 $$n = p_1^{a_1} p_2^{a_2} \cdots p_r^{a_r}.$$
Then,[19]
 $$\varphi(n) = n - \sum_{i=1}^r \frac{n}{p_i} + \sum_{1 \le i < j \le r} \frac{n}{p_i p_j} - \cdots = n \prod_{i=1}^r \left(1 - \frac{1}{p_i}\right).$$
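A short Python sketch of this calculation is given below. It sums (−1)^{|J|} n / (product of the primes in J) over all subsets J of the distinct prime factors of n, and compares the result with a direct gcd count. The helper names and the trial-division factorization are assumptions made for illustration and are only suitable for small n.

```python
from itertools import combinations
from math import gcd, prod

def prime_factors(n):
    """Distinct prime factors of n by trial division (adequate for small n)."""
    factors, p = [], 2
    while p * p <= n:
        if n % p == 0:
            factors.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        factors.append(n)
    return factors

def phi(n):
    """Euler's totient via inclusion-exclusion over the distinct prime factors of n."""
    primes = prime_factors(n)
    return sum(
        (-1) ** k * (n // prod(group))  # n // product counts the multiples of every prime in the group
        for k in range(len(primes) + 1)
        for group in combinations(primes, k)
    )

def phi_brute(n):
    """Count the integers in 1..n that are relatively prime to n."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

print(phi(36), phi_brute(36))  # prints "12 12"
```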
Diluted inclusion–exclusion principle
In many cases where the principle could give an exact formula (in particular, counting prime numbers using the sieve of Eratosthenes), the formula arising doesn't offer useful content because the number of terms in it is excessive. If each term individually can be estimated accurately, the accumulation of errors may imply that the inclusion–exclusion formula isn't directly applicable. In number theory, this difficulty was addressed by Viggo Brun. After a slow start, his ideas were taken up by others, and a large variety of sieve methods developed. These for example may try to find upper bounds for the "sieved" sets, rather than an exact formula.
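Let A_{1}, ..., A_{n} be arbitrary sets and p_{1}, ..., p_{n} real numbers in the closed unit interval [0,1]. Then, for every even number k in {0, ..., n}, the indicator functions satisfy the inequality:[20]
 $$1_{A_1 \cup \cdots \cup A_n} \ge \sum_{j=1}^{k} (-1)^{j-1} \sum_{1 \le i_1 < \cdots < i_j \le n} p_{i_1} \cdots p_{i_j} \, 1_{A_{i_1} \cap \cdots \cap A_{i_j}}.$$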
Proof of main statement
Choose an element contained in the union of all sets and let A_{1}, A_{2}, ..., A_{t} be the individual sets containing it. (Note that t > 0.) Since the element is counted precisely once by the left-hand side of equation (1), we need to show that it is counted precisely once by the right-hand side. On the right-hand side, the only non-zero contributions occur when all the subsets in a particular term contain the chosen element, that is, all the subsets are selected from A_{1}, A_{2}, ..., A_{t}. The contribution is one for each of these sets (plus or minus depending on the term) and therefore is just the (signed) number of these subsets used in the term. We then have:
 $$\binom{t}{1} - \binom{t}{2} + \cdots + (-1)^{t+1} \binom{t}{t}.$$
By the binomial theorem,
 $$0 = (1 - 1)^t = \binom{t}{0} - \binom{t}{1} + \binom{t}{2} - \cdots + (-1)^t \binom{t}{t}.$$
Using the fact that $\binom{t}{0} = 1$ and rearranging terms, we have
 $$1 = \binom{t}{1} - \binom{t}{2} + \cdots + (-1)^{t+1} \binom{t}{t},$$
and so, the chosen element is counted only once by the right-hand side of equation (1).
Algebraic proof
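An algebraic proof can be obtained using indicator functions (characteristic functions of subsets of a set). The indicator function of a subset S of a set X is the function
 $$1_S : X \to \{0, 1\}, \qquad 1_S(x) = \begin{cases} 1 & \text{if } x \in S, \\ 0 & \text{if } x \notin S. \end{cases}$$
If A and B are two subsets of X, then
 $$1_A \cdot 1_B = 1_{A \cap B}.$$
Let A denote the union of the sets A_{1}, ..., A_{n}. To prove the inclusion–exclusion principle in general, we first verify the identity
 $$1_A = \sum_{k=1}^n (-1)^{k-1} \sum_{I \subseteq \{1, \ldots, n\},\ |I| = k} 1_{A_I} \qquad (∗)$$
for indicator functions, where
 $$A_I = \bigcap_{i \in I} A_i.$$
The following function is identically zero
 $$(1_A - 1_{A_1})(1_A - 1_{A_2}) \cdots (1_A - 1_{A_n}) = 0,$$
because: if x is not in A, then all factors are 0 − 0 = 0; and otherwise, if x does belong to some A_{m}, then the corresponding mth factor is 1 − 1 = 0. By expanding the product on the left-hand side, equation (∗) follows.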
To prove the inclusion–exclusion principle for the cardinality of sets, sum the equation (∗) over all x in the union of A_{1}, ..., A_{n}. To derive the version used in probability, take the expectation in (∗). In general, integrate the equation (∗) with respect to μ. Always use linearity in these derivations.
Notes
 Roberts & Tesman 2009, pg. 405
 Mazur 2010, pg. 94
 van Lint & Wilson 1992, pg. 77
 van Lint & Wilson 1992, pg. 77
 Stanley 1986, pg. 64
 Rota, Gian-Carlo (1964), "On the foundations of combinatorial theory I. Theory of Möbius functions", Zeitschrift für Wahrscheinlichkeitstheorie, 2: 340–368, doi:10.1007/BF00531932
 Mazur 2010, pp. 83–4, 88
 Brualdi 2010, pp. 167–8
 Cameron 1994, pg. 78
 Graham, Grötschel & Lovász 1995, pg. 1049
 van Lint & Wilson 1992, pp. 77–8
 Björklund, Husfeldt & Koivisto 2009
 Gross 2008, pp. 211–13
 Gross 2008, pp. 208–10
 Mazur 2010, pp. 84–5, 90
 Brualdi 2010, pp. 177–81
 Brualdi 2010, pp. 282–7
 Roberts & Tesman 2009, pp. 419–20
 van Lint & Wilson 1992, pg. 73
 Fernández, Fröhlich & Sokal 1992, Proposition 12.6
References
 Allenby, R.B.J.T.; Slomson, Alan (2010), How to Count: An Introduction to Combinatorics, Discrete Mathematics and Its Applications (2 ed.), CRC Press, pp. 51–60, ISBN 9781420082609
 Björklund, A.; Husfeldt, T.; Koivisto, M. (2009), "Set partitioning via inclusion–exclusion", SIAM Journal on Computing, 39 (2): 546–563, doi:10.1137/070683933
 Brualdi, Richard A. (2010), Introductory Combinatorics (5th ed.), Prentice–Hall, ISBN 9780136020400
 Cameron, Peter J. (1994), Combinatorics: Topics, Techniques, Algorithms, Cambridge University Press, ISBN 0521457610
 Fernández, Roberto; Fröhlich, Jürg; Sokal, Alan D. (1992), Random Walks, Critical Phenomena, and Triviality in Quantum Field Theory, Texts and Monographs in Physics, Berlin: Springer-Verlag, pp. xviii+444, ISBN 3540543589, MR 1219313, Zbl 0761.60061
 Graham, R.L.; Grötschel, M.; Lovász, L. (1995), Handbook of Combinatorics (volume 2), MIT Press – North Holland, ISBN 9780262071710
 Gross, Jonathan L. (2008), Combinatorial Methods with Computer Applications, Chapman & Hall/CRC, ISBN 9781584887430
 Hazewinkel, Michiel, ed. (2001) [1994], "Inclusion-and-exclusion principle", Encyclopedia of Mathematics, Springer Science+Business Media B.V. / Kluwer Academic Publishers, ISBN 9781556080104
 Mazur, David R. (2010), Combinatorics: A Guided Tour, The Mathematical Association of America, ISBN 9780883857625
 Roberts, Fred S.; Tesman, Barry (2009), Applied Combinatorics (2nd ed.), CRC Press, ISBN 9781420099829
 Stanley, Richard P. (1986), Enumerative Combinatorics Volume I, Wadsworth & Brooks/Cole, ISBN 0534065465
 van Lint, J.H.; Wilson, R.M. (1992), A Course in Combinatorics, Cambridge University Press, ISBN 0521422604
This article incorporates material from principle of inclusion–exclusion on PlanetMath, which is licensed under the Creative Commons Attribution/ShareAlike License.