22 Hypergeometric Distribution

22.1 Probability Mass Function

A random variable \(X\) is said to follow a Hypergeometric Distribution if its probability mass function is

\[p(x) = \left\{ \begin{array}{ll} \frac{{r\choose x}{N-r\choose n-x}}{{N\choose n}}, & x=0,1,2,\ldots,n\\ 0 & \text{otherwise} \end{array} \right. \]

where

  • \(N\) is the number of objects available to choose from
  • \(n\) is the number of objects chosen from \(N\)
  • \(r\) is the number of objects in \(N\) that possess a desired characteristic (successes)
  • \(x\) is the number of objects in \(n\) that possess the desired characteristic
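The probability mass function can be evaluated directly from its definition. The following is a minimal sketch in Python, assuming integer arguments with \(0 \leq r \leq N\) and \(0 \leq n \leq N\); the function name `hypergeom_pmf` is chosen here for illustration and does not come from the text. Exact fractions are used so that no rounding is involved.

```python
from fractions import Fraction
from math import comb


def hypergeom_pmf(x, N, r, n):
    """Return P(X = x) as an exact fraction.

    N: number of objects available, r: number of successes among them,
    n: number of objects chosen, x: number of successes among those chosen.
    """
    # No mass outside 0..n. Inside that range math.comb returns 0 whenever
    # the lower index exceeds the upper one, which covers x > r and
    # n - x > N - r without extra checks.
    if x < 0 or x > n:
        return Fraction(0)
    return Fraction(comb(r, x) * comb(N - r, n - x), comb(N, n))


# Example: probability of exactly one ace in 5 cards from a 52-card deck.
print(hypergeom_pmf(1, N=52, r=4, n=5))
```

With \(N=5\), \(r=2\), \(n=2\), the three probabilities are \(3/10\), \(6/10\) and \(1/10\) for \(x=0,1,2\).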

22.2 Cumulative Mass Function

\[P(x) = \left\{ \begin{array}{ll} \sum\limits_{i=0}^{x}\frac{{r\choose i}{N-r\choose n-i}}{{N\choose n}}, & x=0,1,2,\ldots\\ 0 & \text{otherwise} \end{array} \right. \]
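The cumulative sum above translates directly into code. This is a minimal sketch, assuming integer \(x\) and valid parameters; the name `hypergeom_cdf` is illustrative only.

```python
from fractions import Fraction
from math import comb


def hypergeom_cdf(x, N, r, n):
    """Return P(X <= x) by summing the probability mass from 0 to x."""
    if x < 0:
        return Fraction(0)
    upper = min(x, n)  # there is no probability mass beyond n
    total = sum(comb(r, i) * comb(N - r, n - i) for i in range(upper + 1))
    return Fraction(total, comb(N, n))


# Example with N = 5, r = 2, n = 2: cumulative probabilities at x = 0, 1, 2.
print([hypergeom_cdf(x, 5, 2, 2) for x in range(3)])
```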

22.3 Expected Values

The \(x=0\) term of the sum is zero, so from the second line onward the sum starts at \(x=1\).

\[\begin{aligned} E(X) &= \sum\limits_{x=0}^{n}x\frac{{r\choose x}{N-r\choose n-x}}{{N\choose n}} \\ &= \sum\limits_{x=1}^{n}x{r\choose x}\frac{{N-r\choose n-x}}{{N\choose n}} \\ ^{[1]} &= \sum\limits_{x=1}^{n}x\frac{r}{x}{r-1\choose x-1}\frac{{N-r\choose n-x}} {\frac{N}{n}{N-1\choose n-1}} \\ &= \frac{rn}{N}\sum\limits_{x=1}^{n} \frac{{r-1\choose x-1}{N-r\choose n-x}}{{N-1\choose n-1}} \\ ^{[2]} &= \frac{rn}{N}\sum\limits_{y=0}^{n-1} \frac{{r-1\choose y}{N-r\choose n-y-1}}{{N-1\choose n-1}} \\ ^{[3]} &= \frac{rn}{N}\cdot 1 \\ &= \frac{rn}{N} \end{aligned}\]

  1. For any integer \(a\) such that \(0\leq a\leq k,\ {n\choose k}=\frac{n(n-1)\cdots(n-a+1)}{k(k-1)\cdots(k-a+1)}{n-a\choose k-a}\) (Theorem 10.0.3).
  2. Let \(y=x-1\)  \(\Rightarrow x=y+1\).
  3. \(\frac{\sum\limits_{i=0}^{n}{N_1\choose i}{N_2\choose n-i}}{{N_1+N_2\choose n}}=1\) (Theorem 6.3.2), applied with \(N_1=r-1,\ N_2=N-r\), \(n-1\) in place of \(n\), and \(i=y\).

The \(x=0\) and \(x=1\) terms of the sum are zero, so from the second line onward the sum starts at \(x=2\).

\[\begin{aligned} E[X(X-1)] &= \sum\limits_{x=0}^{n}x(x-1)\frac{{r\choose x}{N-r\choose n-x}}{{N\choose n}} \\ ^{[1]} &= \sum\limits_{x=2}^{n}\frac{x(x-1)r(r-1)}{x(x-1)}\frac{{r-2\choose x-2} {N-r\choose n-x}}{\frac{N(N-1)}{n(n-1)}{N-2\choose n-2}} \\ &= \frac{r(r-1)n(n-1)}{N(N-1)}\sum\limits_{x=2}^{n} \frac{{r-2\choose x-2}{N-r\choose n-x}}{{N-2\choose n-2}} \\ ^{[2]} &= \frac{r(r-1)n(n-1)}{N(N-1)}\sum\limits_{y=0}^{n-2} \frac{{r-2\choose y}{N-r\choose n-y-2}}{{N-2\choose n-2}} \\ ^{[3]} &= \frac{r(r-1)n(n-1)}{N(N-1)}\cdot 1 \\ &=\frac{r(r-1)n(n-1)}{N(N-1)} \end{aligned}\]

  1. For any integer \(a\) such that \(0\leq a\leq k,\ {n\choose k}=\frac{n(n-1)\cdots(n-a+1)}{k(k-1)\cdots(k-a+1)}{n-a\choose k-a}\) (Theorem 10.0.3).
  2. Let \(y=x-2\)  \(\Rightarrow x=y+2\).
  3. \(\frac{\sum\limits_{i=0}^{n}{N_1\choose i}{N_2\choose n-i}}{{N_1+N_2\choose n}}=1\) (Theorem 6.3.2), applied with \(N_1=r-2,\ N_2=N-r\), \(n-2\) in place of \(n\), and \(i=y\).

\[\begin{aligned} \mu &= E(X) \\ &= \frac{rn}{N} \\ \\ \\ \sigma^2 &= E(X^2) - E(X)^2 \\ &= E(X^2) - E(X) + E(X) - E(X)^2 \\ &= [E(X^2) - E(X)] + E(X) - E(X)^2 \\ &= E(X^2-X) + E(X) - E(X)^2 \\ &= E[X(X-1)] + E(X) - E(X)^2\\ &= \frac{r(r-1)n(n-1)}{N(N-1)} + \frac{rn}{N} - \frac{r^2n^2}{N^2} \\ &= \frac{r(r-1)n(n-1)N}{N^2(N-1)} + \frac{rnN(N-1)}{N^2(N-1)} - \frac{r^2n^2(N-1)}{N^2(N-1)} \\ &= \frac{(r^2-r)(n^2-n)N + rn(N^2-N)-r^2n^2(N-1)}{N^2(N-1)} \\ &= \frac{r^2n^2N-r^2nN-rn^2N+rnN+rnN^2-rnN-r^2n^2N+r^2n^2}{N^2(N-1)} \\ &= \frac{-r^2nN-rn^2N+rnN^2+r^2n^2}{N^2(N-1)} \\ &= \frac{nr(-rN-nN+N^2+rn)}{N^2(N-1)} \\ &= \frac{nr(N^2-nN-rN+rn)}{N^2(N-1)} \\ &= \frac{nr(N-r)(N-n)}{N^2(N-1)} \\ &= \frac{nr(N-r)(N-n)}{N\cdot N(N-1)} \\ &= \frac{nr}{N}\cdot\frac{N-r}{N}\cdot\frac{N-n}{N-1} \end{aligned}\]
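The closed forms for \(\mu\) and \(\sigma^2\) can be checked against brute-force summation over the probability mass function. The sketch below assumes \(N > 1\) and uses exact fractions; the function names are illustrative and not part of the text.

```python
from fractions import Fraction
from math import comb


def pmf(x, N, r, n):
    """Hypergeometric probability mass at x (0 <= x <= n), as an exact fraction."""
    return Fraction(comb(r, x) * comb(N - r, n - x), comb(N, n))


def moments_by_summation(N, r, n):
    """Mean and variance computed term by term from the definition."""
    mean = sum(x * pmf(x, N, r, n) for x in range(n + 1))
    second = sum(x * x * pmf(x, N, r, n) for x in range(n + 1))
    return mean, second - mean ** 2


def moments_closed_form(N, r, n):
    """Mean rn/N and variance (nr/N) * ((N-r)/N) * ((N-n)/(N-1))."""
    mean = Fraction(r * n, N)
    variance = Fraction(n * r * (N - r) * (N - n), N * N * (N - 1))
    return mean, variance


# Example: N = 20 objects, r = 7 successes, n = 5 drawn.
print(moments_by_summation(20, 7, 5))
print(moments_closed_form(20, 7, 5))
```

Both calls should print the same pair of fractions, since the arithmetic is exact.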

22.4 Moment Generating Function

\[\begin{aligned} M_X(t) &= E(e^{tX}) \\ &= \sum\limits_{x=0}^{n}e^{tx}\frac{{r\choose x}{N-r\choose n-x}}{{N\choose n}} \\ &= \frac{1}{{N\choose n}}\sum\limits_{x=0}^{n}e^{tx}{r\choose x}{N-r\choose n-x} \\ &= \frac{1}{{N\choose n}}[e^{0t}{r\choose 0}{N-r\choose n-0} + e^{1t}{r\choose 1}{N-r\choose n-1} + e^{2t}{r\choose 2}{N-r\choose n-2} + \cdots + e^{nt}{r\choose n}{N-r\choose n-n}] \\ &= \frac{1}{{N\choose n}}[{N-r\choose n-0}+e^{t}{r\choose 1}{N-r\choose n-1} + e^{2t}{r\choose 2}{N-r\choose n-2}+\cdots+e^{nt}{r\choose n}{N-r\choose n-n}] \end{aligned}\]

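This MGF is a finite sum of exponentials that does not reduce to a simple closed form, so it is not a practical tool for generating the moments of this distribution. The mean and variance above were instead derived directly from the probability mass function.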
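The finite sum can still be evaluated numerically for a given \(t\). As a rough sketch, the code below computes \(M_X(t)\) term by term and approximates \(M_X'(0)\) with a central difference, which should land close to \(rn/N\). The function names and the step size `h` are assumptions made for illustration.

```python
from math import comb, exp


def hypergeom_mgf(t, N, r, n):
    """Evaluate M_X(t) = E(e^{tX}) as the finite sum over x = 0..n."""
    total = sum(exp(t * x) * comb(r, x) * comb(N - r, n - x)
                for x in range(n + 1))
    return total / comb(N, n)


def mean_from_mgf(N, r, n, h=1e-5):
    """Approximate M_X'(0) by a central difference with step h."""
    return (hypergeom_mgf(h, N, r, n) - hypergeom_mgf(-h, N, r, n)) / (2 * h)


# Example: N = 20, r = 7, n = 5, for which rn/N = 1.75.
print(hypergeom_mgf(0.0, 20, 7, 5))
print(mean_from_mgf(20, 7, 5))
```

This is a numerical approximation only; the exact moments are those derived in Section 22.3.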

22.5 Theorems for the Hypergeometric Distribution

22.5.1 Validity of the Distribution

\[ \sum\limits_{x=0}^{n}\frac{{r\choose x}{N-r\choose n-x}}{{N\choose n}} = 1 \]

Proof:

Theorem 6.3.1 states

\[\begin{aligned} &{N_1\choose 0}{N_2\choose n}+{N_1\choose 1}{N_2\choose n-1}+\cdots +{N_1\choose n-1}{N_2\choose 1}+{N_1\choose n}{N_2\choose 0} \\ &= \sum\limits_{x=0}^{n}{N_1\choose x}{N_2\choose n-x} \\ &= {N_1+N_2\choose n} \end{aligned}\]

Using \(N_1 = r\) and \(N_2 = N-r\) we have \[\begin{aligned} \sum\limits_{x=0}^{n}\frac{{r\choose x}{N-r\choose n-x}}{{N\choose n}} &= \frac{{r+N-r\choose n}}{{N\choose n}} \\ &= \frac{{N\choose n}}{{N\choose n}} \\ &= 1 \end{aligned}\]
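The identity used in the proof can be spot-checked with integer arithmetic. This is a small sketch under the assumption of non-negative integer arguments; `vandermonde_sum` is a name introduced here for illustration.

```python
from math import comb


def vandermonde_sum(N1, N2, n):
    """Sum over x = 0..n of C(N1, x) * C(N2, n - x)."""
    return sum(comb(N1, x) * comb(N2, n - x) for x in range(n + 1))


# With N1 = r = 7 and N2 = N - r = 13, the sum should equal C(20, 5),
# so the hypergeometric probabilities for N = 20, r = 7, n = 5 sum to 1.
print(vandermonde_sum(7, 13, 5) == comb(20, 5))
```

A check like this covers only the parameter values tried; the proof above is what establishes the result in general.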