39 Analysis of Variance

39.1 One-Way Design

39.1.1 Decomposition of Sums of Squares

\[\begin{aligned} SS_{Total} &= \sum\limits_{i=1}^{a}\sum\limits_{j=1}^{n_i} (x_{ij} - \bar x_{++})^2 \\ &= \sum\limits_{i=1}^{a}\sum\limits_{j=1}^{n_i} (x_{ij} - \bar x_{i+} + \bar x_{i+} - \bar x_{++})^2 \\ &= \sum\limits_{i=1}^{a}\sum\limits_{j=1}^{n_i} \big[ (x_{ij} - \bar x_{i+}) + (\bar x_{i+} - \bar x_{++}) \big]^2 \\ &= \sum\limits_{i=1}^{a}\sum\limits_{j=1}^{n_i} \big[ (\bar x_{i+} - \bar x_{++}) + (x_{ij} - \bar x_{i+}) \big]^2 \\ &= \sum\limits_{i=1}^{a}\sum\limits_{j=1}^{n_i} \big[ (\bar x_{i+} - \bar x_{++})^2 + 2(\bar x_{i+} - \bar x_{++})(x_{ij} - \bar x_{i+}) + (x_{ij} - \bar x_{i+})^2 \big] \\ &= \sum\limits_{i=1}^{a}\sum\limits_{j=1}^{n_i} (\bar x_{i+} - \bar x_{++})^2 + \sum\limits_{i=1}^{a}\sum\limits_{j=1}^{n_i} 2(\bar x_{i+} - \bar x_{++})(x_{ij} - \bar x_{i+}) + \sum\limits_{i=1}^{a}\sum\limits_{j=1}^{n_i} (x_{ij} - \bar x_{i+})^2 \\ &= \sum\limits_{i=1}^{a} n_i(\bar x_{i+} - \bar x_{++})^2 + 2\sum\limits_{i=1}^{a} (\bar x_{i+} - \bar x_{++}) \sum\limits_{j=1}^{n_i} (x_{ij} - \bar x_{i+}) + \sum\limits_{i=1}^{a}\sum\limits_{j=1}^{n_i} (x_{ij} - \bar x_{i+})^2 \\ &= \sum\limits_{i=1}^{a} n_i(\bar x_{i+} - \bar x_{++})^2 + 2\sum\limits_{i=1}^{a} (\bar x_{i+} - \bar x_{++}) \bigg(\sum\limits_{j=1}^{n_i} x_{ij} - \sum\limits_{j=1}^{n_i}\bar x_{i+}\bigg) + \sum\limits_{i=1}^{a}\sum\limits_{j=1}^{n_i} (x_{ij} - \bar x_{i+})^2 \\ &= \sum\limits_{i=1}^{a} n_i(\bar x_{i+} - \bar x_{++})^2 + 2\sum\limits_{i=1}^{a} (\bar x_{i+} - \bar x_{++}) (x_{i+} - n_i \bar x_{i+}) + \sum\limits_{i=1}^{a}\sum\limits_{j=1}^{n_i} (x_{ij} - \bar x_{i+})^2 \\ &= \sum\limits_{i=1}^{a} n_i(\bar x_{i+} - \bar x_{++})^2 + 2\sum\limits_{i=1}^{a} (\bar x_{i+} - \bar x_{++}) \Big(x_{i+} - n_i \frac{x_{i+}}{n_i}\Big) + \sum\limits_{i=1}^{a}\sum\limits_{j=1}^{n_i} (x_{ij} - \bar x_{i+})^2 \\ &= \sum\limits_{i=1}^{a} n_i(\bar x_{i+} - \bar x_{++})^2 + 2\sum\limits_{i=1}^{a} (\bar x_{i+} - \bar x_{++}) (x_{i+} - x_{i+}) + \sum\limits_{i=1}^{a}\sum\limits_{j=1}^{n_i} (x_{ij} - \bar x_{i+})^2 \\ &= \sum\limits_{i=1}^{a} n_i(\bar x_{i+} - \bar x_{++})^2 + 2\sum\limits_{i=1}^{a} (\bar x_{i+} - \bar x_{++}) \cdot 0 + \sum\limits_{i=1}^{a}\sum\limits_{j=1}^{n_i} (x_{ij} - \bar x_{i+})^2 \\ &= \sum\limits_{i=1}^{a} n_i(\bar x_{i+} - \bar x_{++})^2 + 0 + \sum\limits_{i=1}^{a}\sum\limits_{j=1}^{n_i} (x_{ij} - \bar x_{i+})^2 \\ &= \sum\limits_{i=1}^{a} n_i(\bar x_{i+} - \bar x_{++})^2 + \sum\limits_{i=1}^{a}\sum\limits_{j=1}^{n_i} (x_{ij} - \bar x_{i+})^2\\ \end{aligned}\]

The components are commonly referred to as

\[ SS_{Factor} = \sum\limits_{i=1}^{a} n_i(\bar x_{i+} - \bar x_{++})^2 \]

and

\[ SS_{Error} = \sum\limits_{i=1}^{a}\sum\limits_{j=1}^{n_i} (x_{ij} - \bar x_{i+})^2 \]

Notice that \(SS_{Factor}\) compares the factor-level means to the overall mean, so \(SS_{Factor}\) measures the variation between factor levels. \(SS_{Error}\) compares each observation to its own factor-level mean, and so describes the variation within factor levels.

When \(n_1 = n_2 = \cdots = n_a = n\), the design is said to be balanced.

See (Montgomery 2004, 66)
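As a quick numerical check of the decomposition, the sketch below (Python with NumPy, on made-up unbalanced data) computes each sum of squares directly from its definition and confirms that \(SS_{Total} = SS_{Factor} + SS_{Error}\).

```python
import numpy as np

# Made-up unbalanced one-way data: a = 3 factor levels with n_i = 4, 5, 3 observations
groups = [
    np.array([12.1, 13.4, 11.8, 12.9]),
    np.array([15.2, 14.8, 16.1, 15.5, 14.9]),
    np.array([10.5, 11.2, 10.9]),
]

all_obs = np.concatenate(groups)      # every x_ij
grand_mean = all_obs.mean()           # x-bar_{++}

# SS_Total: each observation compared to the grand mean
ss_total = np.sum((all_obs - grand_mean) ** 2)

# SS_Factor: each factor-level mean compared to the grand mean, weighted by n_i
ss_factor = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)

# SS_Error: each observation compared to its own factor-level mean
ss_error = sum(np.sum((g - g.mean()) ** 2) for g in groups)

print(ss_total, ss_factor + ss_error)  # the two values agree
assert np.isclose(ss_total, ss_factor + ss_error)
```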

39.2 Computational Formulas

\(SS_{Total}\) and \(SS_{Factor}\) can be simplified for convenient computation.

\[\begin{aligned} SS_{Total} &= \sum\limits_{i=1}^{a}\sum\limits_{j=1}^{n_i} (x_{ij} - \bar x_{++})^2 \\ ^{[1]} &= \sum\limits_{i=1}^{a}\sum\limits_{j=1}^{n_i} x_{ij}^2 - \frac{x_{++}^2}{N}\\ \end{aligned}\]

where \(N = \sum_{i=1}^{a} n_i\) is the total number of observations.

  1. See Theorem 43.3.1

\[\begin{aligned} SS_{Factor} &= \sum\limits_{i=1}^{a} n_i(\bar x_{i+} - \bar x_{++})^2 \\ ^{[1]} &= \sum\limits_{i=1}^{a}\frac{x_{i+}^2}{n_i} - \frac{x_{++}^2}{N} \end{aligned}\]

  1. See Theorem 43.3.1

\(SS_{Error}\) does not simplify to a convenient form, but
\[\begin{aligned} SS_{Total} &= SS_{Factor} + SS_{Error} \\ \Rightarrow SS_{Error} &= SS_{Total} - SS_{Factor} \end{aligned}\]
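The shortcut forms can be checked numerically as well; a minimal sketch with made-up data, where \(x_{i+}\) and \(x_{++}\) are the factor-level and grand totals and \(N = \sum_i n_i\):

```python
import numpy as np

# Same made-up unbalanced data as in the previous sketch
groups = [
    np.array([12.1, 13.4, 11.8, 12.9]),
    np.array([15.2, 14.8, 16.1, 15.5, 14.9]),
    np.array([10.5, 11.2, 10.9]),
]

N = sum(len(g) for g in groups)          # N = sum of the n_i
x_pp = sum(g.sum() for g in groups)      # grand total x_{++}
all_obs = np.concatenate(groups)
grand_mean = all_obs.mean()

# Definitional forms
ss_total = np.sum((all_obs - grand_mean) ** 2)
ss_factor = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)

# Computational (shortcut) forms
ss_total_short = np.sum(all_obs ** 2) - x_pp ** 2 / N
ss_factor_short = sum(g.sum() ** 2 / len(g) for g in groups) - x_pp ** 2 / N

assert np.isclose(ss_total_short, ss_total)
assert np.isclose(ss_factor_short, ss_factor)
```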

39.3 Randomized Complete Block Design

Blocking in ANOVA is a method for eliminating the effect of a controllable nuisance variable. To implement this design, suppose we have \(a\) treatments we want to compare and \(b\) blocks. We may analyze the data by decomposing the sums of squares, much as in the one-way design.

39.3.1 Decomposition of Sums of Squares

\[\begin{aligned} SS_{Total} &= \sum\limits_{i=1}^{a}\sum\limits_{j=1}^{b}(x_{ij} - \bar x_{++})^2 \\ &= \sum\limits_{i=1}^{a}\sum\limits_{j=1}^{b}(x_{ij} + \bar x_{i+} - \bar x_{i+} + \bar x_{+ j} - \bar x_{+ j} + \bar x_{++} - \bar x_{+ +} - \bar x_{++})^2 \\ &= \sum\limits_{i=1}^{a}\sum\limits_{j=1}^{b}\big[ (\bar x_{i+} - \bar x_{++}) + (\bar x_{+ j} - \bar x_{++}) + (x_{ij} - \bar x_{i+} - \bar x_{+ j} + \bar x_{++}) \big]^2 \\ &= \sum\limits_{i=1}^{a}\sum\limits_{j=1}^{b}\big[ (\bar x_{i+} - \bar x_{++})^2 + 2(\bar x_{i+} - \bar x_{++})(\bar x_{+ j} - \bar x_{++}) \\ & \ \ \ \ + 2(\bar x_{i+} - \bar x_{++})(x_{ij} - \bar x_{i+} - \bar x_{+ j} + \bar x_{++}) + (\bar x_{+ j} - \bar x_{++})^2 \\ & \ \ \ \ + 2(\bar x_{+ j} - \bar x_{++}) (x_{ij} - \bar x_{i+} - \bar x_{+ j} + \bar x_{++}) \\ & \ \ \ \ + (x_{ij} - \bar x_{i+} - \bar x_{+ j} + \bar x_{++})^2 \big] \\ &= \sum\limits_{i=1}^{a}\sum\limits_{j=1}^{b}\big[ (\bar x_{i+} - \bar x_{++})^2 + (\bar x_{+ j} - \bar x_{++})^2 + (x_{ij} - \bar x_{i+} - \bar x_{+ j} + \bar x_{++})^2 \\ & \ \ \ \ + 2(\bar x_{i+} - \bar x_{++})(\bar x_{+ j} - \bar x_{++}) + 2(\bar x_{i+} - \bar x_{++})(x_{ij} - \bar x_{i+} - \bar x_{+ j} + \bar x_{++}) \\ & \ \ \ \ + 2(\bar x_{+ j} - \bar x_{++})(x_{ij} - \bar x_{i+} - \bar x_{+ j} + \bar x_{++}) \big] \\ ^{[1]} &= \sum\limits_{i=1}^{a}\sum\limits_{j=1}^{b}\big[ (\bar x_{i+} - \bar x_{++})^2 + (\bar x_{+ j} - \bar x_{++})^2 + (x_{ij} - \bar x_{i+} - \bar x_{+ j} + \bar x_{++})^2 + 0 + 0 + 0 \big] \\ &= \sum\limits_{i=1}^{a}\sum\limits_{j=1}^{b} (\bar x_{i+} - \bar x_{++})^2 + \sum\limits_{i=1}^{a}\sum\limits_{j=1}^{b} (\bar x_{+ j} - \bar x_{++})^2 + \sum\limits_{i=1}^{a}\sum\limits_{j=1}^{b} (x_{ij} - \bar x_{i+} - \bar x_{+ j} + \bar x_{++})^2 \\ &= b \sum\limits_{i=1}^{a} (\bar x_{i+} - \bar x_{++})^2 + a \sum\limits_{j=1}^{b} (\bar x_{+ j} - \bar x_{++})^2 + \sum\limits_{i=1}^{a}\sum\limits_{j=1}^{b} (x_{ij} - \bar x_{i+} - \bar x_{+ j} + \bar x_{++})^2 \end{aligned}\]

  1. The cross products are shown to equal zero in Section 39.3.3

These terms are commonly referred to as \[\begin{aligned} SS_{Factor} &= b \sum\limits_{i=1}^{a} (\bar x_{i+} - \bar x_{++})^2 \\ SS_{Block} &= a \sum\limits_{j=1}^{b} (\bar x_{+ j} - \bar x_{++})^2 \\ SS_{Error} &= \sum\limits_{i=1}^{a}\sum\limits_{j=1}^{b} (x_{ij} - \bar x_{i+} - \bar x_{+ j} + \bar x_{++})^2 \end{aligned}\]

See (Montgomery 2004, 126)
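As with the one-way design, the decomposition can be checked numerically; the sketch below uses a made-up \(a \times b\) table of RCBD observations (rows as treatments, columns as blocks) and confirms that \(SS_{Total} = SS_{Factor} + SS_{Block} + SS_{Error}\).

```python
import numpy as np

rng = np.random.default_rng(1)
a, b = 4, 5                          # a treatments, b blocks, one observation per cell
x = rng.normal(10, 2, size=(a, b))   # made-up RCBD data: rows = treatments, cols = blocks

grand_mean = x.mean()                    # x-bar_{++}
trt_means = x.mean(axis=1)[:, None]      # x-bar_{i+} as a column
blk_means = x.mean(axis=0)[None, :]      # x-bar_{+j} as a row

ss_total = np.sum((x - grand_mean) ** 2)
ss_factor = b * np.sum((trt_means - grand_mean) ** 2)
ss_block = a * np.sum((blk_means - grand_mean) ** 2)
ss_error = np.sum((x - trt_means - blk_means + grand_mean) ** 2)

assert np.isclose(ss_total, ss_factor + ss_block + ss_error)
```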

39.3.2 Computational Formulae

\(SS_{Total}\), \(SS_{Factor}\), and \(SS_{Block}\) can all be simplified for convenient computation.

\[\begin{aligned} SS_{Total} &= \sum\limits_{i=1}^{a}\sum\limits_{j=1}^{b}(x_{ij} - \bar x_{++})^2 \\ ^{[1]} &= \sum\limits_{i=1}^{a}\sum\limits_{j=1}^{b} x_{ij}^2 - \frac{x_{++}^2}{ab}\\ \\ SS_{Factor} &= b \sum\limits_{i=1}^{a} (\bar x_{i+} - \bar x_{++})^2 \\ ^{[1]} &= \frac{1}{b}\sum\limits_{i=1}^{a}x_{i+}^2 - \frac{x_{++}^2}{ab} \\ \\ SS_{Block} &= a \sum\limits_{j=1}^{b} (\bar x_{+ j} - \bar x_{++})^2 \\ ^{[1]} &= \frac{1}{a}\sum\limits_{j=1}^{b} x_{+ j}^2 - \frac{x_{++}^2}{ab} \end{aligned}\]

  1. See Theorem 43.3.1

\(SS_{Error}\) does not simplify to any convenient form, but may be calculated from the other terms as
\(SS_{Error} = SS_{Total} - SS_{Factor} - SS_{Block}\)
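A short numerical check of the shortcut forms, using the same made-up \(a \times b\) layout as the previous sketch:

```python
import numpy as np

rng = np.random.default_rng(1)
a, b = 4, 5
x = rng.normal(10, 2, size=(a, b))   # made-up RCBD data: rows = treatments, cols = blocks

x_pp = x.sum()                       # grand total x_{++}
x_ip = x.sum(axis=1)                 # treatment totals x_{i+}
x_pj = x.sum(axis=0)                 # block totals x_{+j}

# Computational (shortcut) forms
ss_total = np.sum(x ** 2) - x_pp ** 2 / (a * b)
ss_factor = np.sum(x_ip ** 2) / b - x_pp ** 2 / (a * b)
ss_block = np.sum(x_pj ** 2) / a - x_pp ** 2 / (a * b)
ss_error = ss_total - ss_factor - ss_block

# They agree with the definitional forms
gm = x.mean()
assert np.isclose(ss_total, np.sum((x - gm) ** 2))
assert np.isclose(ss_factor, b * np.sum((x.mean(axis=1) - gm) ** 2))
assert np.isclose(ss_block, a * np.sum((x.mean(axis=0) - gm) ** 2))
```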

39.3.3 RCBD Cross Products

The cross products of the RCBD decomposition sum to zero. Writing \(x_{i+}\), \(x_{+ j}\), and \(x_{++}\) for the treatment, block, and grand totals (so that \(\bar x_{i+} = x_{i+}/b\), \(\bar x_{+ j} = x_{+ j}/a\), and \(\bar x_{++} = x_{++}/(ab)\)), the claim is that

\[\begin{aligned} 2\sum\limits_{i=1}^{a}\sum\limits_{j=1}^{b} (\bar x_{i+}-\bar x_{++}) (\bar x_{+ j}-\bar x_{++}) & \\ \ \ \ \ + 2\sum\limits_{i=1}^{a}\sum\limits_{j=1}^{b} (\bar x_{i+}-\bar x_{++}) (x_{ij} - \bar x_{i+} - \bar x_{+ j} + \bar x_{++}) & \\ \ \ \ \ + 2\sum\limits_{i=1}^{a}\sum\limits_{j=1}^{b} (\bar x_{+ j}-\bar x_{++}) (x_{ij} - \bar x_{i+} - \bar x_{+ j} + \bar x_{++}) &= 0 \end{aligned}\]

Proof:

\[\begin{aligned} & 2\sum\limits_{i=1}^{a}\sum\limits_{j=1}^{b} (\bar x_{i+}-\bar x_{++}) (\bar x_{+ j}-\bar x_{++}) + 2\sum\limits_{i=1}^{a}\sum\limits_{j=1}^{b} (\bar x_{i+}-\bar x_{++}) (x_{ij} - \bar x_{i+} - \bar x_{+ j} + \bar x_{++}) \\ &\ \ \ \ + 2\sum\limits_{i=1}^{a}\sum\limits_{j=1}^{b} (\bar x_{+ j}-\bar x_{++}) (x_{ij} - \bar x_{i+} - \bar x_{+ j} + \bar x_{++})\\ =&\ 2\sum\limits_{i=1}^{a}\sum\limits_{j=1}^{b}\big[ (\bar x_{i+}-\bar x_{++}) (\bar x_{+ j}-\bar x_{++}) + (\bar x_{i+}-\bar x_{++}) (x_{ij} - \bar x_{i+} - \bar x_{+ j} + \bar x_{++}) \\ &\ \ \ \ + (\bar x_{+ j}-\bar x_{++}) (x_{ij} - \bar x_{i+} - \bar x_{+ j} + \bar x_{++}) \big] \\ =&\ 2\sum\limits_{i=1}^{a}\sum\limits_{j=1}^{b}\big[ \bar x_{i+}\bar x_{+ j} - \bar x_{i+}\bar x_{++} - \bar x_{+ j}\bar x_{++} + \bar x_{++}^2 \\ &\ \ \ \ + x_{ij}\bar x_{i+} - \bar x_{i+}^2 - \bar x_{i+}\bar x_{+ j} + \bar x_{i+}\bar x_{++} - x_{ij}\bar x_{++} + \bar x_{i+}\bar x_{++} + \bar x_{+ j}\bar x_{++} - \bar x_{++}^2 \\ &\ \ \ \ + x_{ij}\bar x_{+ j} - \bar x_{i+}\bar x_{+ j} - \bar x_{+ j}^2 + \bar x_{+ j}\bar x_{++} - x_{ij}\bar x_{++} + \bar x_{i+}\bar x_{++} + \bar x_{+ j}\bar x_{++} - \bar x_{++}^2 \big] \\ =&\ 2\sum\limits_{i=1}^{a}\sum\limits_{j=1}^{b}\big( -\bar x_{++}^2 - \bar x_{i+}^2 - \bar x_{+ j}^2 + x_{ij}\bar x_{i+} + x_{ij}\bar x_{+ j} - 2 x_{ij}\bar x_{++} - \bar x_{i+}\bar x_{+ j} + 2\bar x_{i+}\bar x_{++} + 2\bar x_{+ j}\bar x_{++} \big) \\ =&\ 2\bigg(-\sum\limits_{i=1}^{a}\sum\limits_{j=1}^{b}\bar x_{++}^2 - \sum\limits_{i=1}^{a}\sum\limits_{j=1}^{b}\bar x_{i+}^2 - \sum\limits_{i=1}^{a}\sum\limits_{j=1}^{b}\bar x_{+ j}^2 + \sum\limits_{i=1}^{a}\sum\limits_{j=1}^{b}x_{ij}\bar x_{i+} + \sum\limits_{i=1}^{a}\sum\limits_{j=1}^{b}x_{ij}\bar x_{+ j} \\ &\ \ \ \ - \sum\limits_{i=1}^{a}\sum\limits_{j=1}^{b}2 x_{ij}\bar x_{++} - \sum\limits_{i=1}^{a}\sum\limits_{j=1}^{b}\bar x_{i+}\bar x_{+ j} + \sum\limits_{i=1}^{a}\sum\limits_{j=1}^{b}2\bar x_{i+}\bar x_{++} + \sum\limits_{i=1}^{a}\sum\limits_{j=1}^{b}2\bar x_{+ j}\bar x_{++} \bigg) \\ =&\ 2\bigg( -\frac{ab\,x_{++}^2}{a^2b^2} - \frac{b}{b^2}\sum\limits_{i=1}^{a} x_{i+}^2 - \frac{a}{a^2}\sum\limits_{j=1}^{b} x_{+ j}^2 + \sum\limits_{i=1}^{a}\sum\limits_{j=1}^{b}x_{ij}\bar x_{i+} + \sum\limits_{i=1}^{a}\sum\limits_{j=1}^{b}x_{ij}\bar x_{+ j}\\ &\ \ \ \ - \frac{2 x_{++}^2}{ab} - \sum\limits_{i=1}^{a}\sum\limits_{j=1}^{b}\bar x_{i+}\bar x_{+ j} + \sum\limits_{i=1}^{a}\sum\limits_{j=1}^{b}2\bar x_{i+}\bar x_{++} + \sum\limits_{i=1}^{a}\sum\limits_{j=1}^{b}2\bar x_{+ j}\bar x_{++} \bigg)\\ ^{[1]} =&\ 2\bigg( -\frac{x_{++}^2}{ab} - \frac{1}{b}\sum\limits_{i=1}^{a} x_{i+}^2 - \frac{1}{a}\sum\limits_{j=1}^{b} x_{+ j}^2 + \frac{1}{b}\sum\limits_{i=1}^{a} x_{i+}^2 + \sum\limits_{i=1}^{a}\sum\limits_{j=1}^{b}x_{ij}\bar x_{+ j}\\ &\ \ \ \ - \frac{2 x_{++}^2}{ab} - \sum\limits_{i=1}^{a}\sum\limits_{j=1}^{b}\bar x_{i+}\bar x_{+ j} + \sum\limits_{i=1}^{a}\sum\limits_{j=1}^{b}2\bar x_{i+}\bar x_{++} + \sum\limits_{i=1}^{a}\sum\limits_{j=1}^{b}2\bar x_{+ j}\bar x_{++} \bigg)\\ ^{[2]} =&\ 2\bigg( -\frac{x_{++}^2}{ab} - \frac{1}{b}\sum\limits_{i=1}^{a} x_{i+}^2 - \frac{1}{a}\sum\limits_{j=1}^{b} x_{+ j}^2 + \frac{1}{b}\sum\limits_{i=1}^{a} x_{i+}^2 + \frac{1}{a}\sum\limits_{j=1}^{b} x_{+ j}^2\\ &\ \ \ \ - \frac{2 x_{++}^2}{ab} - \sum\limits_{i=1}^{a}\sum\limits_{j=1}^{b}\bar x_{i+}\bar x_{+ j} + \sum\limits_{i=1}^{a}\sum\limits_{j=1}^{b}2\bar x_{i+}\bar x_{++} + \sum\limits_{i=1}^{a}\sum\limits_{j=1}^{b}2\bar x_{+ j}\bar x_{++} \bigg)\\ ^{[3]} =&\ 2\bigg( -\frac{x_{++}^2}{ab} - \frac{1}{b}\sum\limits_{i=1}^{a} x_{i+}^2 - \frac{1}{a}\sum\limits_{j=1}^{b} x_{+ j}^2 + \frac{1}{b}\sum\limits_{i=1}^{a} x_{i+}^2 + \frac{1}{a}\sum\limits_{j=1}^{b} x_{+ j}^2\\ &\ \ \ \ - \frac{2 x_{++}^2}{ab} - \frac{x_{++}^2}{ab} + \sum\limits_{i=1}^{a}\sum\limits_{j=1}^{b}2\bar x_{i+}\bar x_{++} + \sum\limits_{i=1}^{a}\sum\limits_{j=1}^{b}2\bar x_{+ j}\bar x_{++} \bigg) \\ ^{[4]} =&\ 2\bigg( -\frac{x_{++}^2}{ab} - \frac{1}{b}\sum\limits_{i=1}^{a} x_{i+}^2 - \frac{1}{a}\sum\limits_{j=1}^{b} x_{+ j}^2 + \frac{1}{b}\sum\limits_{i=1}^{a} x_{i+}^2 + \frac{1}{a}\sum\limits_{j=1}^{b} x_{+ j}^2\\ &\ \ \ \ - \frac{2 x_{++}^2}{ab} - \frac{x_{++}^2}{ab} + \frac{2 x_{++}^2}{ab} + \sum\limits_{i=1}^{a}\sum\limits_{j=1}^{b}2\bar x_{+ j}\bar x_{++} \bigg)\\ ^{[5]} =&\ 2\bigg( -\frac{x_{++}^2}{ab} - \frac{1}{b}\sum\limits_{i=1}^{a} x_{i+}^2 - \frac{1}{a}\sum\limits_{j=1}^{b} x_{+ j}^2 + \frac{1}{b}\sum\limits_{i=1}^{a} x_{i+}^2 + \frac{1}{a}\sum\limits_{j=1}^{b} x_{+ j}^2\\ &\ \ \ \ - \frac{2 x_{++}^2}{ab} - \frac{x_{++}^2}{ab} + \frac{2 x_{++}^2}{ab} + \frac{2 x_{++}^2}{ab} \bigg)\\ =&\ 2\bigg(\frac{4 x_{++}^2}{ab} - \frac{4 x_{++}^2}{ab} + \frac{1}{b}\sum\limits_{i=1}^{a} x_{i+}^2 - \frac{1}{b}\sum\limits_{i=1}^{a} x_{i+}^2 + \frac{1}{a}\sum\limits_{j=1}^{b} x_{+ j}^2 - \frac{1}{a}\sum\limits_{j=1}^{b} x_{+ j}^2 \bigg)\\ =&\ 2(0 + 0 + 0) \\ =&\ 2(0) \\ =&\ 0 \end{aligned}\]

  1. See Summation Theorem 38.1.7
  2. See Summation Theorem 38.1.8
  3. See Summation Theorem 38.1.4
  4. See Summation Theorem 38.1.5
  5. See Summation Theorem 38.1.6

Using the theorems in Chapter @ref{summation-chapter}, it can be shown that each of the three cross products is individually equal to zero. However, the tedium of reducing each cross product separately is much greater than that of the combined approach taken above.
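This stronger claim, that each cross product vanishes on its own, can also be checked numerically; a minimal sketch with made-up balanced data:

```python
import numpy as np

rng = np.random.default_rng(2)
a, b = 4, 5
x = rng.normal(size=(a, b))       # made-up RCBD data: rows = treatments, cols = blocks

gm = x.mean()                     # x-bar_{++}
rm = x.mean(axis=1)[:, None]      # x-bar_{i+} as a column
cm = x.mean(axis=0)[None, :]      # x-bar_{+j} as a row
resid = x - rm - cm + gm          # x_ij - x-bar_{i+} - x-bar_{+j} + x-bar_{++}

# Each cross product is zero on its own, not just their sum
cp1 = 2 * np.sum((rm - gm) * (cm - gm))
cp2 = 2 * np.sum((rm - gm) * resid)
cp3 = 2 * np.sum((cm - gm) * resid)

print(cp1, cp2, cp3)              # all three are zero up to rounding error
assert np.allclose([cp1, cp2, cp3], 0.0)
```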

References

Montgomery, Douglas C. 2004. Design and Analysis of Experiments. 5th ed. John Wiley & Sons, Inc.