ANOVA
One-Way
Analysis of Variance (ANOVA) is used to compare the means of two or more samples against each other. The calculation determines whether it is likely that the samples came from populations with the same mean. This is similar to a 2-Sample t-Test, except that three or more samples can be examined with ANOVA.
ANOVA can also be used to examine multiple variables and levels at the same time, but the focus here is on One-Way ANOVA, which examines just one variable with multiple levels.
For example, a team might need to determine whether 3 operators perform differently:
• The single variable is the Operator
• It has 3 levels, the 3 Operators
• The metric is the amount of time to perform a task, measured several times for each operator
For example, measure 15 task times for each operator, then use ANOVA to judge whether all the operators' average (mean) task times are the same.
Level of Confidence
You will need to determine the level of confidence, such as 90% or 95%, for the calculation. This depends on the level of certainty you require from the analysis of variance.
Explanation of ANOVA
This is shown graphically in the figure below. The upper curves represent the distributions of the three operators' times (known as the populations). The exact nature of these distributions is unknown to the team, because they represent all data points for all time. However, the team can see the distributions of the samples, shown as the lower curves.
ANOVA examines the sample data with the aim of making an inference about the location of the population means (μ) relative to each other. It does this by breaking down the variation (using variances) in all the sample data into separate pieces, hence the name Analysis of Variance. ANOVA compares the size of the variation between the samples with the variation within the samples.
Graphical representation of ANOVA.
If the variation between the samples is large relative to the variation within the samples, then the sample means are spread widely (between) compared with the background noise (within). This would imply that the means of the parent distributions are different.
If the between variation is not large compared with the within variation, then it is likely that the means of the parent distributions are about the same. More specifically, the test cannot distinguish between them.
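The comparison of between-sample and within-sample variation can be sketched in a few lines of Python. This is a minimal illustration, not the output of any particular statistics package; the function name one_way_f and the task times are hypothetical, made up for the example.

```python
from statistics import mean

def one_way_f(groups):
    """Return (between, within, ratio) mean squares for a list of samples."""
    k = len(groups)                        # number of levels
    n_total = sum(len(g) for g in groups)  # total number of data points
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [mean(g) for g in groups]

    # Between: how far each sample mean sits from the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Within: how far each point sits from its own sample mean
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between, ms_within, ms_between / ms_within

# Hypothetical task times (illustrative values only)
operator_a = [24.1, 25.3, 24.8, 25.0, 24.6]
operator_b = [25.2, 25.9, 24.9, 25.6, 25.4]
operator_c = [27.0, 26.5, 27.4, 26.8, 27.3]

between, within, ratio = one_way_f([operator_a, operator_b, operator_c])
print(f"between = {between:.3f}, within = {within:.3f}, ratio = {ratio:.2f}")
```

A ratio well above 1 means the sample means are spread further apart than the scatter inside each sample would explain.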
The result of the test is a number called the p-value, where p stands for probability. The p-value is the probability of seeing differences between the sample means at least this large if the populations really had the same mean. A high p-value means the test cannot distinguish the population means; it does not prove that they are equal. A low p-value tells us the populations are significantly different. In our example, if the p-value is low, then at least one of the mean operator times is distinguishable from the others; if the p-value is high, none of them is distinguishable from the others.
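One way to build intuition for the p-value is a permutation test: shuffle the operator labels many times and count how often the shuffled data looks at least as separated as the real data. This is a stand-in technique chosen for illustration, not the reference-distribution lookup a statistics package performs, and the function names, seed, and task times below are hypothetical.

```python
import random
from statistics import mean

def f_ratio(groups):
    """Between-sample variance divided by within-sample variance."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand = sum(map(sum, groups)) / n_total
    means = [mean(g) for g in groups]
    ms_between = sum(len(g) * (m - grand) ** 2
                     for g, m in zip(groups, means)) / (k - 1)
    ms_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g) / (n_total - k)
    return ms_between / ms_within

def permutation_p_value(groups, n_shuffles=2000, seed=0):
    """Share of random relabellings whose ratio is at least the observed one."""
    rng = random.Random(seed)          # fixed seed so the sketch is repeatable
    observed = f_ratio(groups)
    pooled = [x for g in groups for x in g]
    sizes = [len(g) for g in groups]
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)            # pretend the operator labels mean nothing
        shuffled, start = [], 0
        for size in sizes:
            shuffled.append(pooled[start:start + size])
            start += size
        if f_ratio(shuffled) >= observed:
            hits += 1
    return hits / n_shuffles

# Hypothetical task times (illustrative values only)
operator_a = [24.1, 25.3, 24.8, 25.0, 24.6]
operator_b = [25.2, 25.9, 24.9, 25.6, 25.4]
operator_c = [27.0, 26.5, 27.4, 26.8, 27.3]

p_value = permutation_p_value([operator_a, operator_b, operator_c])
print("approximate p-value:", p_value)
print("distinguishable at 95% confidence:", p_value < 0.05)
```

If shuffling the labels almost never produces a ratio as large as the observed one, the p-value is low and the operators are distinguishable.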
Roadmap
The roadmap of the test analysis itself is shown graphically in the figure below.
One-Way ANOVA Roadmap.
Roadmap adapted from SBTI's Process Improvement Methodology training material.
Step 1. Identify the metric and levels to be examined (for example, three operators). Make sure the metric is well defined and understood by the team.
Step 2. Determine the sample size. Use a sample size calculator.
Step 3. Collect the sample data set, one sample from each level of the variable. Follow the rules of good experimentation. If the sample size calculator determined a sample size of ten data points, then ten points need to be collected for each and every level. For example, if the variable is operator and there are three levels (three operators), then 3 x 10 = 30 data points are collected in total.
Step 4. Examine the stability of all sample data sets using a Control Chart for each, typically an Individuals and Moving Range Chart (I-MR). A Control Chart identifies whether the processes are stable. This is important: if the processes are not stable, then the study can give an incorrect answer.
Step 5. Examine the normality of the sample data sets using a Normality Test for each.
Step 6. Perform the ANOVA if all of the sample data sets were determined to be normal in Step 5.
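The stability and normality checks in Steps 4 and 5 can be sketched as follows. This is a rough illustration under stated assumptions: the I-MR limits use the standard constants for moving ranges of two points (2.66 and 3.267), and the normality check uses the Anderson-Darling statistic as one common choice, since the roadmap does not name a specific test. The function names and task times are hypothetical.

```python
from math import log
from statistics import NormalDist, mean, stdev

def imr_limits(values):
    """Control limits for an Individuals and Moving Range (I-MR) chart."""
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    center, mr_bar = mean(values), mean(moving_ranges)
    return {
        "center": center,
        "ucl": center + 2.66 * mr_bar,  # 2.66 = 3 / 1.128 (d2 for ranges of two)
        "lcl": center - 2.66 * mr_bar,
        "mr_ucl": 3.267 * mr_bar,       # D4 constant for ranges of two
    }

def points_outside_limits(values):
    """Indices of individual values beyond the control limits."""
    limits = imr_limits(values)
    return [i for i, x in enumerate(values)
            if not limits["lcl"] <= x <= limits["ucl"]]

def anderson_darling(values):
    """Adjusted Anderson-Darling statistic, mean and sd estimated from the data."""
    n = len(values)
    fitted = NormalDist(mean(values), stdev(values))
    cdf = [fitted.cdf(x) for x in sorted(values)]
    total = sum((2 * i + 1) * (log(cdf[i]) + log(1 - cdf[n - 1 - i]))
                for i in range(n))
    a_squared = -n - total / n
    return a_squared * (1 + 0.75 / n + 2.25 / n ** 2)

# Hypothetical task times for one operator (illustrative values only)
times = [25.1, 24.7, 25.4, 24.9, 25.2, 24.8, 25.0, 25.3, 24.6, 25.1]

print("limits:", imr_limits(times))
print("points outside limits:", points_outside_limits(times))
# Assumed rule of thumb: an adjusted statistic below about 0.752 is
# consistent with normality at the 5% level.
print("Anderson-Darling:", round(anderson_darling(times), 3))
```

Each operator's sample would be checked separately; a sample with points outside the limits, or with a statistic above the critical value, would call for investigation before running the ANOVA.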
ANOVA Calculations
It is beyond the scope of this page to show the calculations in full. We recommend using Minitab to conduct them easily. Below we discuss how to interpret the results of the calculations.
Interpreting the output
This test calculates a ratio of the signal (variation due to the variable, the "between") relative to the noise (any other variation not due to the variable, the "within"). If the signal-to-noise ratio is large enough, it is considered unlikely to have occurred purely by random chance, and the variable is thus considered statistically significant.
This is achieved by looking up the signal-to-noise ratio in a reference distribution (the F distribution used by the F-Test), which returns a p-value. The p-value represents the likelihood that an effect this large could have occurred purely by random chance even if the populations were the same.
Based on the p-values, statements can generally be formed as follows:
Based on the data, I can say that at least one of the means is different, and there is a (p-value) chance of seeing a difference this large if the means were really the same.
Or: based on the data, I can say that there is a statistically significant effect due to this X, and there is a (p-value) chance of an effect this large arising purely by chance.
Example output from an ANOVA
ANOVA results for a comparison of samples of Bob's vs Jane's vs Walt's performance (output from Minitab v14).
From the first table in the results:
The average variation due to Operator (between variation, or the signal) was 40.193 units.
The average variation due to Error (within variation, or the noise) was 0.898 units.
The signal-to-noise ratio is therefore 40.193 ÷ 0.898 = 44.76.
The likelihood of seeing a signal-to-noise ratio this large (if the populations were perfectly aligned) is reported as 0.000 (the p-value). This is well below 0.05 (the threshold for 95% confidence). You can conclude that at least one of the trio is performing significantly differently from the others.
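The arithmetic in the first table can be checked with a short sketch. It assumes the 30 data points per operator reported for this study, which gives 90 - 3 = 87 degrees of freedom for the noise. The helper p_value_three_levels is a hypothetical name, and its closed form is valid only when there are exactly three levels; with more levels a statistics package or F table is needed.

```python
def p_value_three_levels(f_ratio, df_within):
    """Upper-tail F probability when the numerator has 2 degrees of freedom.

    With exactly three levels (numerator df = 2) the F tail area has the
    closed form (1 + 2F / df_within) ** (-df_within / 2).
    """
    return (1 + 2 * f_ratio / df_within) ** (-df_within / 2)

ms_between = 40.193   # variation due to Operator (signal)
ms_within = 0.898     # variation due to Error (noise)
operators, points_each = 3, 30

f_ratio = ms_between / ms_within
df_within = operators * points_each - operators   # 90 - 3 = 87
p_value = p_value_three_levels(f_ratio, df_within)

print(f"signal-to-noise ratio = {f_ratio:.2f}")
print(f"p-value = {p_value:.3e}")
print("significant at 95% confidence:", p_value < 0.05)
```

The p-value that comes out is far smaller than three decimal places can show, which is why the software output displays it as 0.000.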
A sample of 30 data points was taken for each operator.
Bob's sample mean is 24.848, Jane's is 25.446, and Walt's is 27.084.
Bob's sample standard deviation is 0.869, Jane's is 0.988, and Walt's is 0.981.
The text graph shows the 95% confidence intervals for the locations of the population means for each of the trio.
The p-value of 0.000 in the upper table indicates that at least one of the trio is performing differently from the other two. In the bottom table, the 95% confidence interval for Walt's performance does not overlap with those of the other two; therefore, it is clearly Walt who has a different mean.
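The overlap check on the confidence intervals can be approximated from the summary statistics alone. This is a sketch under two assumptions that are not confirmed by the source: the intervals are based on a pooled standard deviation across the three operators, and a normal quantile (about 1.96) stands in for the t quantile with 87 degrees of freedom, so the limits may differ slightly from the software output.

```python
from math import sqrt
from statistics import NormalDist

# Sample size, mean and standard deviation per operator, as quoted above
summary = {
    "Bob":  (30, 24.848, 0.869),
    "Jane": (30, 25.446, 0.988),
    "Walt": (30, 27.084, 0.981),
}

# Pooled variance across operators; it should land close to the
# within variation of 0.898 from the first table
df_within = sum(n - 1 for n, _, _ in summary.values())
pooled_var = sum((n - 1) * sd ** 2 for n, _, sd in summary.values()) / df_within

# Assumption: normal quantile in place of the t quantile for 87 df
z = NormalDist().inv_cdf(0.975)

intervals = {}
for name, (n, avg, _) in summary.items():
    half_width = z * sqrt(pooled_var / n)
    intervals[name] = (avg - half_width, avg + half_width)

def overlaps(a, b):
    """True when two (low, high) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]

for name, (low, high) in intervals.items():
    print(f"{name}: {low:.3f} to {high:.3f}")
print("Bob/Jane overlap:", overlaps(intervals["Bob"], intervals["Jane"]))
print("Jane/Walt overlap:", overlaps(intervals["Jane"], intervals["Walt"]))
print("Bob/Walt overlap:", overlaps(intervals["Bob"], intervals["Walt"]))
```

Overlap between intervals is a quick visual guide rather than a formal test; a multiple-comparison procedure would be the more rigorous way to confirm which operator differs.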