ANOVA

    One-Way Analysis of Variance (ANOVA) is used to compare the means of two or more samples against each other to determine whether it is likely that the samples could come from populations with the same mean. It is similar to a 2-Sample t-Test, except that ANOVA can also examine three or more samples.

    ANOVA can also be used to examine multiple Xs at the same time, but the focus here is primarily on One-Way ANOVA, which examines just one X. For example, a Team might need to determine whether three operators take the same amount of time to perform a task. In this case there is

    - A single X: Operator

    - With 3 levels: the 3 operators

          A data sample would be taken, for example 15 points (times in this case) for each operator. ANOVA is then used to judge whether all the operators' average (mean) task times are the same.

          The level of confidence in the answer depends on how far apart the means of the samples are, how much variability there is in the sample data, and how many data points there are.

          This is shown graphically in the figure below. The upper curves represent the distributions of all three operators' times (known as the populations). The exact nature of each of these distributions is unknown to the Team, because they represent all data points for all time. What the Team can see, however, are the samples taken, one from each population, shown as the lower curves.

          ANOVA examines the sample data with the aim of making an inference on the location of the population means (μ) relative to each other. It does this by breaking down the variation (using variances) in all the sample data into separate pieces, hence the name Analysis Of Variance.

          ANOVA compares the size of the variation between the samples with the variation within the samples.

          Graphical representation of ANOVA.



          If the variation between the samples is large relative to the variation within the samples, then it means the samples are spread widely (between) compared with the background noise (within), and this would imply that the likelihood of the means of the parent distributions being aligned is low.

          If the between variation is not large compared with the within variation, then it is likely that the means of the parent distributions are about the same, or more specifically that the test cannot distinguish between them.
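          The between-versus-within comparison can be sketched in a few lines of Python. This is a minimal illustration rather than the output of any statistical package; the task times are invented for the example, and the function name is hypothetical.

```python
from statistics import mean


def variation_breakdown(samples):
    """Split the total variation into between-sample and within-sample parts."""
    all_points = [x for sample in samples for x in sample]
    grand_mean = mean(all_points)

    # Between: how far each sample mean sits from the grand mean.
    ss_between = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in samples)
    # Within: how far each point sits from its own sample mean.
    ss_within = sum((x - mean(s)) ** 2 for s in samples for x in s)
    ss_total = sum((x - grand_mean) ** 2 for x in all_points)

    # Dividing by the degrees of freedom gives the average variation of each kind.
    ms_between = ss_between / (len(samples) - 1)
    ms_within = ss_within / (len(all_points) - len(samples))
    return {
        "ss_between": ss_between,
        "ss_within": ss_within,
        "ss_total": ss_total,
        "ratio": ms_between / ms_within,
    }


# Hypothetical task times (minutes) for three operators.
operator_a = [24.1, 25.3, 24.8, 25.0, 24.6]
operator_b = [25.2, 25.9, 25.1, 25.6, 25.4]
operator_c = [27.3, 26.8, 27.5, 26.9, 27.1]

breakdown = variation_breakdown([operator_a, operator_b, operator_c])
print(breakdown)
```

          The between and within pieces add up to the total variation, which is what allows the analysis to attribute a share of the variation to the X.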

          The result of the test is a p-value: the probability of seeing sample means spread at least this far apart if the samples really did come from populations with the same mean. In practical terms, the p-value indicates how plausible it is that the mean operator times are the same going forward.

          If the p-value is low, then at least one of the mean operator times is distinguishable from the others; if the p-value is high, the test cannot distinguish between them.

          Roadmap

          The roadmap of the test analysis itself is shown graphically in the figure below.

          One-Way ANOVA Roadmap.


          Roadmap adapted from SBTI's Process Improvement Methodology training material.

          Step 1.

          Identify the metric and levels to be examined (for example, three operators). Analysis of this kind should be done in the Analyze Phase, so the metric should be well defined and understood at this point.

          Step 2.

          Determine the sample size. This can be as simple as taking the suggested 15 to 20 data points per level or using a sample size calculator in a statistical package. These rely on an equation relating the sample size to

          • The standard deviations (the spread of the data) of each population. This would have to be approximated from historical data.

          • The required power of the test (the likelihood of the test identifying a difference between the means if there truly was one). This is usually set at 0.8 or 80%.

          • The size of the difference δ between the means that is desired to be detected, that is, the distance between the means that would lead the Team to say that the two values are different.

          • The alpha level for the test (the likelihood of the test giving a false positive). This is usually set at 0.05 or 5% and represents the cutoff for the p-value (remember: if p is low, H0 must go).

          • The number of levels examined (number of Operators, and so on).
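          As a rough illustration of how these inputs interact, the sketch below uses the familiar normal-approximation formula for comparing two means, n = 2((z_alpha + z_power) × σ / δ)². This is an assumption made for illustration, not the equation a statistical package uses for ANOVA: it ignores the number of levels, so a proper sample size calculator should be preferred for real work. The function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def approx_points_per_level(sigma, delta, alpha=0.05, power=0.80):
    """Two-sample normal approximation to the data points needed per level.

    sigma: assumed common standard deviation (from historical data)
    delta: smallest difference between means worth detecting
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided alpha cutoff
    z_power = NormalDist().inv_cdf(power)          # required power
    return ceil(2 * ((z_alpha + z_power) * sigma / delta) ** 2)


# Detecting a difference of one standard deviation at the usual settings.
n_per_level = approx_points_per_level(sigma=1.0, delta=1.0)
print(n_per_level, "points per level,", 3 * n_per_level, "in total for 3 operators")

# Halving the difference to be detected roughly quadruples the sample size.
print(approx_points_per_level(sigma=1.0, delta=0.5))
```

          Under these assumptions, detecting a one-standard-deviation difference lands close to the suggested 15 to 20 points per level.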

          Step 3.

          Collect a sample data set, one from each level of the X following the rules of good experimentation. If the sample size calculator determined a sample size of ten data points, then ten points need to be collected for each and every level. For example, if the X is Operator and there are three levels (three operators), then 3 x 10 = 30 data points are collected in total.

          Step 4.

          Examine stability of all sample data sets using a Control Chart for each, typically an Individuals and Moving Range Chart (I-MR). A Control Chart identifies whether the processes are stable, having

          • Constant mean (from the Individuals Chart)

          • Predictable variability (from the Moving Range Chart)

          This is important; if the processes are moving around, it is impossible to sensibly decide if they are the same or not.
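          A minimal sketch of the I-MR calculation follows, using the standard chart constants for a moving range of two consecutive points (2.66 for the Individuals limits and 3.267 for the Moving Range upper limit). The data and function names are hypothetical, and only the simplest rule (a point beyond the control limits) is checked; statistical packages apply additional run rules.

```python
from statistics import mean


def imr_limits(values):
    """Control limits for an Individuals and Moving Range chart."""
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    center = mean(values)
    mr_bar = mean(moving_ranges)
    limits = {
        "i_lcl": center - 2.66 * mr_bar,
        "i_center": center,
        "i_ucl": center + 2.66 * mr_bar,
        "mr_center": mr_bar,
        "mr_ucl": 3.267 * mr_bar,
    }
    return limits, moving_ranges


def points_beyond_limits(values):
    """Indices of individual values and moving ranges outside their limits."""
    limits, moving_ranges = imr_limits(values)
    bad_individuals = [
        i for i, x in enumerate(values)
        if x < limits["i_lcl"] or x > limits["i_ucl"]
    ]
    bad_ranges = [i for i, r in enumerate(moving_ranges) if r > limits["mr_ucl"]]
    return bad_individuals, bad_ranges


# Hypothetical task times for one operator, in the order they were collected.
times = [24.1, 25.3, 24.8, 25.0, 24.6, 25.2, 24.9, 25.1, 24.7, 25.0]
limits, _ = imr_limits(times)
flagged = points_beyond_limits(times)
print(limits)
print(flagged)
```

          The chart is built once per operator, so with three operators the check is repeated three times before moving on.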

          Step 5.

          Examine normality of the sample data sets using a Normality Test for each.
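          The source does not say which Normality Test is used. As one possibility, the sketch below computes the Anderson-Darling statistic against a normal distribution fitted to the sample, applies the usual small-sample adjustment, and compares it with 0.752, the commonly tabulated 5% critical value when the mean and standard deviation are estimated from the data. The samples and function name are hypothetical.

```python
from math import log
from statistics import NormalDist, mean, stdev


def anderson_darling_normal(values):
    """Adjusted Anderson-Darling statistic for a normal fit to the sample."""
    n = len(values)
    ordered = sorted(values)
    fitted = NormalDist(mean(ordered), stdev(ordered))
    cdf = [fitted.cdf(x) for x in ordered]
    # Weighted sum over the ordered points, pairing each point with its mirror.
    total = sum(
        (2 * i + 1) * (log(cdf[i]) + log(1 - cdf[n - 1 - i]))
        for i in range(n)
    )
    a_squared = -n - total / n
    # Small-sample adjustment used when mean and sd come from the sample.
    return a_squared * (1 + 0.75 / n + 2.25 / n ** 2)


CRITICAL_5_PERCENT = 0.752  # assumed tabulated value; larger means "not normal"

# An idealised bell-shaped sample built from normal quantiles (hypothetical).
bell_shaped = [NormalDist(25, 1).inv_cdf((i + 0.5) / 15) for i in range(15)]
# A heavily right-skewed sample (hypothetical).
skewed = [2.0 ** i for i in range(15)]

for name, sample in (("bell-shaped", bell_shaped), ("skewed", skewed)):
    statistic = anderson_darling_normal(sample)
    verdict = "reject normality" if statistic > CRITICAL_5_PERCENT else "no evidence against normality"
    print(name, round(statistic, 3), verdict)
```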

          Step 6.

          Perform a Test of Equal Variance on the sample data sets. ANOVA requires the variances of the samples to be approximately the same, and without this, a medians-based approach has to be used instead.

          The Test of Equal Variance uses the sample data sets and has these hypotheses:

          • H0: Population (process) σ₁² = σ₂² = σ₃² = ... (all variances equal)

          • Ha: At least one of the Population (process) variances is different
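          The source does not name the specific test. One common choice is a Levene-type test centred on the median (often called Brown-Forsythe), which runs an ANOVA on each point's absolute distance from its own sample median. The sketch below assumes exactly three samples, because with two numerator degrees of freedom the F tail probability has a simple closed form; the data and function names are hypothetical.

```python
from statistics import mean, median


def f_ratio(groups):
    """Between/within mean-square ratio and the within degrees of freedom."""
    all_points = [x for g in groups for x in g]
    grand = mean(all_points)
    df_between = len(groups) - 1
    df_within = len(all_points) - len(groups)
    ms_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups) / df_between
    ms_within = sum((x - mean(g)) ** 2 for g in groups for x in g) / df_within
    return ms_between / ms_within, df_within


def equal_variance_test_three_samples(groups):
    """Median-centred Levene-type test; returns (statistic, p-value)."""
    if len(groups) != 3:
        raise ValueError("the closed-form p-value below assumes three samples")
    # Replace each point by its absolute distance from its sample median.
    spreads = [[abs(x - median(g)) for x in g] for g in groups]
    w, df_within = f_ratio(spreads)
    # Exact upper tail of the F distribution when the numerator df is 2.
    p_value = (1 + 2 * w / df_within) ** (-df_within / 2)
    return w, p_value


# Hypothetical task times for three operators with similar spread.
operator_a = [24.1, 25.3, 24.8, 25.0, 24.6]
operator_b = [25.2, 25.9, 25.1, 25.6, 25.4]
operator_c = [27.3, 26.8, 27.5, 26.9, 27.1]

w, p_value = equal_variance_test_three_samples([operator_a, operator_b, operator_c])
print(round(w, 3), round(p_value, 3))
```

          A p-value above 0.05 here means H0 (equal variances) is not rejected, so the ANOVA in Step 7 can proceed.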

          Step 7.

          Perform the ANOVA if all of the sample data sets were determined to be normal in Step 5 and the variances were equal in Step 6. The hypotheses in this case are

          • H0: Population (process) μ₁ = μ₂ = μ₃ = ... (all means equal)

          • Ha: At least one of the Population (process) means is different

          If any of the sample data sets were not normal in Step 5, the options are to

          • Continue unabated with the ANOVA if the sample size is large enough (>25)

          • Transform the data first and then perform the analysis, again using the ANOVA

          • Perform the median-based equivalent test, a Kruskal-Wallis or Mood's Median Test
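          For the rank-based route, a minimal Kruskal-Wallis sketch is shown below. It assumes exactly three samples, so that the chi-square reference distribution has 2 degrees of freedom and its upper tail is simply exp(-H/2); that chi-square lookup is itself an approximation for small samples. The data and function names are hypothetical.

```python
from math import exp


def average_ranks(values):
    """Rank values from 1 upward; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    tie_sizes = []
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        tie_sizes.append(end - start + 1)
        start = end + 1
    return ranks, tie_sizes


def kruskal_wallis_three_samples(groups):
    """Kruskal-Wallis H statistic (tie-corrected) and approximate p-value."""
    if len(groups) != 3:
        raise ValueError("the closed-form p-value below assumes three samples")
    pooled = [x for g in groups for x in g]
    n_total = len(pooled)
    ranks, tie_sizes = average_ranks(pooled)

    rank_term = 0.0
    offset = 0
    for g in groups:
        rank_sum = sum(ranks[offset:offset + len(g)])
        rank_term += rank_sum ** 2 / len(g)
        offset += len(g)

    h = 12 / (n_total * (n_total + 1)) * rank_term - 3 * (n_total + 1)
    tie_correction = 1 - sum(t ** 3 - t for t in tie_sizes) / (n_total ** 3 - n_total)
    h /= tie_correction
    p_value = exp(-h / 2)  # chi-square upper tail with 2 degrees of freedom
    return h, p_value


# Hypothetical task times for three operators.
operator_a = [24.1, 25.3, 24.8, 25.0, 24.6]
operator_b = [25.2, 25.9, 25.1, 25.6, 25.4]
operator_c = [27.3, 26.8, 27.5, 26.9, 27.1]

h, p_value = kruskal_wallis_three_samples([operator_a, operator_b, operator_c])
print(round(h, 2), round(p_value, 4))
```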

          Interpreting the output

          The ANOVA calculates a ratio of the signal (variation due to the X, the "between") relative to the noise (any other variation not due to the X, the "within"). If the signal-to-noise ratio is large enough, it is unlikely to have occurred purely by random chance, and the X is thus considered statistically significant.

          This is achieved by looking up the signal-to-noise ratio in a reference distribution (F-Test), which returns a p-value. The p-value represents the likelihood that an effect this large could have occurred purely by random chance even if the populations were aligned.
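          For the three-operator case the lookup can be written out directly. With three levels the numerator has 2 degrees of freedom, and the upper tail of the F distribution then reduces to (1 + 2F / df_within)^(-df_within / 2). The sketch below relies on that special case only and is not a general F-Test; the function name is hypothetical.

```python
def f_test_p_value_three_levels(f_ratio, df_within):
    """P(F >= f_ratio) when the X has three levels (numerator df = 2)."""
    return (1 + 2 * f_ratio / df_within) ** (-df_within / 2)


# Sanity check against a standard F-table entry: for 2 and 12 degrees of
# freedom the 5% critical value is about 3.885.
print(f_test_p_value_three_levels(3.885, 12))

# The signal-to-noise ratio from the example output later in this article:
# 44.76, with 3 operators x 30 points giving 90 - 3 = 87 within df.
p_value = f_test_p_value_three_levels(44.76, 87)
print(p_value)
```

          The second result is far smaller than 0.001, which is why a package rounding to three decimal places reports it as 0.000.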

          Based on the p-values, statements can be generally formed as follows:

          • Based on the data, I can say that at least one of the means is different, and there is a (p-value) chance that I am wrong
          • Or, based on the data, I can say that there is an important effect due to this X, and there is a (p-value) chance that the result is just due to chance

          Example output from an ANOVA

          ANOVA results for a comparison of samples of Bob's vs Jane's vs Walt's performance (output from Minitab v14).

          From the first table in the results:

          • The average variation due to Operator was 40.193 units
          • The average variation due to Error (everything else not including Operator) was 0.898 units
          • The signal-to-noise ratio is therefore 40.193 ÷ 0.898 = 44.76
          • The likelihood of seeing a signal-to-noise ratio this large (if the populations were perfectly aligned) is reported as 0.000 (the p-value), which is well below 0.05; the conclusion is therefore that at least one of the trio is performing significantly differently from the others
          • The X "Operator" explains 50.72% of the variation in the data (the R-Sq value)
          • R-Sq (Adj) is close to R-Sq, so there are no redundant terms in the model (if this value drops much lower than R-Sq, which commonly occurs in a multi-way ANOVA, then it is likely that an X is having no effect; here the X clearly has a marked effect)
          • 49.28% of the variation in the data is coming from something other than Operator and thus presents a possible opportunity (100% - R-Sq)

          From the bottom table in the results:

          • A sample of 30 data points was taken for each operator
          • Bob's sample mean is 24.848, Jane's is 25.446, and Walt's is 27.084
          • Bob's sample standard deviation is 0.869, Jane's is 0.988, and Walt's is 0.981
          • The text graph shows the 95% confidence intervals for the locations of the population means for each of the trio

          The p-value of 0.000 in the upper table indicates that at least one of the trio is performing differently from the other two. There is no overlap in the 95% confidence intervals in the bottom table between Walt's performance and the other two; therefore, it is clearly Walt who has a different mean.
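          The figures quoted above can be cross-checked from the summary statistics alone. The sketch below rebuilds the mean squares, the signal-to-noise ratio, R-Sq, and the confidence intervals from the three sample means, standard deviations, and the sample size of 30. Two assumptions are made that the text does not state: the intervals are based on the pooled standard deviation, and the 95% t value for 87 degrees of freedom is taken as approximately 1.988 from a t-table. Small differences from the quoted values come from rounding in the inputs.

```python
from math import sqrt

# Summary statistics quoted from the bottom table of the output.
n = 30
means = {"Bob": 24.848, "Jane": 25.446, "Walt": 27.084}
sds = {"Bob": 0.869, "Jane": 0.988, "Walt": 0.981}

k = len(means)
grand_mean = sum(means.values()) / k  # valid because every sample has n points
df_between = k - 1
df_within = k * (n - 1)

ss_between = n * sum((m - grand_mean) ** 2 for m in means.values())
ss_within = (n - 1) * sum(s ** 2 for s in sds.values())

ms_between = ss_between / df_between  # average variation due to Operator
ms_within = ss_within / df_within     # average variation due to Error
f_ratio = ms_between / ms_within      # signal-to-noise ratio
r_sq = ss_between / (ss_between + ss_within)

# Assumed: intervals use the pooled standard deviation and t of about 1.988.
T_CRITICAL = 1.988
half_width = T_CRITICAL * sqrt(ms_within / n)
intervals = {name: (m - half_width, m + half_width) for name, m in means.items()}

print(round(ms_between, 3), round(ms_within, 3), round(f_ratio, 2), round(100 * r_sq, 2))
for name, (low, high) in intervals.items():
    print(name, round(low, 3), round(high, 3))
```

          Under these assumptions Bob's and Jane's intervals overlap each other while Walt's sits clear of both, in line with the conclusion drawn from the text graph.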


Copyright© 2010 Bexcellence.org