Chi Square
Overview
Chi square is a statistical test that examines variation. It compares variation between two different populations and makes a determination of whether the variation is the same. Chi square helps determine the statistical significance of a relationship between an Attribute X and an Attribute Y in Y = f(X1,X2,..., Xn).

The approach used is to assume the variables (X Data and Y Data) are independent and set up the hypotheses as follows:
*Ho: Data are Independent (Not Related)
*Ha: Data are Dependent (Related)
The output of the test is a "p-value", which indicates how likely it is to see a relationship like the one in the sample if the data were in fact independent. A high p-value means there is no evidence of a relationship between the X data and Y data. A low p-value (< 0.10) indicates there is a relationship, because a pattern that strong is unlikely to arise purely by random chance. If the p-value is less than 0.10, then the null hypothesis Ho should be rejected and Ha should be accepted.
As with any statistical test, Chi Square comes with its set of "could be" and "might be" statements.
A Hypothesis Example
The Personnel Department wants to see if there is a link between age and whether an applicant gets hired. Both Age (old and young) and Got Hired (did or didn't) are attribute type data. A Chi Square test would be applicable to answer the question:
*Are age and hiring decisions dependent or independent?
*Ho: Age and Hiring Decisions are independent
*Ha: Age and Hiring Decisions are dependent
455 data points were taken and distributed amongst the four possible outcomes. The data can then be analyzed in a statistical package, such as Minitab. The software calculates the expected values in each box. Chi-Square compares the observed with the expected frequencies to produce a signal-to-noise type ratio using:

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

where O is the observed frequency and E is the expected frequency in a box, and the sum runs over all boxes in the table.
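How the expected value in a box is obtained is not spelled out above. Under the independence assumption in Ho it is usually taken as the row total multiplied by the column total, divided by the total number of data points. As a worked sketch using counts that appear in the Minitab output later in this section (a row total of 180, a Hired column total of 75, an observed count of 30 and 455 data points), the first box gives:

$$E = \frac{180 \times 75}{455} \approx 29.67, \qquad \frac{(O - E)^2}{E} = \frac{(30 - 29.67)^2}{29.67} \approx 0.004$$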
Sample of Reality for the Relationship between Age and Hiring

The software then looks up the χ² value (the sum of all the discrepancies) in a statistical table. As a belt using the tool practically, all that is important is the output of the tool, which should be similar to that shown below. The math behind χ² is not covered here.
Results of the Chi Square test for the relationship between Age and Hiring Practice (output from Minitab v14).
Chi Square Test: Hired, Not Hired

Expected counts are printed below observed counts. Chi Square contributions are printed below expected counts.

| | Hired | Not Hired | Total |
|---|---|---|---|
| 1 | 30 | 150 | 180 |
| | 29.67 | 150.33 | |
| | 0.004 | 0.001 | |
| 2 | 45 | 230 | 275 |
| | 45.33 | 229.67 | |
| | 0.002 | 0.000 | |
| Total | 75 | 380 | 455 |

Chi-Sq = 0.007, DF = 1, P-Value = 0.932
The first place to look is the p-value, which in this case is 0.932. The p-value is not low (not below 0.10), so the relationship between Age and Hiring Practice is not significant for the sample of data taken.
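The figures in the Minitab output can be reproduced by hand. The following is a minimal Python sketch, not Minitab's own implementation. The function names `chi_square_independence` and `p_value_one_dof` are hypothetical helpers introduced for illustration. It assumes the expected count in each box is the row total times the column total divided by the grand total, and it uses a p-value shortcut that is valid only for one degree of freedom (a 2 x 2 table such as this one).

```python
import math


def chi_square_independence(observed):
    """Return (statistic, degrees of freedom, expected counts) for a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    # Assumption: expected count under independence is
    # row total * column total / grand total.
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

    # Sum the (O - E)^2 / E contribution of every box.
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof, expected


def p_value_one_dof(statistic):
    """Upper-tail chi-square probability, valid only for 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2))


# Rows are the two age groups (labelled 1 and 2 in the output above),
# columns are Hired and Not Hired.
observed = [[30, 150], [45, 230]]
statistic, dof, expected = chi_square_independence(observed)
p_value = p_value_one_dof(statistic)

print([[round(e, 2) for e in row] for row in expected])
print(f"Chi-Sq = {statistic:.3f}, DF = {dof}, P-Value = {p_value:.3f}")
```

Run as written, this should print expected counts of 29.67, 150.33, 45.33 and 229.67, followed by the same Chi-Sq = 0.007, DF = 1, P-Value = 0.932 line that Minitab reports.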
Other Examples
Chi Square can be applied in virtually any transactional process, where attribute data usually abounds. For example:
*HR: Number of sick days by employee or department
*Accounting: Number of incorrect expense reports by employee or department
*Sales: Number of lost sales by account or region or country
*Logistics: Number of deliveries late by distribution center or country
*Call Center: Number of missed Customer calls by associate or shift
*Installation: Number of repeat service calls by field technician
*Purchasing: Number of days delivery-time for orders by supplier
*Inventory: Number of parts by distribution center
Roadmap
The roadmap to setting up and applying a Chi Square test is as follows:
Step 1. Understand the question at hand. There should be a clear relationship in question: does X affect Y? The relationship needs to involve data for both X and Y that is attribute or discrete valued. There should be a business reason for asking the question in the first place; that is, an answer to "Why do we care?".

Step 2. Set up the hypotheses in the form:
*Ho: Data are Independent (Not Related)
*Ha: Data are Dependent (Related)

Step 3. Determine a data collection method and a sample size. The sample size is based on the expected values in each box in the data collection table. To have reasonable confidence in the result of the test, there needs to be an expected value in each box greater than 5. Thus, to calculate sample size, identify the lowest potential proportion likely in any of the boxes. Divide 5 by this proportion to give an approximate bare minimum number of data points to collect. For example, if the expected proportion in one box is 2.5% (0.025), then dividing 5 by 0.025 gives 200 data points. This is a little hit and miss, but gives a ballpark approximation. Typically the approach is to double this number to be on the safe side. The Chi Square test is data hungry, with sample sizes often above 500. A simple Tally Sheet is enough to capture most test data. Place check marks in the appropriate box as data points are collected.

Step 4. Collect the sample of reality. Ideally the data is available historically or can be gathered quickly in large quantities; otherwise the project might need to idle while data is collected.

Step 5. Enter the data in the form of a table into a statistical package and analyze it.
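The sample size rule of thumb in Step 3 can be sketched in a few lines of Python. The function name `minimum_sample_size` and its parameters are hypothetical, chosen for illustration; the defaults of 5 and 2 simply encode the "divide 5 by the proportion, then double it" guidance from the roadmap and are not a formal power calculation.

```python
import math


def minimum_sample_size(smallest_proportion, min_expected=5, safety_factor=2):
    """Return (bare minimum, recommended) data points for the rarest box.

    smallest_proportion is the lowest proportion likely in any box,
    for example 0.025 for 2.5%.
    """
    if not 0 < smallest_proportion <= 1:
        raise ValueError("smallest_proportion must be in (0, 1]")
    # Round before taking the ceiling so floating point noise
    # (e.g. 200.00000000000003) does not add a spurious data point.
    bare_minimum = math.ceil(round(min_expected / smallest_proportion, 6))
    return bare_minimum, bare_minimum * safety_factor


bare_minimum, recommended = minimum_sample_size(0.025)
print(bare_minimum, recommended)  # 200 400
```

For the 2.5% example in Step 3 this gives the same 200 data points as a bare minimum, and 400 once doubled to be on the safe side.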
Interpreting the Output
The first place to look during analysis is the p-value. If the p-value is higher than 0.10, the conclusion is that, based on the sample taken, there is no evidence that X and Y are dependent.
However, if the p-value is low (less than 0.10), there is reason to believe that X and Y are dependent in some way. If everything were based on random chance, the observed distribution of data points within the table would not be expected.
Low P Value
Consider the data below, which represents loan approval or rejection decisions on different days of the week. The bank clearly would like loan decisions to be independent of the day of the week on which they are processed.
Loan Decision Data by Day of Week
| | Rejected | Approved |
|---|---|---|
| Monday | 9 | 27 |
| Tuesday | 8 | 21 |
| Wednesday | 11 | 25 |
| Thursday | 7 | 24 |
| Friday | 25 | 23 |
Chi Square Test analysis results for loan data (output from Minitab v14).

Looking immediately to the p-value of 0.028 (less than 0.10), it is clear that there is a dependent relationship. Thus, the conclusion is that the null hypothesis should be rejected and the alternate "Data are dependent" should be accepted instead. In English, we conclude there is something fishy going on, because the chance of getting a loan varies by day of the week.
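The loan analysis can also be checked without a statistical package. The sketch below is an illustration in plain Python, not the Minitab procedure. `chi_square_sf` is a hypothetical helper that evaluates the upper-tail chi-square probability for whole-number degrees of freedom, and the expected counts again assume row total times column total divided by the grand total.

```python
import math


def chi_square_sf(statistic, dof):
    """Upper-tail probability of a chi-square variable with integer dof >= 1."""
    half = statistic / 2
    if dof % 2 == 0:
        # Even dof: closed-form finite sum.
        return math.exp(-half) * sum(
            half ** i / math.factorial(i) for i in range(dof // 2)
        )
    # Odd dof: start from the 1-dof tail and add the remaining terms.
    tail = math.erfc(math.sqrt(half))
    return tail + math.exp(-half) * sum(
        half ** (i - 0.5) / math.gamma(i + 0.5)
        for i in range(1, (dof - 1) // 2 + 1)
    )


# Loan decisions by day as (rejected, approved), taken from the table above.
loans = {
    "Monday": (9, 27),
    "Tuesday": (8, 21),
    "Wednesday": (11, 25),
    "Thursday": (7, 24),
    "Friday": (25, 23),
}

col_totals = [sum(col) for col in zip(*loans.values())]
grand_total = sum(col_totals)

# Chi Square contribution of each day: (O - E)^2 / E summed over its two boxes.
contributions = {}
for day, counts in loans.items():
    row_total = sum(counts)
    contributions[day] = sum(
        (o - row_total * c / grand_total) ** 2 / (row_total * c / grand_total)
        for o, c in zip(counts, col_totals)
    )

statistic = sum(contributions.values())
dof = (len(loans) - 1) * (len(col_totals) - 1)
p_value = chi_square_sf(statistic, dof)
worst_day = max(contributions, key=contributions.get)

print(f"Chi-Sq = {statistic:.3f}, DF = {dof}, P-Value = {p_value:.3f}")
print(f"Largest contribution: {worst_day} ({contributions[worst_day]:.2f})")
```

On this data the sketch should reproduce the p-value of 0.028 quoted above. It also suggests that Friday, with a contribution of about 7.59 out of a total of about 10.89, is where the observed counts depart most from what independence would predict.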