Beyond descriptive statistics, HR can use other methods to interrogate data in more detail. For example, HR might explore a dataset of gender and job role and find that as job role seniority increases, the number of female employees decreases. See the figures below. To check that the finding is unlikely to be a coincidence, HR can run a chi-square test. The chi-square test indicates whether the pattern in the dataset is statistically significant; it provides evidence, not proof.
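As a rough illustration of that kind of exploration, the sketch below cross-tabulates gender against job role and computes the share of each gender within each role. The data frame `emp`, its column names, and every record in it are invented for this example; they are not the dataset used in this article.

```r
# Hypothetical employee records, made up for illustration only.
emp <- data.frame(
  gender   = c("F", "F", "F", "M",
               "F", "F", "M", "M",
               "F", "M", "M", "M"),
  job_role = c("Associate", "Associate", "Associate", "Associate",
               "Manager", "Manager", "Manager", "Manager",
               "Director", "Director", "Director", "Director")
)

# Order the roles from junior to senior so the table reads top to bottom.
emp$job_role <- factor(emp$job_role,
                       levels = c("Associate", "Manager", "Director"))

# Counts of employees by role (rows) and gender (columns).
counts <- table(emp$job_role, emp$gender)

# Row proportions: the share of each gender within each role.
shares <- prop.table(counts, margin = 1)

print(counts)
print(round(shares, 2))
```

In this toy data the female share falls from the most junior role to the most senior one, which is the kind of pattern the chi-square test is then used to check.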

**The chi-square test**

The chi-square test is a simple test that tells the analyst whether a pattern is statistically meaningful or could plausibly be due to chance. In our example, it will tell us if there is an association between gender and seniority in the organization. In other words, it will tell the reader whether there is evidence that women are under-represented in senior roles, which may point to a gender bias in the organization.

Our null hypothesis in this example is this: there is no link between gender and job role, or there is no pattern of gender variation across the job roles. The alternative hypothesis is that there is a pattern in our gender proportions that is meaningful; there is a statistically significant link between gender and job role. The significance level to reject the null hypothesis is set to 0.05.
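For readers who want to see what the test computes, the standard form of Pearson's statistic is given here. The notation is introduced only for this illustration: $O_{ij}$ is the observed count in row $i$ and column $j$ of the gender-by-role table, $E_{ij}$ is the count expected if the null hypothesis were true, $N$ is the total number of employees, and $r$ and $c$ are the numbers of rows and columns.

$$
E_{ij} = \frac{(\text{row } i \text{ total}) \times (\text{column } j \text{ total})}{N},
\qquad
\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}},
\qquad
df = (r - 1)(c - 1)
$$

The further the observed counts are from the expected ones, the larger $\chi^2$ becomes, and the smaller the p-value for a given number of degrees of freedom.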

In R, it’s done in one simple line of code. The test result looks like the following:

    > chisq.test(emp.table)
    Pearson's Chi-squared test
    data:  emp.table
    X-squared = 164.7, df = 7, p-value < 2.2e-16

The key statistics of interest here are the Pearson’s chi-squared statistic of 164.7 and the p-value, which is below 2.2e-16. Simply put, the bigger the imbalance between the groups, the bigger the chi-square statistic will be, all else being equal. The p-value is well below the significance level of 0.05. Therefore, we reject the null hypothesis; there is a statistically significant association between gender and job role.
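The decision rule described above can be sketched in a few lines of R. The table `demo.table` and all of its counts are hypothetical, chosen only to make the example runnable; they are not the `emp.table` data from this article, so the numbers it produces will differ from the output shown earlier.

```r
# Hypothetical counts, invented for illustration (not the article's data).
# Rows are gender; columns are job levels ordered from junior to senior.
demo.table <- matrix(
  c(120,  90,  60,  30,   # female
    100, 110, 120, 130),  # male
  nrow = 2, byrow = TRUE,
  dimnames = list(gender = c("Female", "Male"),
                  level  = c("L1", "L2", "L3", "L4"))
)

alpha  <- 0.05                     # significance level used in the article
result <- chisq.test(demo.table)   # Pearson's chi-squared test

result$statistic   # the chi-square statistic
result$parameter   # degrees of freedom: (2 - 1) * (4 - 1) = 3
result$p.value     # the p-value
result$expected    # counts expected if gender and level were independent

# Reject the null hypothesis when the p-value falls below alpha.
reject_null <- result$p.value < alpha
print(reject_null)
```

Comparing `result$expected` with the observed counts shows where the departure from independence comes from. The `df = 7` reported in the article's output would be consistent with, for example, a table of two gender categories by eight job roles, although that layout is an inference rather than something stated in the text.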

The complete data file and source code are available on GitHub.