In this post we will look for the factors that affect individual employee turnover. Because our response variable is turnover status, which is binary in the dataset (1 = leave, 0 = stay), we will use logistic regression to investigate the data.

The analysis produces three types of information. The first is the Nagelkerke R-square statistic. It is a pseudo R-square that indicates roughly how much of the variation in the dependent variable is accounted for by the variation in our predictor variables. Here it says that our six predictor variables account for about 5.6% of the variation in turnover across the organization. While that seems a small percentage, we need to keep in mind that people leave an organization for many reasons, and only a few of them are captured by these variables.

    > PseudoR2(logit.ind.tnvr)
            McFadden     Adj.McFadden        Cox.Snell       Nagelkerke McKelvey.Zavoina
        3.971238e-02     1.419408e-02     2.973055e-02     5.584941e-02     7.417359e-02
              Effron            Count        Adj.Count              AIC    Corrected.AIC
        3.120676e-02               NA               NA     1.234203e+03     1.234497e+03
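As a rough cross-check, the McFadden, Cox-Snell and Nagelkerke figures can be reproduced by hand from the null and residual deviance printed in the glm summary. The sketch below assumes those deviances equal -2 times the log-likelihood (which holds for a 0/1 response) and that 1,650 observations were used (null degrees of freedom plus one). The variable names are illustrative and not part of the original script.

```r
# Deviances as printed in the glm summary (rounded to one decimal there).
null_dev  <- 1254.0    # null deviance, on 1649 degrees of freedom
resid_dev <- 1204.2    # residual deviance, on 1635 degrees of freedom
n         <- 1649 + 1  # observations used = null degrees of freedom + 1

mcfadden   <- 1 - resid_dev / null_dev
cox_snell  <- 1 - exp((resid_dev - null_dev) / n)
# Nagelkerke rescales Cox-Snell by its maximum attainable value.
nagelkerke <- cox_snell / (1 - exp(-null_dev / n))

print(round(c(McFadden = mcfadden,
              Cox.Snell = cox_snell,
              Nagelkerke = nagelkerke), 3))
```

Because the inputs are the rounded deviances, the results agree with the PseudoR2() output only to about three decimal places.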

The second set of figures is the coefficients of the predictor variables. We can identify the significant variables from their p-values (the Pr(>|z|) column). In our case, “gender” and “appraisal rating” are the significant factors because their p-values are less than 0.05.

    Call:
    glm(formula = LeaverStatus ~ ., family = binomial(link = "logit"),
        data = emp)

    Deviance Residuals:
        Min       1Q   Median       3Q      Max
    -1.1041  -0.5538  -0.4593  -0.3760   2.3990

    Coefficients:
                     Estimate Std. Error z value Pr(>|z|)
    (Intercept)      0.168371   0.656593   0.256 0.797618
    BossGender      -0.183855   0.163275  -1.126 0.260147
    Gender          -0.608303   0.161704  -3.762 0.000169 ***
    Age             -0.001347   0.009392  -0.143 0.885962
    LengthOfService -0.017194   0.010900  -1.577 0.114683
    AppraisalRating -0.248002   0.097284  -2.549 0.010795 *
    CountryBelgium  -0.249247   0.603715  -0.413 0.679712
    CountrySweden    0.001673   0.593760   0.003 0.997752
    CountryItaly    -0.058845   0.516308  -0.114 0.909259
    CountryFrance   -0.455198   0.624276  -0.729 0.465903
    CountryPoland   -0.734433   0.648954  -1.132 0.257753
    CountryMexico   -0.415403   0.596774  -0.696 0.486379
    CountrySpain    -0.571556   0.535051  -1.068 0.285418
    CountryUK       -0.505547   0.470520  -1.074 0.282624
    CountryUS       -0.698382   0.433822  -1.610 0.107433
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

    (Dispersion parameter for binomial family taken to be 1)

        Null deviance: 1254.0  on 1649  degrees of freedom
    Residual deviance: 1204.2  on 1635  degrees of freedom
      (3 observations deleted due to missingness)
    AIC: 1234.2

    Number of Fisher Scoring iterations: 5
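The z value and Pr(>|z|) columns follow from the first two: each z value is the estimate divided by its standard error, and the p-value is the two-sided tail probability under a standard normal distribution. A minimal sketch that recomputes them for the two significant predictors, with the numbers copied from the summary (the vector names are illustrative):

```r
# Estimates and standard errors copied from the coefficient table.
est <- c(Gender = -0.608303, AppraisalRating = -0.248002)
se  <- c(Gender =  0.161704, AppraisalRating =  0.097284)

z <- est / se            # Wald z statistic
p <- 2 * pnorm(-abs(z))  # two-sided p-value

print(round(cbind(z = z, p = p), 4))
```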

The third set of figures is the odds ratios, obtained by exponentiating the coefficients (the exp() call below). The odds of an event are the probability of it occurring divided by the probability of it not occurring. An odds ratio compares two such odds: in the turnover example, it is the factor by which the odds of leaver status = 1 are multiplied when a predictor variable increases by one unit and the other predictors are held constant.
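To see why exponentiating a coefficient gives an odds ratio, the sketch below takes the AppraisalRating coefficient from the model summary and compares the odds of leaving at two ratings one point apart. The baseline log-odds is an arbitrary number chosen for illustration, standing in for the contribution of all the other predictors; the ratio does not depend on it.

```r
odds <- function(p) p / (1 - p)

b_rating <- -0.248002  # AppraisalRating coefficient from the model summary
baseline <- -1.5       # arbitrary log-odds from the other terms (illustrative)

# Predicted probability of leaving at a given appraisal rating.
p_leave <- function(rating) plogis(baseline + b_rating * rating)

ratio <- odds(p_leave(3)) / odds(p_leave(2))
print(c(ratio = ratio, exp_coef = exp(b_rating)))  # the two values coincide
```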

For example, the odds ratio for “Gender” is 0.544. It means that the odds of leaving (probability of leaving / probability of staying) for employees with gender = 1 (men) are 0.544 times the odds for women, other things being equal. Because the odds ratio is less than 1, men are less likely to leave than women. Put the other way, women are more likely to leave. We could present this in a number of ways:

- The odds of men leaving the organization are 0.544 times the odds of women leaving.
- The odds of men staying with the organization are 1.838 (1/0.544) times the odds of women staying.
- The odds of women leaving the organization are 1.838 times the odds of men leaving.
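An odds ratio is not a ratio of probabilities, so it can help to translate it for a concrete case. The sketch below assumes, purely for illustration, that women with a given profile have a 20% probability of leaving (this baseline is hypothetical and not taken from the data) and applies the Gender odds ratio to get the matching probability for men.

```r
or_gender     <- 0.5442737  # odds ratio for Gender (men relative to women)
p_women_leave <- 0.20       # hypothetical baseline probability, not from the data

odds_women  <- p_women_leave / (1 - p_women_leave)
odds_men    <- odds_women * or_gender
p_men_leave <- odds_men / (1 + odds_men)  # convert odds back to a probability

print(round(c(p_men_leave = p_men_leave,
              odds_ratio_women_vs_men = odds_women / odds_men,
              prob_ratio_women_vs_men = p_women_leave / p_men_leave), 3))
```

Under this assumed baseline the ratio of probabilities comes out smaller than the ratio of odds; the two only get close when the event is rare.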

    > exp(logit.ind.tnvr$coefficients)
         (Intercept)       BossGender           Gender              Age  LengthOfService
           0.9223082        0.8320562        0.5442737        0.9986539        0.9829526
     AppraisalRating    CountrySweden     CountryItaly    CountryFrance    CountryPoland
           0.7803582        1.2852077        1.2097357        0.8138732        0.6155830
       CountryMexico     CountrySpain        CountryUK        CountryUS CountryAustralia
           0.8469146        0.7244742        0.7739099        0.6381798        1.2830591
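A point estimate of an odds ratio is easier to judge with an interval around it. One common approximation is a Wald interval: take the coefficient plus or minus about 1.96 standard errors, then exponentiate. The sketch below does this for the two significant predictors using the estimates and standard errors from the model summary. It is an approximation worked by hand, not output from the original analysis.

```r
# Estimates and standard errors copied from the coefficient table.
est <- c(Gender = -0.608303, AppraisalRating = -0.248002)
se  <- c(Gender =  0.161704, AppraisalRating =  0.097284)

z_crit <- qnorm(0.975)  # about 1.96 for a 95% interval

or_table <- cbind(odds_ratio = exp(est),
                  lower      = exp(est - z_crit * se),
                  upper      = exp(est + z_crit * se))
print(round(or_table, 3))
```

Both intervals lie entirely below 1, which matches the p-values below 0.05 for these two predictors.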

The other significant predictor is “appraisal rating”, whose odds ratio is 0.78. The odds of leaving (probability of leaving / probability of staying) are multiplied by 0.78 each time the appraisal rating goes up one point. In other words, employees are more likely to stay if they receive higher ratings. We can rephrase it in the following ways:

- Individuals who got a “3” performance rating have 1.282 (1/0.78) times the odds of staying of those who got a “2” performance rating.
- Individuals who got a “4” performance rating have 28.2% higher odds of staying the following year than those who got a “3” performance rating.

The conclusion we can draw is that country differences do not come out as significant in accounting for turnover. However, women have nearly twice the odds of leaving that men have (1.838 times). Also, a higher appraisal rating increases the chances of an employee staying. Thus, women who received a lower rating have a higher risk of leaving the company than the other employees.
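Because the model formula contains no interaction terms, the effects combine by multiplying odds ratios. As an illustration of the last point, the sketch below compares a woman with a rating one point lower against a man who is otherwise identical. The comparison itself is a hypothetical example; only the two odds ratios come from the exp() output.

```r
or_gender <- 0.5442737  # odds of leaving, men relative to women
or_rating <- 0.7803582  # odds of leaving per one-point higher appraisal rating

# Woman vs. man: invert the Gender odds ratio.
# One point lower rating: invert the AppraisalRating odds ratio.
combined <- (1 / or_gender) * (1 / or_rating)
print(round(combined, 2))
```

Under the fitted model, that puts her odds of leaving at roughly 2.35 times his.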

The complete data file and source code are available on GitHub.