# Predicting Individual Turnover

In this post we identify the factors that affect individual employee turnover. Because the response variable, turnover status, is binary in the dataset (1 = leave, 0 = stay), we use logistic regression to investigate the data.
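As a minimal sketch of this kind of model fit, the snippet below runs a logistic regression on a small synthetic data frame. The names `emp_demo` and `fit_demo`, the simulated values, and the coefficients used to generate them are assumptions made for illustration; they are not the real employee data.

```r
# Synthetic stand-in for the real data set (hypothetical values, illustration only)
set.seed(42)
n <- 500
emp_demo <- data.frame(
  Gender          = rbinom(n, 1, 0.5),              # 1 = man, 0 = woman
  AppraisalRating = sample(1:5, n, replace = TRUE)  # rating on a 1-5 scale (assumed)
)

# Simulate a binary outcome whose log-odds depend on both predictors
log_odds <- 0.2 - 0.6 * emp_demo$Gender - 0.25 * emp_demo$AppraisalRating
emp_demo$LeaverStatus <- rbinom(n, 1, plogis(log_odds))

# Logistic regression: binomial family with a logit link
fit_demo <- glm(LeaverStatus ~ ., family = binomial(link = "logit"), data = emp_demo)
summary(fit_demo)
```

The formula `LeaverStatus ~ .` regresses the outcome on every other column in the data frame, which is convenient when the data frame contains only the response and the intended predictors.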

The analysis produces three kinds of output. The first is the Nagelkerke R-square statistic, a pseudo R-square that indicates how much of the variation in the dependent variable is accounted for by the predictor variables. Here, the six predictor variables account for about 5.6% of the turnover across the organization (the value 5.584941e-02 in the output below). That looks small, but there are many possible causes of people leaving an organization, and most of them are not among these predictors.

```
> PseudoR2(logit.ind.tnvr)
3.971238e-02  1.419408e-02  2.973055e-02  5.584941e-02  7.417359e-02
3.120676e-02  NA            NA            1.234203e+03  1.234497e+03
```
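To show where the 5.6% comes from, here is a rough sketch that recomputes three common pseudo R-square values by hand. It assumes the standard McFadden, Cox-Snell, and Nagelkerke formulas and uses the rounded deviances and degrees of freedom printed in the model summary, so the results only approximate the figures above.

```r
# Rounded values as printed in the glm summary
null_dev  <- 1254.0    # null deviance
resid_dev <- 1204.2    # residual deviance
n         <- 1649 + 1  # null degrees of freedom + 1 = observations used

# For ungrouped binary data the deviance equals -2 * log-likelihood,
# so the pseudo R-squares can be written in terms of deviances
mcfadden   <- 1 - resid_dev / null_dev
cox_snell  <- 1 - exp((resid_dev - null_dev) / n)
nagelkerke <- cox_snell / (1 - exp(-null_dev / n))  # Cox-Snell rescaled to a 0-1 range

c(McFadden = mcfadden, CoxSnell = cox_snell, Nagelkerke = nagelkerke)
```

The three results land close to the first, third, and fourth values in the output above, with the Nagelkerke figure at roughly 0.056.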

The second set of figures is the coefficients of the predictor variables. A predictor is statistically significant when its p-value, shown in the Pr(>|z|) column, is less than 0.05. In our case, “gender” and “appraisal rating” are the significant factors.

```
Call:
glm(formula = LeaverStatus ~ ., family = binomial(link = "logit"),
    data = emp)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-1.1041  -0.5538  -0.4593  -0.3760   2.3990

Coefficients:
                  Estimate  Std. Error  z value  Pr(>|z|)
(Intercept)       0.168371    0.656593    0.256  0.797618
BossGender       -0.183855    0.163275   -1.126  0.260147
Gender           -0.608303    0.161704   -3.762  0.000169 ***
Age              -0.001347    0.009392   -0.143  0.885962
LengthOfService  -0.017194    0.010900   -1.577  0.114683
AppraisalRating  -0.248002    0.097284   -2.549  0.010795 *
CountryBelgium   -0.249247    0.603715   -0.413  0.679712
CountrySweden     0.001673    0.593760    0.003  0.997752
CountryItaly     -0.058845    0.516308   -0.114  0.909259
CountryFrance    -0.455198    0.624276   -0.729  0.465903
CountryPoland    -0.734433    0.648954   -1.132  0.257753
CountryMexico    -0.415403    0.596774   -0.696  0.486379
CountrySpain     -0.571556    0.535051   -1.068  0.285418
CountryUK        -0.505547    0.470520   -1.074  0.282624
CountryUS        -0.698382    0.433822   -1.610  0.107433
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 1254.0 on 1649 degrees of freedom
Residual deviance: 1204.2 on 1635 degrees of freedom
  (3 observations deleted due to missingness)
AIC: 1234.2

Number of Fisher Scoring iterations: 5
```
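The z value and p-value columns can be reproduced from the estimate and its standard error. The sketch below does this for the Gender row, assuming the usual Wald test with a two-sided p-value from the standard normal distribution; the inputs are the rounded figures copied from the table.

```r
# Estimate and standard error for Gender, copied from the coefficient table
estimate  <- -0.608303
std_error <-  0.161704

# Wald z statistic and two-sided p-value from the standard normal distribution
z_value <- estimate / std_error
p_value <- 2 * pnorm(-abs(z_value))

round(z_value, 3)   # compare with the "z value" column
signif(p_value, 3)  # compare with the "Pr(>|z|)" column
```

The same two lines applied to the AppraisalRating row give its z value and p-value in the same way.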

The third output is the odds ratio, obtained by exponentiating the coefficients (the exp() call below). The odds of an event are the probability that it occurs divided by the probability that it does not. In the turnover example, the odds ratio of a predictor is the factor by which the odds of leaving (leaver status = 1) are multiplied when that predictor increases by one unit, with the other predictors held constant.
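A short derivation shows why exponentiating a coefficient gives an odds ratio. The notation is the standard one for logistic regression and is introduced here for illustration: p is the probability of leaving, x_j is the j-th predictor, and β_j is its coefficient.

```latex
\log\frac{p}{1-p} = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k
\quad\Longrightarrow\quad
\text{odds} = \frac{p}{1-p} = e^{\beta_0 + \beta_1 x_1 + \dots + \beta_k x_k}

\frac{\text{odds}(x_j + 1)}{\text{odds}(x_j)}
  = \frac{e^{\beta_0 + \dots + \beta_j (x_j + 1) + \dots + \beta_k x_k}}
         {e^{\beta_0 + \dots + \beta_j x_j + \dots + \beta_k x_k}}
  = e^{\beta_j}
```

Every term except β_j cancels, so the ratio does not depend on the values of the other predictors.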

For example, the odds ratio for “Gender” is 0.544. This means that the odds of leaving (probability of leaving / probability of staying) for men (gender = 1) are 0.544 times the odds for women, other predictors being equal. Because the odds ratio is less than 1, men are less likely to leave than women. Put the other way, women are more likely to leave. We could present this in a number of ways:

• The odds of men leaving the organization are 0.544 times the odds of women leaving.
• The odds of men staying with the organization are 1.838 (1/0.544) times the odds of women staying.
• The odds of women leaving the organization are 1.838 times the odds of men leaving.

```
> exp(logit.ind.tnvr$coefficients)
(Intercept)         BossGender     Gender        Age            LengthOfService
0.9223082           0.8320562      0.5442737     0.9986539      0.9829526
AppraisalRating     CountrySweden  CountryItaly  CountryFrance  CountryPoland
0.7803582           1.2852077      1.2097357     0.8138732      0.6155830
CountryMexico       CountrySpain   CountryUK     CountryUS      CountryAustralia
0.8469146           0.7244742      0.7739099     0.6381798      1.2830591
```
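An odds ratio compares odds, not probabilities, and a small sketch makes the difference concrete. The Gender coefficient comes from the model summary. The 15% baseline probability of leaving for women is a hypothetical number chosen only for illustration, not a figure from the data.

```r
beta_gender <- -0.608303         # Gender coefficient from the model summary
or_gender   <- exp(beta_gender)  # odds ratio of leaving, men relative to women
1 / or_gender                    # the same comparison, women relative to men

# Hypothetical baseline, NOT taken from the data
p_women    <- 0.15
odds_women <- p_women / (1 - p_women)  # probability -> odds
odds_men   <- odds_women * or_gender   # apply the odds ratio
p_men      <- odds_men / (1 + odds_men)  # odds -> probability

c(odds_ratio = or_gender, p_women = p_women, p_men = p_men,
  probability_ratio = p_women / p_men)
```

Under this assumed baseline the ratio of the two probabilities is smaller than the ratio of the two odds, which is why the results are best worded in terms of odds.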

The other significant predictor is “appraisal rating”, whose odds ratio is 0.78. Each one-point increase in the appraisal rating multiplies the odds of leaving (probability of leaving / probability of staying) by 0.78. In other words, employees are more likely to stay if they received higher ratings. We can rephrase it in the following ways:

• The odds of staying for individuals who got a “3” performance rating are 1.282 (1/0.78) times the odds for those who got a “2” performance rating.
• Individuals who got a “4” performance rating have 28.2% higher odds of staying the following year than those who got a “3” performance rating.
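The same arithmetic extends to rating differences of more than one point, because the model assumes each additional point multiplies the odds by the same factor. This sketch uses the AppraisalRating coefficient from the model summary; the variable names are illustrative.

```r
beta_rating <- -0.248002  # AppraisalRating coefficient from the model summary

or_leave_1pt <- exp(beta_rating)      # odds of leaving, rating one point higher
or_stay_1pt  <- exp(-beta_rating)     # odds of staying, rating one point higher
or_leave_2pt <- exp(2 * beta_rating)  # odds of leaving, rating two points higher

round(c(leave_1pt = or_leave_1pt, stay_1pt = or_stay_1pt,
        leave_2pt = or_leave_2pt), 3)
```

With the unrounded coefficient the odds ratio for staying is about 1.281; the 1.282 quoted above comes from dividing 1 by the rounded value 0.78.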

The conclusion we can draw is that country differences are not significant in accounting for turnover. However, the odds of leaving for women are almost twice (about 1.8 times) those for men. A higher appraisal rating also increases the chances of an employee staying. Taken together, women who received a lower rating have a higher risk of leaving the company than other employees.

The complete data file and source code are available on GitHub.
