Principal Component Analysis (PCA) and reliability testing with team-level data

Principal Component Analysis (PCA) and reliability tests should ideally be run on individual-level responses, but that information is usually not available when the survey is conducted by an external survey provider. With team-level data we can still apply PCA and a reliability test as a first step to check how the survey questions are constructed.

We have a survey result dataset that records, for each of 212 teams, the percentage of team members who answered positively (either “Agree” or “Strongly Agree”) on each of 9 engagement measures.
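To make the "percentage who answered positively" measure concrete, here is a minimal sketch of how such a team-level figure could be derived from individual responses. The `responses` data frame, the team labels and the 1 to 5 coding (4 = "Agree", 5 = "Strongly Agree") are all made up for illustration; the survey provider's actual coding is not known to us.

```r
# Hypothetical individual responses on an assumed 1-5 scale
# (4 = "Agree", 5 = "Strongly Agree"). Not the real survey data.
responses <- data.frame(
  team = c("A", "A", "A", "A", "B", "B", "B", "B", "B"),
  Eng1 = c(5, 4, 3, 2, 4, 4, 5, 1, 3)
)

# A response counts as positive if it is "Agree" or "Strongly Agree".
responses$positive <- responses$Eng1 >= 4

# Percentage of positive responses within each team.
pct_positive <- tapply(responses$positive, responses$team,
                       function(x) 100 * mean(x))
print(pct_positive)
```

In this toy example team A has 2 positive answers out of 4 and team B has 3 out of 5, so each team is reduced to a single number per question. That is the kind of value the Eng1 to Eng9 columns hold.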

The 9 questions are as follows:
  • I feel a sense of pride with my organization (Eng1 variable in the dataset).
  • I would recommend this employer to a friend (Eng2 variable in the dataset).
  • I am really engaged (Eng3 variable in the dataset).
  • I can manage my workload (Eng4 variable in the dataset).
  • My work does not interfere with my home life (Eng5 variable in the dataset).
  • I have good work-life balance (Eng6 variable in the dataset).
  • My organization is socially responsible (Eng7 variable in the dataset).
  • My organization makes sure no one gets hurt at the workplace (Eng8 variable in the dataset).
  • My organization is ethical (Eng9 variable in the dataset).

The PCA and reliability analysis results are shown below. The PCA tells us that the survey questions cluster cleanly into three components, which together explain 83% of the variance.

Looking back at the question list, the first three appear to be about work engagement, questions 4 to 6 about work-life balance, and the last three about perception of the organization’s ethics. The rotated loadings follow the same grouping: Eng1 to Eng3 load on RC1 (0.88 to 0.91), Eng4 to Eng6 on RC2 (0.84 to 0.91), and Eng7 to Eng9 on RC3 (0.79 to 0.95). The result suggests that the survey provider defines “employee engagement” from those three perspectives.
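For readers who want to see what sits behind the loadings table, here is a rough base-R sketch of a three-component PCA with varimax rotation. It runs on simulated data (`sim.data` is invented here, built from three latent factors), not on the real survey file. It should approximate what `principal(..., nfactors = 3, rotate = "varimax")` reports, although signs and column order may differ.

```r
# Simulated stand-in for the team-level data: 212 teams x 9 items.
# Each block of three items follows one latent factor plus noise.
set.seed(42)
n_teams <- 212
latent <- matrix(rnorm(n_teams * 3), ncol = 3)
sim <- sapply(1:9, function(j) {
  latent[, (j - 1) %/% 3 + 1] + rnorm(n_teams, sd = 0.5)
})
sim.data <- as.data.frame(sim)
names(sim.data) <- paste0("Eng", 1:9)

# PCA on the correlation matrix, keeping the first three components.
eig <- eigen(cor(sim.data))
raw_loadings <- eig$vectors[, 1:3] %*% diag(sqrt(eig$values[1:3]))
rownames(raw_loadings) <- names(sim.data)

# Varimax rotation (stats::varimax), then the usual summaries.
rotated <- unclass(varimax(raw_loadings)$loadings)
ss_loadings <- colSums(rotated^2)          # "SS loadings" row
prop_var <- ss_loadings / ncol(sim.data)   # "Proportion Var" row
communality <- rowSums(rotated^2)          # "h2" column

# Component on which each item has its largest absolute loading.
dominant <- apply(abs(rotated), 1, which.max)

print(round(rotated, 2))
print(round(rbind(ss_loadings, prop_var), 2))
```

The same arithmetic applies to the real output: "Proportion Var" is "SS loadings" divided by the number of items, so 2.67 / 9 ≈ 0.30, 2.42 / 9 ≈ 0.27 and 2.40 / 9 ≈ 0.27, which add up to the 0.83 cumulative variance.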

The reliability analysis gives a Cronbach’s alpha of 0.86 (95% confidence boundaries 0.83 to 0.89), which is above the commonly used 0.7 threshold. The nine items therefore show a good level of internal consistency.
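As a reminder of what the alpha figure measures, here is a small base-R sketch of the textbook Cronbach's alpha formula. The `cronbach_alpha` helper and the `toy` data are invented for illustration; they are not part of the psych package and may differ in detail from what `psych::alpha` does internally. The last lines check the standardized alpha in the output against the reported average inter-item correlation.

```r
# Cronbach's alpha from the item covariance matrix:
# alpha = k / (k - 1) * (1 - sum of item variances / variance of total score)
cronbach_alpha <- function(items) {
  items <- as.matrix(items)
  k <- ncol(items)
  item_cov <- cov(items)
  total_var <- sum(item_cov)   # equals the variance of the row sums
  k / (k - 1) * (1 - sum(diag(item_cov)) / total_var)
}

# Made-up example: 5 teams, 3 items (percent favourable).
toy <- data.frame(
  Eng1 = c(90, 80, 70, 85, 60),
  Eng2 = c(88, 82, 65, 80, 62),
  Eng3 = c(85, 78, 72, 90, 58)
)
alpha_toy <- cronbach_alpha(toy)
print(alpha_toy)

# Standardized alpha from the average inter-item correlation,
# using k = 9 items and average_r = 0.43 from the output.
k <- 9
average_r <- 0.43
std_alpha <- k * average_r / (1 + (k - 1) * average_r)
print(round(std_alpha, 2))   # 0.87, matching std.alpha in the output
```

The same helper could be applied to a subset of columns, for example only the three items that load on one component, to see how consistent each group of questions is on its own.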

Principal Components Analysis
Call: principal(r = survey.data, nfactors = 3, rotate = "varimax")
Standardized loadings (pattern matrix) based upon correlation matrix
       RC1   RC2   RC3    h2      u2  com
Eng1  0.88  0.13  0.23  0.85  0.1495  1.2
Eng2  0.91  0.15  0.15  0.87  0.1335  1.1
Eng3  0.88  0.21  0.26  0.88  0.1161  1.3
Eng4  0.26  0.84  0.05  0.78  0.2240  1.2
Eng5 -0.03  0.86  0.21  0.78  0.2172  1.1
Eng6  0.28  0.91  0.19  0.94  0.0562  1.3
Eng7  0.28  0.08  0.81  0.73  0.2688  1.3
Eng8  0.10  0.19  0.79  0.67  0.3347  1.2
Eng9  0.24  0.16  0.95  0.99  0.0063  1.2

                       RC1  RC2  RC3
SS loadings           2.67 2.42 2.40
Proportion Var        0.30 0.27 0.27
Cumulative Var        0.30 0.57 0.83
Proportion Explained  0.36 0.32 0.32
Cumulative Proportion 0.36 0.68 1.00

Mean item complexity = 1.2
Test of the hypothesis that 3 components are sufficient.

The root mean square of the residuals (RMSR) is 0.07 
 with the empirical chi square 64.71 with prob < 3.1e-09

Fit based upon off diagonal values = 0.98

Reliability analysis
Call: psych::alpha(x = survey.data)

raw_alpha std.alpha G6(smc) average_r S/N  ase   mean   sd
 0.86      0.87      0.96    0.43     6.8  0.015   81    9.6

lower alpha upper 95% confidence boundaries
0.83 0.86 0.89

Reliability if an item is dropped:
     raw_alpha std.alpha G6(smc) average_r S/N alpha se
Eng1      0.84      0.85    0.96      0.42 5.9    0.017
Eng2      0.84      0.86    0.96      0.43 6.0    0.016
Eng3      0.84      0.85    0.95      0.41 5.6    0.017
Eng4      0.85      0.86    0.96      0.44 6.4    0.016
Eng5      0.86      0.87    0.96      0.46 6.8    0.015
Eng6      0.83      0.85    0.93      0.41 5.7    0.019
Eng7      0.85      0.86    0.93      0.44 6.3    0.016
Eng8      0.85      0.87    0.94      0.45 6.5    0.016
Eng9      0.84      0.85    0.90      0.42 5.7    0.017

Item statistics 
       n raw.r std.r r.cor r.drop mean sd
Eng1 212  0.70  0.74  0.71   0.62   88 12
Eng2 212  0.69  0.71  0.68   0.59   88 14
Eng3 212  0.76  0.79  0.77   0.70   82 11
Eng4 212  0.71  0.65  0.62   0.58   61 19
Eng5 212  0.64  0.59  0.55   0.50   77 17
Eng6 212  0.83  0.78  0.78   0.77   71 15
Eng7 212  0.64  0.67  0.68   0.53   84 14
Eng8 212  0.58  0.62  0.63   0.49   90 11
Eng9 212  0.74  0.78  0.79   0.67   87 11

In the next post, we will look at how we can combine the team-level engagement survey data from the external survey provider with in-house team-level demographic data to predict employee engagement in the organization.

The complete data file and source code are available on GitHub.
