Over a half century after transformative civil rights laws such as Title VII of the Civil Rights Act of 1964 made discrimination illegal, America is still grappling with its history of racial injustice and the profound ongoing impact of systemic discrimination.
Because the primary responsibility for enforcing anti-discrimination laws lies with individual workers, who must file complaints with their employer or a government agency, employees may be deterred from reporting unequal treatment based on color and gender by fear of retaliation, the difficulty of proving discrimination factually, and other hurdles.
In this capstone project, we seek to find out whether implicit or perceived discrimination exists in workplaces, using data from an established annual government survey to explore whether ethnicity and gender affect the degree of supervisory support for employees working in the federal government.
In our exploratory analysis of how the degree of supervisory support differs across race and gender, we chose the following indicators, which are consistently highly valued by employees across organizations:
Supervisor’s support for work-life balance - which plays a significant role in ensuring employees are able to meet both their work and family responsibilities.
Supervisor’s support for employee development - which is essential for improvements in productivity and worker engagement at work.
For our data science project, we loaded the following packages, using both the Tidyverse and Base R approaches.
pacman::p_load(dplyr, tidyverse, lubridate,
               modelr, broom,
               rvest,
               MASS, Hmisc, car, psych,
               ggthemes, scales, ggfortify,
               jtools, huxtable, interactions,
               DT, ggstance,
               knitr, kableExtra, effects, table1)
The dataset for this project originates from the Office of Personnel Management’s Federal Employee Viewpoint Survey. It includes 120 perception questions with responses ranging from 1 = Strongly Disagree to 5 = Strongly Agree. Out of 1,410,610 employees, a total of 624,800 completed the survey, which corresponds to a response rate of 44.3%.
data <- read_csv("FEVS_2020_PRDF.csv")
During the survey rollout, federal employees were asked to rate their interactions and interpersonal relationships with their supervisors, and their perception of how much support they receive from them. Two of the seven questions in this section (Q19 & Q21) will serve as our outcome variables.
Q19. My supervisor supports my need to balance work and other life issues.
Q21. Supervisors in my work unit support employee development.
The dataset also contains eight demographic items. Respondents were asked the following questions:
1. Please select the racial category or categories with which you most closely identify.
A. Black or African American
B. White
C. Asian
D. Other Groups Collapsed for Privacy
2. Are you of Hispanic, Latino, or Spanish origin?
A. Yes
B. No
3. Are you an individual with a disability?
A. Yes
B. No
4. What is your age group?
A. Under 40
B. 40 or Older
5. What is your supervisory status?
A. Non-Supervisor/Team Leader
B. Supervisor/Manager/Executive
6. How long have you been with the Federal Government (excluding military service)?
A. Ten years or fewer
B. Eleven to 20 years
C. More than 20 years
7. Are you Male or Female?
A. Male
B. Female
8. What is your US military service status?
A. Military Service
B. No Prior Military Service
Out of the 8 questions, DRNO (categorical) will serve as our key predictor variable, whereas the other five (excluding military and supervisory status) will be used as control variables. The binary variable gender is used as our moderator.
We start by checking the values of Cronbach’s alpha for the groups of questions from which we wish to create new predictor variables.
Given that the values of alpha exceed the threshold of 0.8 for these groups, we subsequently created two variables (work_experience and work_satisfaction) that contain the mean value of the items for each variable.
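The reliability check described above can be sketched in base R as follows. The original call is not shown in the source (the loaded psych package provides `psych::alpha()` for this purpose), so the helper `cronbach_alpha()` and the simulated `toy_items` matrix below are illustrative assumptions, not the project's code or data.

```r
# Cronbach's alpha for a set of items (rows = respondents, columns = items).
# Hypothetical helper for illustration; not taken from the original project.
cronbach_alpha <- function(items) {
  items <- na.omit(as.matrix(items))   # listwise deletion of incomplete rows
  k <- ncol(items)
  item_var  <- apply(items, 2, var)    # variance of each item
  total_var <- var(rowSums(items))     # variance of the summed scale
  (k / (k - 1)) * (1 - sum(item_var) / total_var)
}

# Simulated 5-point responses driven by one common trait (not FEVS data).
set.seed(42)
n <- 500
trait <- rnorm(n)
toy_items <- sapply(1:6, function(i) {
  raw <- trait + rnorm(n, sd = 0.7)
  as.integer(cut(raw, breaks = c(-Inf, -1.5, -0.5, 0.5, 1.5, Inf)))
})
colnames(toy_items) <- paste0("item", 1:6)

alpha_toy <- cronbach_alpha(toy_items)
alpha_toy > 0.8   # compare against the 0.8 threshold used in the text
```

On the survey data, the same function would be applied to each item group (for example the Q33 to Q38 columns) before averaging the items into a composite score.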
Also, we recoded the gender variable, with 0 being Female and 1 being Male. White American is set as the reference group for the discussion of results.
For the purpose of constructing the logistic regression models, we also created a binary outcome variable for each survey question of interest (Q19 & Q21), where scores above 3 indicate the employee’s agreement with the statement that they felt supported by their supervisors for work-life balance and employee development, while scores of 3 and below indicate otherwise.
data_cleaned <- data %>%
  mutate(
    # Composite scores: row means of each item group
    work_satisfaction = rowMeans(data %>%
      dplyr::select(Q33, Q34, Q35, Q36, Q37, Q38), na.rm = TRUE),
    work_experience = rowMeans(data %>%
      dplyr::select(Q1, Q2, Q3, Q4, Q5, Q6, Q7, Q8), na.rm = TRUE),
    supervisor_support = rowMeans(data %>%
      dplyr::select(Q19, Q20, Q21, Q22, Q23, Q24, Q25), na.rm = TRUE),
    leadership = rowMeans(data %>%
      dplyr::select(Q26, Q27, Q28, Q29, Q30, Q31, Q32), na.rm = TRUE),
    # Binary demographic indicators
    gender = ifelse(DSEX == "A", 1, 0),       # 1 = Male, 0 = Female
    ancestry = ifelse(DHISP == "A", 1, 0),    # 1 = Hispanic, Latino, or Spanish origin
    disability = ifelse(DDIS == "A", 1, 0),
    above40 = ifelse(DAGEGRP == "B", 1, 0),
    # Reflected transformations of the left-skewed Q19
    Q19_sqrt = sqrt(max(Q19 + 1) - Q19),
    Q19_log = log10(max(Q19 + 1) - Q19),
    Q19_inverse = 1/(max(Q19 + 1) - Q19),
    # Binary outcomes: 1 = score above 3 (Agree or Strongly Agree)
    Q19_binary = ifelse(Q19 > 3, 1, 0),
    Q21_binary = ifelse(Q21 > 3, 1, 0)
  )
# Label the racial categories and set White as the reference group
data_cleaned$DRNO <- factor(data_cleaned$DRNO,
                            levels = c("A", "B", "C", "D"),
                            labels = c("Black or African American",
                                       "White",
                                       "Asian",
                                       "Others"))
data_cleaned$DRNO <- relevel(as.factor(data_cleaned$DRNO), "White")
| | White (N=363512) | Black or African American (N=71909) | Asian (N=28633) | Others (N=32676) | Total (N=496730) |
---|---|---|---|---|---|
Supervisor Support for Work-Life Balance (Binary) | |||||
Mean (SD) | 0.882 (0.323) | 0.853 (0.354) | 0.874 (0.332) | 0.802 (0.399) | 0.872 (0.334) |
Median [Min, Max] | 1.00 [0, 1.00] | 1.00 [0, 1.00] | 1.00 [0, 1.00] | 1.00 [0, 1.00] | 1.00 [0, 1.00] |
Supervisor Support for Employee Development (Binary) | |||||
Mean (SD) | 0.823 (0.382) | 0.788 (0.409) | 0.827 (0.378) | 0.723 (0.447) | 0.811 (0.391) |
Median [Min, Max] | 1.00 [0, 1.00] | 1.00 [0, 1.00] | 1.00 [0, 1.00] | 1.00 [0, 1.00] | 1.00 [0, 1.00] |
Federal Tenure (Years) | |||||
10 or fewer | 131408 (36.1%) | 25396 (35.3%) | 12925 (45.1%) | 13434 (41.1%) | 183163 (36.9%) |
Between 11 and 20 | 135865 (37.4%) | 24462 (34.0%) | 10305 (36.0%) | 11767 (36.0%) | 182399 (36.7%) |
More than 20 | 96239 (26.5%) | 22051 (30.7%) | 5403 (18.9%) | 7475 (22.9%) | 131168 (26.4%) |
Gender | |||||
Male | 149039 (41.0%) | 43965 (61.1%) | 13120 (45.8%) | 16258 (49.8%) | 222382 (44.8%) |
Female | 214473 (59.0%) | 27944 (38.9%) | 15513 (54.2%) | 16418 (50.2%) | 274348 (55.2%) |
Ancestry (Hispanic, Latino, Spanish) (Binary) | |||||
No | 325735 (89.6%) | 69718 (97.0%) | 28464 (99.4%) | 26951 (82.5%) | 450868 (90.8%) |
Yes | 37777 (10.4%) | 2191 (3.0%) | 169 (0.6%) | 5725 (17.5%) | 45862 (9.2%) |
Disability (Binary) | |||||
No | 312698 (86.0%) | 58819 (81.8%) | 26973 (94.2%) | 26784 (82.0%) | 425274 (85.6%) |
Yes | 50814 (14.0%) | 13090 (18.2%) | 1660 (5.8%) | 5892 (18.0%) | 71456 (14.4%) |
Above 40 Years Old (Binary) | |||||
No | 85621 (23.6%) | 13013 (18.1%) | 7370 (25.7%) | 8470 (25.9%) | 114474 (23.0%) |
Yes | 277891 (76.4%) | 58896 (81.9%) | 21263 (74.3%) | 24206 (74.1%) | 382256 (77.0%) |
Work Satisfaction | |||||
Mean (SD) | 3.71 (0.877) | 3.70 (0.887) | 3.78 (0.826) | 3.51 (0.928) | 3.70 (0.881) |
Median [Min, Max] | 3.83 [1.00, 5.00] | 3.83 [1.00, 5.00] | 3.83 [1.00, 5.00] | 3.67 [1.00, 5.00] | 3.83 [1.00, 5.00] |
Missing | 120 (0.0%) | 45 (0.1%) | 17 (0.1%) | 17 (0.1%) | 199 (0.0%) |
Work Experience | |||||
Mean (SD) | 3.90 (0.783) | 3.89 (0.805) | 3.99 (0.731) | 3.74 (0.848) | 3.89 (0.789) |
Median [Min, Max] | 4.00 [1.00, 5.00] | 4.00 [1.00, 5.00] | 4.00 [1.00, 5.00] | 3.88 [1.00, 5.00] | 4.00 [1.00, 5.00] |
Missing | 69 (0.0%) | 25 (0.0%) | 6 (0.0%) | 6 (0.0%) | 106 (0.0%) |
In addition, the outcome variables (Q19 & Q21) are plotted in histograms to visualize the distributions within each ethnicity.
(Note: The extreme left-skewness of the outcome variables renders a multiple linear regression model unsuitable; even after attempts to transform the outcome variables, the skewness remained. Hence, we model the data using logistic and parameterized ordinal logistic regression.)
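To illustrate why transformation may not rescue a 5-point outcome, the sketch below computes the moment-based skewness of a toy rating vector before and after a reflected square-root transform of the same form as Q19_sqrt. The `skewness()` helper and the response counts are assumptions made for illustration; they are not the FEVS distribution.

```r
# Moment-based sample skewness (negative = left-skewed).
# Hypothetical helper for illustration.
skewness <- function(x) {
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}

# Toy 5-point ratings concentrated on 4 and 5 (illustrative counts, not FEVS data).
rating <- rep(1:5, times = c(3, 4, 8, 35, 50))

# Reflect so the long tail points right, then take the square root,
# mirroring the form of Q19_sqrt in the cleaning code.
rating_sqrt <- sqrt(max(rating + 1) - rating)

skew_raw  <- skewness(rating)
skew_sqrt <- skewness(rating_sqrt)
c(raw = skew_raw, reflected_sqrt = skew_sqrt)
```

In this toy case the reflection flips the sign of the skew, but the transformed variable remains clearly skewed because most responses still sit on a single value. This is in line with the decision above to model the ratings as binary or ordinal outcomes.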
For the preparation of the model, we created and ran a correlation matrix, to see how our variables of interest (within the model) are related.
data_cleaned %>%
  dplyr::select(gender, ancestry, disability, above40, work_experience, work_satisfaction, Q19, Q21) %>%
  as.matrix(.) %>%
  Hmisc::rcorr(.) %>%
  broom::tidy(.) %>%
  rename(`Variable 1` = column1,
         `Variable 2` = column2,
         Correlation = estimate) %>%
  mutate(Abs_correlation = abs(Correlation)) %>%
  DT::datatable(options = list(scrollX = TRUE)) %>%
  formatRound(columns = c("Correlation", "p.value", "Abs_correlation"),
              digits = 3)
For this project, 2 different regression models are explored. The first part focuses on the logistic regression model, in which we regress the binary outcomes of the likelihood of receiving supervisor’s support on the demographic and control variables. The outcomes are:
(1) work-life balance (Q19_binary) and (2) employee development (Q21_binary).
The second part takes into account the ordinal ratings of the response variables. Hence, we use the parameterized ordinal logistic regression, which regresses the ordinal outcomes of receiving supervisor’s support on the same predictors: (1) work-life balance (Q19) and (2) employee development (Q21).
For each model, we will first show the model equation, followed by the regression output.
\[ \begin{aligned} \operatorname{Q19\_binary} &\sim Bernoulli\left(\operatorname{prob}_{\operatorname{Q19\_binary} = \operatorname{1}}= \hat{P}\right) \\ \log\left[ \frac { \hat{P} }{ 1 - \hat{P} } \right] &= \alpha + \beta_{1}(\operatorname{DRNO}_{\operatorname{Black\ or\ African\ American}}) + \beta_{2}(\operatorname{DRNO}_{\operatorname{Asian}}) + \beta_{3}(\operatorname{DRNO}_{\operatorname{Others}})\ + \\ &\quad \beta_{4}(\operatorname{gender}) + \beta_{5}(\operatorname{ancestry}) + \beta_{6}(\operatorname{disability}) + \beta_{7}(\operatorname{above40})\ + \\ &\quad \beta_{8}(\operatorname{DFEDTEN}_{\operatorname{Between\ 11\ and\ 20}}) + \beta_{9}(\operatorname{DFEDTEN}_{\operatorname{More\ than\ 20}}) + \beta_{10}(\operatorname{work\_experience}) + \\ &\quad \beta_{11}(\operatorname{work\_satisfaction}) + \epsilon \end{aligned} \]
model4_logit <- final_cleaned_data %>%
  glm(Q19_binary ~ DRNO + gender + ancestry + disability + above40 + DFEDTEN + work_experience
      + work_satisfaction,
      family = binomial(link = "logit"),
      data = .)
summary(model4_logit)
##
## Call:
## glm(formula = Q19_binary ~ DRNO + gender + ancestry + disability +
## above40 + DFEDTEN + work_experience + work_satisfaction,
## family = binomial(link = "logit"), data = .)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -3.1697 0.1667 0.2939 0.4369 2.5054
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -3.922664 0.025169 -155.851 < 2e-16 ***
## DRNOBlack or African American -0.316700 0.014018 -22.592 < 2e-16 ***
## DRNOAsian -0.391539 0.021487 -18.222 < 2e-16 ***
## DRNOOthers -0.405717 0.017687 -22.938 < 2e-16 ***
## gender1 0.085865 0.010143 8.466 < 2e-16 ***
## ancestry -0.445471 0.015777 -28.235 < 2e-16 ***
## disability -0.107923 0.013512 -7.987 1.38e-15 ***
## above40 -0.152872 0.013233 -11.552 < 2e-16 ***
## DFEDTENBetween 11 and 20 -0.030155 0.012132 -2.486 0.0129 *
## DFEDTENMore than 20 -0.002496 0.014330 -0.174 0.8617
## work_experience 0.889674 0.009438 94.269 < 2e-16 ***
## work_satisfaction 0.887080 0.008831 100.455 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 379657 on 495528 degrees of freedom
## Residual deviance: 277406 on 495517 degrees of freedom
## AIC: 277430
##
## Number of Fisher Scoring iterations: 6
\[ \begin{aligned} \operatorname{Q21\_binary} &\sim Bernoulli\left(\operatorname{prob}_{\operatorname{Q21\_binary} = \operatorname{1}}= \hat{P}\right) \\ \log\left[ \frac { \hat{P} }{ 1 - \hat{P} } \right] &= \alpha + \beta_{1}(\operatorname{DRNO}_{\operatorname{Black\ or\ African\ American}}) + \beta_{2}(\operatorname{DRNO}_{\operatorname{Asian}}) + \beta_{3}(\operatorname{DRNO}_{\operatorname{Others}})\ + \\ &\quad \beta_{4}(\operatorname{gender}) + \beta_{5}(\operatorname{ancestry}) + \beta_{6}(\operatorname{disability}) + \beta_{7}(\operatorname{above40})\ + \\ &\quad \beta_{8}(\operatorname{DFEDTEN}_{\operatorname{Between\ 11\ and\ 20}}) + \beta_{9}(\operatorname{DFEDTEN}_{\operatorname{More\ than\ 20}}) + \beta_{10}(\operatorname{work\_experience})\ + \\ &\quad \beta_{11}(\operatorname{work\_satisfaction}) + \epsilon \end{aligned} \]
model6_logit <- final_cleaned_data %>%
  glm(Q21_binary ~ DRNO + gender + ancestry + disability + above40 + DFEDTEN + work_experience
      + work_satisfaction,
      family = binomial(link = "logit"),
      data = .)
summary(model6_logit)
##
## Call:
## glm(formula = Q21_binary ~ DRNO + gender + ancestry + disability +
## above40 + DFEDTEN + work_experience + work_satisfaction,
## family = binomial(link = "logit"), data = .)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -3.1966 0.1426 0.3000 0.4855 3.0867
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -5.932296 0.026931 -220.279 < 2e-16 ***
## DRNOBlack or African American -0.305987 0.012976 -23.581 < 2e-16 ***
## DRNOAsian -0.302837 0.019960 -15.172 < 2e-16 ***
## DRNOOthers -0.392240 0.016828 -23.309 < 2e-16 ***
## gender1 0.098436 0.009325 10.556 < 2e-16 ***
## ancestry -0.420589 0.014941 -28.149 < 2e-16 ***
## disability -0.159582 0.012600 -12.666 < 2e-16 ***
## above40 -0.296497 0.012267 -24.170 < 2e-16 ***
## DFEDTENBetween 11 and 20 -0.019117 0.011138 -1.716 0.0861 .
## DFEDTENMore than 20 0.058096 0.013110 4.431 9.36e-06 ***
## work_experience 1.206585 0.009274 130.110 < 2e-16 ***
## work_satisfaction 1.000478 0.008310 120.392 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 479779 on 495528 degrees of freedom
## Residual deviance: 316893 on 495517 degrees of freedom
## AIC: 316917
##
## Number of Fisher Scoring iterations: 6
Beyond the differences in supervisory support for work-life balance and employee development across racial groups, the literature has shown the persistence of workplace gender segregation in the United States. Hence, we include interaction terms between each ethnicity group and the binary gender variable in an attempt to draw more insight from our data.
\[ \begin{aligned} \operatorname{Q19\_binary} &\sim Bernoulli\left(\operatorname{prob}_{\operatorname{Q19\_binary} = \operatorname{1}}= \hat{P}\right) \\ \log\left[ \frac { \hat{P} }{ 1 - \hat{P} } \right] &= \alpha + \beta_{1}(\operatorname{DRNO}_{\operatorname{Black\ or\ African\ American}}) + \beta_{2}(\operatorname{DRNO}_{\operatorname{Asian}}) + \beta_{3}(\operatorname{DRNO}_{\operatorname{Others}})\ + \\ &\quad \beta_{4}(\operatorname{gender}_{\operatorname{1}}) + \beta_{5}(\operatorname{ancestry}) + \beta_{6}(\operatorname{disability}) + \beta_{7}(\operatorname{above40})\ + \\ &\quad \beta_{8}(\operatorname{DFEDTEN}_{\operatorname{Between\ 11\ and\ 20}}) + \beta_{9}(\operatorname{DFEDTEN}_{\operatorname{More\ than\ 20}}) + \beta_{10}(\operatorname{work\_experience})\ + \\ &\quad \beta_{11}(\operatorname{work\_satisfaction})\ + \beta_{12}(\operatorname{DRNO}_{\operatorname{Black\ or\ African\ American}} \times \operatorname{gender}_{\operatorname{1}})\ + \\ &\quad \beta_{13}(\operatorname{DRNO}_{\operatorname{Asian}} \times \operatorname{gender}_{\operatorname{1}}) + \beta_{14}(\operatorname{DRNO}_{\operatorname{Others}} \times \operatorname{gender}_{\operatorname{1}}) + \epsilon \end{aligned} \]
model10_logit <- final_cleaned_data %>%
  glm(Q19_binary ~ DRNO + gender + DRNO*gender + ancestry + disability + above40 + DFEDTEN
      + work_experience + work_satisfaction,
      family = binomial(link = "logit"),
      data = .)
summary(model10_logit)
##
## Call:
## glm(formula = Q19_binary ~ DRNO + gender + DRNO * gender + ancestry +
## disability + above40 + DFEDTEN + work_experience + work_satisfaction,
## family = binomial(link = "logit"), data = .)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -3.1692 0.1668 0.2939 0.4369 2.4580
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -3.920903 0.025540 -153.517 < 2e-16
## DRNOBlack or African American -0.292063 0.018056 -16.175 < 2e-16
## DRNOAsian -0.358796 0.031148 -11.519 < 2e-16
## DRNOOthers -0.525128 0.024434 -21.492 < 2e-16
## gender1 0.078245 0.012122 6.455 1.08e-10
## ancestry -0.448058 0.015782 -28.390 < 2e-16
## disability -0.108655 0.013516 -8.039 9.05e-16
## above40 -0.152209 0.013237 -11.499 < 2e-16
## DFEDTENBetween 11 and 20 -0.031006 0.012135 -2.555 0.01062
## DFEDTENMore than 20 -0.003816 0.014343 -0.266 0.79022
## work_experience 0.890307 0.009440 94.317 < 2e-16
## work_satisfaction 0.887473 0.008831 100.491 < 2e-16
## DRNOBlack or African American:gender1 -0.074954 0.028480 -2.632 0.00849
## DRNOAsian:gender1 -0.065146 0.042646 -1.528 0.12661
## DRNOOthers:gender1 0.250279 0.035395 7.071 1.54e-12
##
## (Intercept) ***
## DRNOBlack or African American ***
## DRNOAsian ***
## DRNOOthers ***
## gender1 ***
## ancestry ***
## disability ***
## above40 ***
## DFEDTENBetween 11 and 20 *
## DFEDTENMore than 20
## work_experience ***
## work_satisfaction ***
## DRNOBlack or African American:gender1 **
## DRNOAsian:gender1
## DRNOOthers:gender1 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 379657 on 495528 degrees of freedom
## Residual deviance: 277339 on 495514 degrees of freedom
## AIC: 277369
##
## Number of Fisher Scoring iterations: 6
\[ \begin{aligned} \operatorname{Q21\_binary} &\sim Bernoulli\left(\operatorname{prob}_{\operatorname{Q21\_binary} = \operatorname{1}}= \hat{P}\right) \\ \log\left[ \frac { \hat{P} }{ 1 - \hat{P} } \right] &= \alpha + \beta_{1}(\operatorname{DRNO}_{\operatorname{Black\ or\ African\ American}}) + \beta_{2}(\operatorname{DRNO}_{\operatorname{Asian}}) + \beta_{3}(\operatorname{DRNO}_{\operatorname{Others}})\ + \\ &\quad \beta_{4}(\operatorname{gender}_{\operatorname{1}}) + \beta_{5}(\operatorname{ancestry}) + \beta_{6}(\operatorname{disability}) + \beta_{7}(\operatorname{above40})\ + \\ &\quad \beta_{8}(\operatorname{DFEDTEN}_{\operatorname{Between\ 11\ and\ 20}}) + \beta_{9}(\operatorname{DFEDTEN}_{\operatorname{More\ than\ 20}}) + \beta_{10}(\operatorname{work\_experience}) + \\ &\quad \beta_{11}(\operatorname{work\_satisfaction})\ + \beta_{12}(\operatorname{DRNO}_{\operatorname{Black\ or\ African\ American}} \times \operatorname{gender}_{\operatorname{1}})\ + \\ &\quad \beta_{13}(\operatorname{DRNO}_{\operatorname{Asian}} \times \operatorname{gender}_{\operatorname{1}}) + \beta_{14}(\operatorname{DRNO}_{\operatorname{Others}} \times \operatorname{gender}_{\operatorname{1}}) + \epsilon \end{aligned} \]
model12_logit <- final_cleaned_data %>%
  glm(Q21_binary ~ DRNO + gender + DRNO*gender + ancestry + disability + above40 + DFEDTEN
      + work_experience + work_satisfaction,
      family = binomial(link = "logit"),
      data = .)
summary(model12_logit)
##
## Call:
## glm(formula = Q21_binary ~ DRNO + gender + DRNO * gender + ancestry +
## disability + above40 + DFEDTEN + work_experience + work_satisfaction,
## family = binomial(link = "logit"), data = .)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -3.2005 0.1431 0.2997 0.4850 3.0523
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -5.920645 0.027199 -217.682 < 2e-16
## DRNOBlack or African American -0.320582 0.016658 -19.245 < 2e-16
## DRNOAsian -0.280925 0.028869 -9.731 < 2e-16
## DRNOOthers -0.502941 0.023338 -21.551 < 2e-16
## gender1 0.077761 0.011055 7.034 2.01e-12
## ancestry -0.422987 0.014945 -28.302 < 2e-16
## disability -0.161012 0.012603 -12.776 < 2e-16
## above40 -0.296524 0.012271 -24.165 < 2e-16
## DFEDTENBetween 11 and 20 -0.019171 0.011141 -1.721 0.0853
## DFEDTENMore than 20 0.058874 0.013122 4.487 7.23e-06
## work_experience 1.206680 0.009275 130.093 < 2e-16
## work_satisfaction 1.000578 0.008311 120.397 < 2e-16
## DRNOBlack or African American:gender1 0.027606 0.026455 1.044 0.2967
## DRNOAsian:gender1 -0.045309 0.039636 -1.143 0.2530
## DRNOOthers:gender1 0.228625 0.033658 6.793 1.10e-11
##
## (Intercept) ***
## DRNOBlack or African American ***
## DRNOAsian ***
## DRNOOthers ***
## gender1 ***
## ancestry ***
## disability ***
## above40 ***
## DFEDTENBetween 11 and 20 .
## DFEDTENMore than 20 ***
## work_experience ***
## work_satisfaction ***
## DRNOBlack or African American:gender1
## DRNOAsian:gender1
## DRNOOthers:gender1 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 479779 on 495528 degrees of freedom
## Residual deviance: 316843 on 495514 degrees of freedom
## AIC: 316873
##
## Number of Fisher Scoring iterations: 6
In this section, we interpret the results of the 4 models specified in Chapter 5A, which show the relationship between (1) ethnicity and (2) the interaction effects of gender, and the binary outcomes of the likelihood of receiving supervisor’s support on:
(1) work-life balance (Q19_binary) and (2) employee development (Q21_binary).
| | Model 1 Main Effects - Work-Life Balance | Model 3 With Interactions - Work-Life Balance | Model 2 Main Effects - Employee's Development | Model 4 With Interactions - Employee's Development |
---|---|---|---|---|
Intercept | 0.020 *** | 0.020 *** | 0.003 *** | 0.003 *** |
| | ([0.019, 0.021], p = 0.000) | ([0.019, 0.021], p = 0.000) | ([0.003, 0.003], p = 0.000) | ([0.003, 0.003], p = 0.000) |
Black or African American | 0.729 *** | 0.747 *** | 0.736 *** | 0.726 *** |
| | ([0.709, 0.749], p = 0.000) | ([0.721, 0.774], p = 0.000) | ([0.718, 0.755], p = 0.000) | ([0.702, 0.750], p = 0.000) |
Asian | 0.676 *** | 0.699 *** | 0.739 *** | 0.755 *** |
| | ([0.648, 0.705], p = 0.000) | ([0.657, 0.742], p = 0.000) | ([0.710, 0.768], p = 0.000) | ([0.714, 0.799], p = 0.000) |
Others | 0.666 *** | 0.591 *** | 0.676 *** | 0.605 *** |
| | ([0.644, 0.690], p = 0.000) | ([0.564, 0.620], p = 0.000) | ([0.654, 0.698], p = 0.000) | ([0.578, 0.633], p = 0.000) |
Gender (0 = Female, 1 = Male) | 1.090 *** | 1.081 *** | 1.103 *** | 1.081 *** |
| | ([1.068, 1.112], p = 0.000) | ([1.056, 1.107], p = 0.000) | ([1.083, 1.124], p = 0.000) | ([1.058, 1.105], p = 0.000) |
Ancestry | 0.641 *** | 0.639 *** | 0.657 *** | 0.655 *** |
| | ([0.621, 0.661], p = 0.000) | ([0.619, 0.659], p = 0.000) | ([0.638, 0.676], p = 0.000) | ([0.636, 0.675], p = 0.000) |
Disability | 0.898 *** | 0.897 *** | 0.853 *** | 0.851 *** |
| | ([0.874, 0.922], p = 0.000) | ([0.874, 0.921], p = 0.000) | ([0.832, 0.874], p = 0.000) | ([0.831, 0.873], p = 0.000) |
Age above 40 years old (0 = No, 1 = Yes) | 0.858 *** | 0.859 *** | 0.743 *** | 0.743 *** |
| | ([0.836, 0.881], p = 0.000) | ([0.837, 0.881], p = 0.000) | ([0.726, 0.762], p = 0.000) | ([0.726, 0.761], p = 0.000) |
Federal Tenure between 11 and 20 years | 0.970 * | 0.969 * | 0.981 | 0.981 |
| | ([0.947, 0.994], p = 0.013) | ([0.947, 0.993], p = 0.011) | ([0.960, 1.003], p = 0.086) | ([0.960, 1.003], p = 0.085) |
Federal Tenure more than 20 years | 0.998 | 0.996 | 1.060 *** | 1.061 *** |
| | ([0.970, 1.026], p = 0.862) | ([0.969, 1.025], p = 0.790) | ([1.033, 1.087], p = 0.000) | ([1.034, 1.088], p = 0.000) |
Work Experience Rating | 2.434 *** | 2.436 *** | 3.342 *** | 3.342 *** |
| | ([2.390, 2.480], p = 0.000) | ([2.391, 2.481], p = 0.000) | ([3.282, 3.403], p = 0.000) | ([3.282, 3.404], p = 0.000) |
Work Satisfaction Rating | 2.428 *** | 2.429 *** | 2.720 *** | 2.720 *** |
| | ([2.386, 2.470], p = 0.000) | ([2.387, 2.471], p = 0.000) | ([2.676, 2.764], p = 0.000) | ([2.676, 2.765], p = 0.000) |
Black or African American X Gender | | 0.928 ** | | 1.028 |
| | | ([0.877, 0.981], p = 0.008) | | ([0.976, 1.083], p = 0.297) |
Asian X Gender | | 0.937 | | 0.956 |
| | | ([0.862, 1.019], p = 0.127) | | ([0.884, 1.033], p = 0.253) |
Others X Gender | | 1.284 *** | | 1.257 *** |
| | | ([1.198, 1.377], p = 0.000) | | ([1.177, 1.343], p = 0.000) |
N | 495529 | 495529 | 495529 | 495529 |
AIC | 277429.956 | 277368.838 | 316916.868 | 316873.448 |
BIC | 277563.317 | 277535.538 | 317050.229 | 317040.149 |
Pseudo R2 | 0.348 | 0.349 | 0.452 | 0.452 |
*** p < 0.001; ** p < 0.01; * p < 0.05. |
The above regression table shows the exponentiated coefficients (odds ratios) for all variables, so we interpret the results on the odds scale rather than on the probability scale.
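As a worked example of the exponentiation, the sketch below converts one log-odds coefficient from the Model 1 summary output into an odds ratio with a 95% interval. It assumes the intervals in the table are Wald intervals (estimate plus or minus 1.96 standard errors on the log-odds scale); the variable names are illustrative.

```r
# Coefficient and standard error for "DRNOBlack or African American",
# copied from the Model 1 (Q19_binary) summary output.
est <- -0.316700
se  <-  0.014018

odds_ratio <- exp(est)                                   # log-odds -> odds ratio
wald_ci    <- exp(est + c(-1, 1) * qnorm(0.975) * se)    # assumed 95% Wald interval
pct_change <- (odds_ratio - 1) * 100                     # % change in odds vs. White employees

round(c(OR = odds_ratio, lower = wald_ci[1], upper = wald_ci[2]), 3)
round(pct_change, 1)
```

Under that assumption, the result should agree with the Black or African American row of Model 1 in the table, and the percent change is the "decrease in odds" quoted in the interpretation.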
Based on logistic regression Model 1 and Model 2, we observe that non-White employees (Black/African American, Asian and Others) are less likely than White employees to agree with the statement that their supervisors support their need for work-life balance and employee development.
In particular, as seen from Model 1, Black or African American federal employees have about 27% lower odds than their White colleagues of receiving support from their supervisors to balance their work and other life priorities, holding other variables constant. Employees in the Asian and Others racial categories see a greater decrease in odds, of 32% and 33% respectively, relative to White employees.
As for Model 2, the odds of Black or African American and Asian federal employees receiving support from their supervisors for their development are approximately 26% lower than those of their White colleagues, holding other variables constant. The odds for employees in the Others racial category are lower still, by 32%, compared with their White colleagues.
When considering the gender interaction effects in Model 3, the odds for White male employees receiving support from their supervisors to balance work and other life priorities are 8% higher than for White female employees. The interaction terms are ratios of odds ratios: they compare the male-to-female odds ratio within a racial group against that of White employees. For Black/African American and Asian employees, this ratio is 6-7% smaller than among White employees (significant only for Black/African American employees), which leaves the male-to-female odds ratio within these groups close to 1 (about 1.081 × 0.928 ≈ 1.00 and 1.081 × 0.937 ≈ 1.01 respectively). For employees in the Others racial category, the male-to-female odds ratio is 28% larger than among White employees, so the odds for male employees gaining support for work-life balance are roughly 39% higher than for their female colleagues (1.081 × 1.284 ≈ 1.39), the widest gender gap among the 4 racial groups.
Lastly, for Model 4, the odds for White male employees receiving support from their supervisors for employee development are 8% higher than for White female employees. For Black/African American and Asian employees, the interaction terms are not significant, meaning that the gender gap in these groups does not differ significantly from the gap among White employees. For employees in the Others racial category, the male-to-female odds ratio is 26% larger than among White employees, so the odds for male employees gaining support for development opportunities are roughly 36% higher than for their female colleagues (1.081 × 1.257 ≈ 1.36), which again is the widest gender gap among the 4 racial groups.
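Because an interaction coefficient is added to the gender main effect on the log-odds scale, the male-to-female odds ratio within each racial group can be recovered from the Model 3 output as sketched below. The variable names are illustrative, and these are point estimates only: testing each within-group gap would require the coefficient covariance matrix, which is not shown in the source.

```r
# Log-odds coefficients copied from the Model 3 (Q19_binary) summary output.
b_gender <- 0.078245                  # male vs. female among White employees
b_inter  <- c(Black  = -0.074954,     # DRNO x gender interaction terms
              Asian  = -0.065146,
              Others =  0.250279)

# Male-to-female odds ratio within each group:
# exp(gender main effect + that group's interaction term).
or_male_vs_female <- c(White = exp(b_gender), exp(b_gender + b_inter))
round(or_male_vs_female, 3)
```

The same arithmetic applies to Model 4 by substituting its gender and interaction coefficients.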
For ease of comparing the magnitudes of the effects of the other independent variables on the outcome variables, we also plotted the raw regression coefficients in the following graphs. Estimates above 0 imply an increase in the odds (i.e. a positive relationship) of receiving support for work-life balance or employee development due to the variable, while those below 0 indicate otherwise.
In terms of the goodness of fit, the results of the analysis suggest that adding the interaction terms has a negligible effect on the explanatory power of the model, as the pseudo R-squared values of Model 1 and Model 2 are similar to those of Model 3 and Model 4 respectively.
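The fit comparison can be checked from the deviances printed in the summary outputs. The sketch below assumes that the "Pseudo R2" row in the table is the Cragg-Uhler (Nagelkerke) statistic and that, for ungrouped binary data, the deviance equals minus twice the log-likelihood; `nagelkerke()` is a hypothetical helper written for illustration.

```r
n        <- 495529   # observations (N row of the regression table)
dev_null <- 379657   # null deviance, work-life balance models
dev_main <- 277406   # residual deviance, Model 1 (main effects)
dev_int  <- 277339   # residual deviance, Model 3 (with interactions)

# Cragg-Uhler (Nagelkerke) pseudo R-squared from deviances.
nagelkerke <- function(dev_model, dev_null, n) {
  cox_snell <- 1 - exp(-(dev_null - dev_model) / n)
  cox_snell / (1 - exp(-dev_null / n))
}
r2_main <- nagelkerke(dev_main, dev_null, n)
r2_int  <- nagelkerke(dev_int,  dev_null, n)

# Likelihood-ratio test for the 3 added interaction terms.
lr_stat <- dev_main - dev_int
lr_p    <- pchisq(lr_stat, df = 3, lower.tail = FALSE)

round(c(r2_main = r2_main, r2_int = r2_int), 3)
c(lr_stat = lr_stat, lr_p = lr_p)
```

Read this way, the pseudo R-squared barely changes, while the likelihood-ratio statistic and the lower AIC and BIC of Model 3 in the table indicate that the interaction terms are statistically detectable in a sample of this size. Both observations are compatible: the improvement is small in magnitude but estimated precisely.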
For the purpose of investigating how the degree of supervisory support for employees in the federal government differs across ethnicity and gender, we will keep the interaction variables in our models.
To see the patterns of interaction, we will visualize the significant interaction effects in the next chapter.
To visualize the logistic regression analysis performed above, we plotted the predicted probabilities across the ethnicity groups. Within each ethnicity group, the predicted probabilities are further differentiated by gender in the later section.
The above figures in sub-section 6A consistently show that the predicted probabilities of White employees receiving support from their supervisor on both work-life balance and employee development are much higher than those of the other racial groups. Differentiated treatment by racial group appears to exist in federal workplaces, where employees in the Others racial category tend to be at a relatively greater disadvantage.
In the second part of the logistic regression model, we will include the gender interaction effects to see how the relationship between ethnicity and supervisor support for work-life balance and employee’s development may change.
Based on the figure above, the predicted probability of White male employees receiving support from their supervisor to pursue work-life balance is higher than that of their White female colleagues. A similar conclusion can be drawn for employees in the Others racial category, with a wider disparity in predicted probabilities.
Comparing across all racial groups, White male employees are most likely to receive support from their supervisors to balance work and other life priorities, and females in the Others racial category are least likely.
Gender effects are not significant among Black or African American and Asian employees.
We observe a similar relationship when the outcome variable is the supervisor’s support for employee development (Q21_binary). The exception is that male Black or African American employees have a significantly higher predicted probability of receiving support for their development than their female colleagues.
In summary, White male employees have consistently higher predicted probabilities of receiving support from supervisors on work-life balance and employee development, relative to males in the other 3 racial groups. No gender difference is observed for Asian employees, whose outcomes are relatively similar, as indicated by overlapping confidence intervals. Females in the Others racial category have notably the lowest predicted probability amongst non-White employees.
Given that the ratings of the response variables of interest are on a 5-point Likert scale, we decided to do an extension of the logistic regression model in Part 1 by modelling the relationship using the ordinal logistic regression.
The scale of the response variables of interest (Q19 and Q21) ranges from: 5 - Strongly Agree, 4 - Agree, 3 - Neither Agree nor Disagree, 2 - Disagree, 1 - Strongly Disagree.
For ordinal logistic regression, the model assumes that none of the independent variables has a disproportionate effect on any rating level of the outcome variable (the proportional odds assumption), which can be tested with the brant package. Besides the proportional odds assumption, it also assumes the absence of multicollinearity, meaning that the independent variables are not strongly correlated with one another, as assessed by the Variance Inflation Factor (VIF). These assumptions will be addressed in Chapter 11 and the Appendix.
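A proportional odds model of the kind described above can be fitted with `MASS::polr()` (MASS is among the packages loaded earlier). The source does not show which function fits the ordinal models, so the use of `polr()` here is an assumption, and the data frame `sim` is simulated for illustration rather than drawn from the survey.

```r
library(MASS)  # ships with R; provides polr()

# Simulated data (not FEVS): a two-level group and a continuous rating.
set.seed(1)
n <- 2000
sim <- data.frame(
  group = factor(sample(c("White", "Other"), n, replace = TRUE),
                 levels = c("White", "Other")),
  satisfaction = runif(n, 1, 5)
)

# Build a 5-point ordered response from a latent logistic variable.
latent <- 1.0 * sim$satisfaction - 0.4 * (sim$group == "Other") + rlogis(n)
sim$rating <- cut(latent, breaks = c(-Inf, 1, 2, 3, 4, Inf),
                  labels = 1:5, ordered_result = TRUE)

# Proportional odds model: one slope per predictor, four cut-points.
fit <- polr(rating ~ group + satisfaction, data = sim, Hess = TRUE)
exp(coef(fit))   # odds ratios, assumed constant across rating thresholds
fit$zeta         # estimated cut-points between adjacent rating levels
```

The single set of slopes shared across all thresholds is exactly what the proportional odds assumption asserts, which is why that assumption needs to be checked before the odds ratios are interpreted.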
For the purpose of this project, we will run our regression on a smaller random sample of the full dataset due to the high computational demand of running the full dataset.
set.seed(110721)
final_cleaned_data_sample <- final_cleaned_data %>%
  sample_frac(.20)
Similar to the logistic regression models in Chapter 5, for each model we will first show the model equations for the parameterized ordinal logistic regression, followed by the regression output.
Q19 and Q21 are the outcome variables for Model 5 and Model 6 respectively, while all other predictor variables are the same for both models.
The model equations for Model 5 and Model 6 are as follows:
\[ \begin{aligned} \log\left[ \frac { P( \operatorname{1} \geq \operatorname{2} ) }{ 1 - P( \operatorname{1} \geq \operatorname{2} ) } \right] &= \alpha_{1} + \beta_{1}(\operatorname{DRNO}_{\operatorname{Black\ or\ African\ American}}) + \beta_{2}(\operatorname{DRNO}_{\operatorname{Asian}}) + \beta_{3}(\operatorname{DRNO}_{\operatorname{Others}})\ + \\ &\quad \beta_{4}(\operatorname{gender}) + \beta_{5}(\operatorname{ancestry}) + \beta_{6}(\operatorname{disability}) + \beta_{7}(\operatorname{above40})\ + \\ &\quad \beta_{8}(\operatorname{DFEDTEN}_{\operatorname{Between\ 11\ and\ 20}}) + \beta_{9}(\operatorname{DFEDTEN}_{\operatorname{More\ than\ 20}}) + \beta_{10}(\operatorname{work\_experience}) + \\ &\quad \beta_{11}(\operatorname{work\_satisfaction}) + \epsilon \\ \log\left[ \frac { P( \operatorname{2} \geq \operatorname{3} ) }{ 1 - P( \operatorname{2} \geq \operatorname{3} ) } \right] &= \alpha_{2} + \beta_{1}(\operatorname{DRNO}_{\operatorname{Black\ or\ African\ American}}) + \beta_{2}(\operatorname{DRNO}_{\operatorname{Asian}}) + \beta_{3}(\operatorname{DRNO}_{\operatorname{Others}})\ + \\ &\quad \beta_{4}(\operatorname{gender}) + \beta_{5}(\operatorname{ancestry}) + \beta_{6}(\operatorname{disability}) + \beta_{7}(\operatorname{above40})\ + \\ &\quad \beta_{8}(\operatorname{DFEDTEN}_{\operatorname{Between\ 11\ and\ 20}}) + \beta_{9}(\operatorname{DFEDTEN}_{\operatorname{More\ than\ 20}}) + \beta_{10}(\operatorname{work\_experience}) + \\ &\quad \beta_{11}(\operatorname{work\_satisfaction}) + \epsilon \\ \log\left[ \frac { P( \operatorname{3} \geq \operatorname{4} ) }{ 1 - P( \operatorname{3} \geq \operatorname{4} ) } \right] &= \alpha_{3} + \beta_{1}(\operatorname{DRNO}_{\operatorname{Black\ or\ African\ American}}) + \beta_{2}(\operatorname{DRNO}_{\operatorname{Asian}}) + \beta_{3}(\operatorname{DRNO}_{\operatorname{Others}})\ + \\ &\quad \beta_{4}(\operatorname{gender}) + \beta_{5}(\operatorname{ancestry}) + \beta_{6}(\operatorname{disability}) + 
\beta_{7}(\operatorname{above40})\ + \\ &\quad \beta_{8}(\operatorname{DFEDTEN}_{\operatorname{Between\ 11\ and\ 20}}) + \beta_{9}(\operatorname{DFEDTEN}_{\operatorname{More\ than\ 20}}) + \beta_{10}(\operatorname{work\_experience}) + \\ &\quad \beta_{11}(\operatorname{work\_satisfaction}) + \epsilon \\ \log\left[ \frac { P( \operatorname{4} \geq \operatorname{5} ) }{ 1 - P( \operatorname{4} \geq \operatorname{5} ) } \right] &= \alpha_{4} + \beta_{1}(\operatorname{DRNO}_{\operatorname{Black\ or\ African\ American}}) + \beta_{2}(\operatorname{DRNO}_{\operatorname{Asian}}) + \beta_{3}(\operatorname{DRNO}_{\operatorname{Others}})\ + \\ &\quad \beta_{4}(\operatorname{gender}) + \beta_{5}(\operatorname{ancestry}) + \beta_{6}(\operatorname{disability}) + \beta_{7}(\operatorname{above40})\ + \\ &\quad \beta_{8}(\operatorname{DFEDTEN}_{\operatorname{Between\ 11\ and\ 20}}) + \beta_{9}(\operatorname{DFEDTEN}_{\operatorname{More\ than\ 20}}) + \beta_{10}(\operatorname{work\_experience}) + \\ &\quad \beta_{11}(\operatorname{work\_satisfaction}) + \epsilon \end{aligned} \]
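As a minimal sketch of how a model of this form can be fitted, the example below runs `polr()` from MASS on simulated data. The data frame `sim`, its two predictors and the effect sizes are hypothetical stand-ins, not the survey data. Note that `polr()` parameterizes the model as logit P(Y ≤ k) = ζ_k − η, so a negative coefficient shifts responses towards lower ratings.

```r
library(MASS)  # provides polr()

set.seed(1)
n <- 2000

# Hypothetical stand-ins for two of the predictors (not the survey data)
race <- factor(
  sample(c("White", "Black", "Asian", "Others"), n, replace = TRUE),
  levels = c("White", "Black", "Asian", "Others")
)
gender_num <- rbinom(n, 1, 0.5)
gender <- factor(gender_num)

# Latent score with assumed effects, cut into five ordered ratings
eta <- -0.3 * (race != "White") + 0.1 * gender_num
latent <- eta + rlogis(n)
rating <- cut(latent, breaks = c(-Inf, -2, -1, 0, 1.5, Inf),
              labels = 1:5, ordered_result = TRUE)

sim <- data.frame(rating, race, gender)

# Hess = TRUE stores the Hessian so that summary() can report standard errors
fit <- polr(rating ~ race + gender, data = sim, Hess = TRUE)

# Coefficients first, then the four cutpoints (1|2, 2|3, 3|4, 4|5)
ctable <- coef(summary(fit))
print(round(ctable, 3))
```

The coefficient table has the same layout as the output that follows: one row per predictor level, then one row per cutpoint.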
## Call:
## polr(formula = Q19_factor ~ DRNO + gender + ancestry + disability +
## above40 + DFEDTEN + work_experience + work_satisfaction,
## data = final_cleaned_data_sample, Hess = T)
##
## Coefficients:
## Value Std. Error t value
## DRNOBlack or African American -0.26319 0.01946 -13.5233
## DRNOAsian -0.46271 0.02853 -16.2157
## DRNOOthers -0.28775 0.02657 -10.8295
## gender1 -0.01342 0.01381 -0.9714
## ancestry -0.33118 0.02308 -14.3498
## disability -0.02516 0.01945 -1.2936
## above40 -0.22011 0.01822 -12.0806
## DFEDTENBetween 11 and 20 -0.07496 0.01660 -4.5154
## DFEDTENMore than 20 -0.07814 0.01915 -4.0811
## work_experience 1.11114 0.01498 74.1983
## work_satisfaction 0.76837 0.01327 57.8893
##
## Intercepts:
## Value Std. Error t value
## 1|2 2.2064 0.0406 54.3250
## 2|3 3.1860 0.0396 80.4671
## 3|4 4.2812 0.0402 106.5428
## 4|5 6.7203 0.0440 152.6055
##
## Residual Deviance: 175465.67
## AIC: 175495.67
Term | Value | Std. Error | t value | p value |
---|---|---|---|---|
DRNOBlack or African American | -0.2631899 | 0.0194620 | -13.5232587 | 0.0000000 |
DRNOAsian | -0.4627097 | 0.0285347 | -16.2156671 | 0.0000000 |
DRNOOthers | -0.2877461 | 0.0265707 | -10.8294660 | 0.0000000 |
gender1 | -0.0134190 | 0.0138146 | -0.9713577 | 0.3313702 |
ancestry | -0.3311809 | 0.0230792 | -14.3497575 | 0.0000000 |
disability | -0.0251645 | 0.0194530 | -1.2936020 | 0.1958029 |
above40 | -0.2201116 | 0.0182202 | -12.0806041 | 0.0000000 |
DFEDTENBetween 11 and 20 | -0.0749623 | 0.0166013 | -4.5154464 | 0.0000063 |
DFEDTENMore than 20 | -0.0781408 | 0.0191468 | -4.0811380 | 0.0000448 |
work_experience | 1.1111374 | 0.0149752 | 74.1983470 | 0.0000000 |
work_satisfaction | 0.7683657 | 0.0132730 | 57.8893371 | 0.0000000 |
1|2 | 2.2063853 | 0.0406146 | 54.3249936 | 0.0000000 |
2|3 | 3.1859674 | 0.0395934 | 80.4670665 | 0.0000000 |
3|4 | 4.2811604 | 0.0401826 | 106.5427517 | 0.0000000 |
4|5 | 6.7203044 | 0.0440371 | 152.6054774 | 0.0000000 |
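The p value column above is not part of the default `polr` summary. A minimal sketch of how such a column can be derived, assuming a two-sided test against the standard normal distribution (which agrees with the tabulated values), using two t values copied from the table:

```r
# t values copied from the Model 5 coefficient table
t_value <- c(gender1 = -0.9713577, disability = -1.2936020)

# Two-sided p-value under a standard normal reference distribution
p_value <- 2 * pnorm(abs(t_value), lower.tail = FALSE)

print(round(p_value, 4))
```

Both values are well above 0.05, which is why gender and disability are treated as non-significant in Model 5.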
As seen from the Model 5 regression coefficients table, for employees who are Black/African American, Asian and of Other races, the log odds of giving a higher rating for supervisor’s support for work-life balance are lower by 0.26, 0.46 and 0.29 respectively, compared with their White colleagues. The gender coefficient is not statistically significant (p = 0.33), so across all respondents we find no evidence that males and females differ in how they rate their supervisors.
Term | odds_ratio | 2.5 % | 97.5 % |
---|---|---|---|
DRNOBlack or African American | 0.7685959 | 0.7400658 | 0.7982285 |
DRNOAsian | 0.6295754 | 0.5956470 | 0.6654981 |
DRNOOthers | 0.7499520 | 0.7122602 | 0.7897167 |
gender1 | 0.9866707 | 0.9604328 | 1.0135598 |
ancestry | 0.7180753 | 0.6866128 | 0.7510433 |
disability | 0.9751495 | 0.9389850 | 1.0127761 |
above40 | 0.8024292 | 0.7742700 | 0.8315889 |
DFEDTENBetween 11 and 20 | 0.9277785 | 0.8982854 | 0.9582662 |
DFEDTENMore than 20 | 0.9248342 | 0.8916645 | 0.9601560 |
work_experience | 3.0378117 | 2.9500166 | 3.1283724 |
work_satisfaction | 2.1562395 | 2.1009141 | 2.2130137 |
The table above expresses the same regression results as odds ratios. For employees who are Black/African American, Asian and of Other races, the odds of giving a higher rating for supervisor’s support for work-life balance are 23%, 37% and 25% lower respectively than for White employees, holding all other variables constant. Asian employees are therefore the least likely to give a higher rating of their supervisors’ support for work-life balance. Overall, there appears to be no gender difference in the odds of rating the supervisor’s support for work-life balance, but this is examined further in the models with gender interaction effects.
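The conversion from log odds to odds ratios and percentage changes is a simple exponentiation. The sketch below uses the three race coefficients from the Model 5 table; the short names `Black`, `Asian` and `Others` are labels chosen here for illustration.

```r
# Race coefficients (log odds) copied from the Model 5 coefficient table
log_odds <- c(Black = -0.2631899, Asian = -0.4627097, Others = -0.2877461)

# Odds ratio relative to White employees
odds_ratio <- exp(log_odds)

# Percentage by which the odds of a higher rating are lower
pct_lower <- 100 * (1 - odds_ratio)

print(round(odds_ratio, 3))
print(round(pct_lower))
```

An odds ratio below 1 corresponds to lower odds of giving a higher rating than the reference group.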
## Call:
## polr(formula = Q21_factor ~ DRNO + gender + ancestry + disability +
## above40 + DFEDTEN + work_experience + work_satisfaction,
## data = final_cleaned_data_sample, Hess = T)
##
## Coefficients:
## Value Std. Error t value
## DRNOBlack or African American -0.20692 0.01922 -10.7679
## DRNOAsian -0.32622 0.02846 -11.4633
## DRNOOthers -0.24515 0.02617 -9.3685
## gender1 0.01024 0.01352 0.7573
## ancestry -0.30368 0.02279 -13.3259
## disability -0.05601 0.01910 -2.9330
## above40 -0.26224 0.01781 -14.7267
## DFEDTENBetween 11 and 20 -0.08712 0.01625 -5.3610
## DFEDTENMore than 20 -0.06583 0.01875 -3.5109
## work_experience 1.44967 0.01518 95.4801
## work_satisfaction 0.94038 0.01327 70.8741
##
## Intercepts:
## Value Std. Error t value
## 1|2 3.8038 0.0412 92.4104
## 2|3 5.1378 0.0413 124.4730
## 3|4 6.6291 0.0435 152.5602
## 4|5 9.2229 0.0485 190.0623
##
## Residual Deviance: 182612.56
## AIC: 182642.56
Term | Value | Std. Error | t value | p value |
---|---|---|---|---|
DRNOBlack or African American | -0.2069208 | 0.0192165 | -10.7678632 | 0.0000000 |
DRNOAsian | -0.3262162 | 0.0284573 | -11.4633492 | 0.0000000 |
DRNOOthers | -0.2451483 | 0.0261673 | -9.3685001 | 0.0000000 |
gender1 | 0.0102391 | 0.0135207 | 0.7572956 | 0.4488728 |
ancestry | -0.3036769 | 0.0227884 | -13.3259333 | 0.0000000 |
disability | -0.0560092 | 0.0190966 | -2.9329500 | 0.0033576 |
above40 | -0.2622402 | 0.0178071 | -14.7267140 | 0.0000000 |
DFEDTENBetween 11 and 20 | -0.0871157 | 0.0162498 | -5.3610303 | 0.0000001 |
DFEDTENMore than 20 | -0.0658332 | 0.0187513 | -3.5108533 | 0.0004467 |
work_experience | 1.4496733 | 0.0151830 | 95.4801318 | 0.0000000 |
work_satisfaction | 0.9403800 | 0.0132683 | 70.8741148 | 0.0000000 |
1|2 | 3.8037632 | 0.0411617 | 92.4103530 | 0.0000000 |
2|3 | 5.1377643 | 0.0412761 | 124.4730411 | 0.0000000 |
3|4 | 6.6291067 | 0.0434524 | 152.5602400 | 0.0000000 |
4|5 | 9.2229195 | 0.0485258 | 190.0622589 | 0.0000000 |
As seen from the Model 6 regression coefficients table, for employees who identify as Black or African American, Asian and Others, the log odds of giving a higher rating for supervisor’s support for employee’s development are lower by 0.21, 0.33 and 0.25 respectively, compared with their White colleagues. The log odds are not statistically different between male and female federal employees.
Term | odds_ratio | 2.5 % | 97.5 % |
---|---|---|---|
DRNOBlack or African American | 0.8130840 | 0.7830538 | 0.8442982 |
DRNOAsian | 0.7216491 | 0.6825780 | 0.7630796 |
DRNOOthers | 0.7825885 | 0.7435056 | 0.8237948 |
gender1 | 1.0102917 | 0.9839102 | 1.0373900 |
ancestry | 0.7380993 | 0.7059074 | 0.7717703 |
disability | 0.9455304 | 0.9108588 | 0.9816098 |
above40 | 0.7693262 | 0.7429272 | 0.7966398 |
DFEDTENBetween 11 and 20 | 0.9165711 | 0.8878381 | 0.9462294 |
DFEDTENMore than 20 | 0.9362871 | 0.9025010 | 0.9713344 |
work_experience | 4.2617219 | 4.1368669 | 4.3905517 |
work_satisfaction | 2.5609543 | 2.4953971 | 2.6284533 |
In terms of odds ratios, Model 6 shows that for employees who are Black/African American, Asian and of Other races, the odds of giving a higher Likert scale rating for supervisor’s support for employee’s development are 19%, 28% and 22% lower respectively than for White employees. Asian employees are the least likely to give a higher rating of supervisors’ support for employee development. Overall, there appears to be no gender difference in rating the supervisor’s support for employee development, but this is examined further in the models with gender interaction effects.
To explore the gender interaction effects using the parameterized ordinal logistic regression model, we add interaction terms between each ethnicity group and gender to the equations used in Chapter 7B.
The model equations for Model 7 and Model 8 are as follows:
\[ \begin{aligned} \log\left[ \frac { P( \operatorname{1} \geq \operatorname{2} ) }{ 1 - P( \operatorname{1} \geq \operatorname{2} ) } \right] &= \alpha_{1} + \beta_{1}(\operatorname{DRNO}_{\operatorname{Black\ or\ African\ American}}) + \beta_{2}(\operatorname{DRNO}_{\operatorname{Asian}}) + \beta_{3}(\operatorname{DRNO}_{\operatorname{Others}})\ + \\ &\quad \beta_{4}(\operatorname{gender}) + \beta_{5}(\operatorname{ancestry}) + \beta_{6}(\operatorname{disability}) + \beta_{7}(\operatorname{above40})\ + \\ &\quad \beta_{8}(\operatorname{DFEDTEN}_{\operatorname{Between\ 11\ and\ 20}}) + \beta_{9}(\operatorname{DFEDTEN}_{\operatorname{More\ than\ 20}}) + \beta_{10}(\operatorname{work\_experience})\ + \\ &\quad \beta_{11}(\operatorname{work\_satisfaction})\ + \beta_{12}(\operatorname{DRNO}_{\operatorname{Black\ or\ African\ American}} \times \operatorname{gender})\ + \\ &\quad \beta_{13}(\operatorname{DRNO}_{\operatorname{Asian}} \times \operatorname{gender}) + \beta_{14}(\operatorname{DRNO}_{\operatorname{Others}} \times \operatorname{gender}) + \epsilon \\ \log\left[ \frac { P( \operatorname{2} \geq \operatorname{3} ) }{ 1 - P( \operatorname{2} \geq \operatorname{3} ) } \right] &= \alpha_{2} + \beta_{1}(\operatorname{DRNO}_{\operatorname{Black\ or\ African\ American}}) + \beta_{2}(\operatorname{DRNO}_{\operatorname{Asian}}) + \beta_{3}(\operatorname{DRNO}_{\operatorname{Others}})\ + \\ &\quad \beta_{4}(\operatorname{gender}) + \beta_{5}(\operatorname{ancestry}) + \beta_{6}(\operatorname{disability}) + \beta_{7}(\operatorname{above40})\ + \\ &\quad \beta_{8}(\operatorname{DFEDTEN}_{\operatorname{Between\ 11\ and\ 20}}) + \beta_{9}(\operatorname{DFEDTEN}_{\operatorname{More\ than\ 20}}) + \beta_{10}(\operatorname{work\_experience})\ + \\ &\quad \beta_{11}(\operatorname{work\_satisfaction})\ + \beta_{12}(\operatorname{DRNO}_{\operatorname{Black\ or\ African\ American}} \times \operatorname{gender})\ + \\ &\quad \beta_{13}(\operatorname{DRNO}_{\operatorname{Asian}} 
\times \operatorname{gender}) + \beta_{14}(\operatorname{DRNO}_{\operatorname{Others}} \times \operatorname{gender}) + \epsilon \\ \log\left[ \frac { P( \operatorname{3} \geq \operatorname{4} ) }{ 1 - P( \operatorname{3} \geq \operatorname{4} ) } \right] &= \alpha_{3} + \beta_{1}(\operatorname{DRNO}_{\operatorname{Black\ or\ African\ American}}) + \beta_{2}(\operatorname{DRNO}_{\operatorname{Asian}}) + \beta_{3}(\operatorname{DRNO}_{\operatorname{Others}})\ + \\ &\quad \beta_{4}(\operatorname{gender}) + \beta_{5}(\operatorname{ancestry}) + \beta_{6}(\operatorname{disability}) + \beta_{7}(\operatorname{above40})\ + \\ &\quad \beta_{8}(\operatorname{DFEDTEN}_{\operatorname{Between\ 11\ and\ 20}}) + \beta_{9}(\operatorname{DFEDTEN}_{\operatorname{More\ than\ 20}}) + \beta_{10}(\operatorname{work\_experience})\ + \\ &\quad \beta_{11}(\operatorname{work\_satisfaction})\ + \beta_{12}(\operatorname{DRNO}_{\operatorname{Black\ or\ African\ American}} \times \operatorname{gender})\ + \\ &\quad \beta_{13}(\operatorname{DRNO}_{\operatorname{Asian}} \times \operatorname{gender}) + \beta_{14}(\operatorname{DRNO}_{\operatorname{Others}} \times \operatorname{gender}) + \epsilon \\ \log\left[ \frac { P( \operatorname{4} \geq \operatorname{5} ) }{ 1 - P( \operatorname{4} \geq \operatorname{5} ) } \right] &= \alpha_{4} + \beta_{1}(\operatorname{DRNO}_{\operatorname{Black\ or\ African\ American}}) + \beta_{2}(\operatorname{DRNO}_{\operatorname{Asian}}) + \beta_{3}(\operatorname{DRNO}_{\operatorname{Others}})\ + \\ &\quad \beta_{4}(\operatorname{gender}) + \beta_{5}(\operatorname{ancestry}) + \beta_{6}(\operatorname{disability}) + \beta_{7}(\operatorname{above40})\ + \\ &\quad \beta_{8}(\operatorname{DFEDTEN}_{\operatorname{Between\ 11\ and\ 20}}) + \beta_{9}(\operatorname{DFEDTEN}_{\operatorname{More\ than\ 20}}) + \beta_{10}(\operatorname{work\_experience})\ + \\ &\quad \beta_{11}(\operatorname{work\_satisfaction})\ + \beta_{12}(\operatorname{DRNO}_{\operatorname{Black\ or\ African\ 
American}} \times \operatorname{gender})\ + \\ &\quad \beta_{13}(\operatorname{DRNO}_{\operatorname{Asian}} \times \operatorname{gender}) + \beta_{14}(\operatorname{DRNO}_{\operatorname{Others}} \times \operatorname{gender}) + \epsilon \end{aligned} \]
## Call:
## polr(formula = Q19_factor ~ DRNO + gender + DRNO * gender + ancestry +
## disability + above40 + DFEDTEN + work_experience + work_satisfaction,
## data = final_cleaned_data_sample, Hess = T)
##
## Coefficients:
## Value Std. Error t value
## DRNOBlack or African American -0.268536 0.02552 -10.52429
## DRNOAsian -0.506707 0.04168 -12.15849
## DRNOOthers -0.407996 0.03760 -10.84968
## gender1 -0.036148 0.01627 -2.22195
## ancestry -0.333107 0.02308 -14.42997
## disability -0.025374 0.01946 -1.30391
## above40 -0.219406 0.01822 -12.03897
## DFEDTENBetween 11 and 20 -0.075253 0.01661 -4.53185
## DFEDTENMore than 20 -0.078425 0.01917 -4.09156
## work_experience 1.111627 0.01498 74.22118
## work_satisfaction 0.768457 0.01327 57.89484
## DRNOBlack or African American:gender1 0.001118 0.03928 0.02847
## DRNOAsian:gender1 0.079678 0.05676 1.40368
## DRNOOthers:gender1 0.237701 0.05302 4.48341
##
## Intercepts:
## Value Std. Error t value
## 1|2 2.1947 0.0410 53.5543
## 2|3 3.1745 0.0400 79.4267
## 3|4 4.2698 0.0405 105.2970
## 4|5 6.7092 0.0444 151.2243
##
## Residual Deviance: 175444.09
## AIC: 175480.09
Term | Value | Std. Error | t value | p value |
---|---|---|---|---|
DRNOBlack or African American | -0.2685364 | 0.0255159 | -10.5242931 | 0.0000000 |
DRNOAsian | -0.5067065 | 0.0416751 | -12.1584898 | 0.0000000 |
DRNOOthers | -0.4079963 | 0.0376044 | -10.8496849 | 0.0000000 |
gender1 | -0.0361477 | 0.0162684 | -2.2219550 | 0.0262863 |
ancestry | -0.3331074 | 0.0230844 | -14.4299684 | 0.0000000 |
disability | -0.0253741 | 0.0194600 | -1.3039065 | 0.1922655 |
above40 | -0.2194060 | 0.0182246 | -12.0389719 | 0.0000000 |
DFEDTENBetween 11 and 20 | -0.0752534 | 0.0166055 | -4.5318488 | 0.0000058 |
DFEDTENMore than 20 | -0.0784250 | 0.0191675 | -4.0915621 | 0.0000428 |
work_experience | 1.1116275 | 0.0149772 | 74.2211834 | 0.0000000 |
work_satisfaction | 0.7684567 | 0.0132733 | 57.8948396 | 0.0000000 |
DRNOBlack or African American:gender1 | 0.0011182 | 0.0392778 | 0.0284686 | 0.9772884 |
DRNOAsian:gender1 | 0.0796778 | 0.0567634 | 1.4036829 | 0.1604133 |
DRNOOthers:gender1 | 0.2377014 | 0.0530180 | 4.4834072 | 0.0000073 |
1|2 | 2.1947238 | 0.0409812 | 53.5543435 | 0.0000000 |
2|3 | 3.1744743 | 0.0399673 | 79.4267212 | 0.0000000 |
3|4 | 4.2697891 | 0.0405499 | 105.2970389 | 0.0000000 |
4|5 | 6.7092300 | 0.0443661 | 151.2242791 | 0.0000000 |
As evidenced by Model 7, the interaction coefficient for males in the Others race category is 0.24: relative to the gender gap among White employees, the log odds of giving a higher rating for supervisor’s support for work-life balance increase by 0.24 for males compared with females. Combined with the gender main effect, the net male-female difference within the Others category is about 0.20. Among White employees, the log odds for males are lower by 0.04 than for females. The interaction terms for Black/African American and Asian employees are not significant, so their gender gaps do not differ detectably from that of White employees.
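Because the model includes an interaction, the male-female difference within each racial group is the sum of the gender main effect and that group's interaction term. The sketch below uses the Model 7 coefficients and assumes, as the text does, that `gender1` codes males. It gives point estimates only; standard errors for these sums would need the coefficient covariance matrix, which is not reproduced here.

```r
# Coefficients copied from the Model 7 table
gender_main <- -0.0361477
interaction <- c(White = 0, Black = 0.0011182, Asian = 0.0796778, Others = 0.2377014)

# Male-minus-female difference in log odds within each racial group
gender_gap <- gender_main + interaction

print(round(gender_gap, 3))
print(round(exp(gender_gap), 3))  # the same gaps expressed as odds ratios
```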
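To test the interaction as a whole, we use the Anova function from the car package. The result shows that the interaction between DRNO and gender is significant.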
Term | LR Chisq | Df | Pr(>Chisq) |
---|---|---|---|
DRNO | 294.917761 | 3 | 0.0000000 |
gender | 4.943669 | 1 | 0.0261864 |
ancestry | 204.712100 | 1 | 0.0000000 |
disability | 1.699112 | 1 | 0.1924041 |
above40 | 145.830362 | 1 | 0.0000000 |
DFEDTEN | 24.501065 | 2 | 0.0000048 |
work_experience | 5725.485214 | 1 | 0.0000000 |
work_satisfaction | 3393.975827 | 1 | 0.0000000 |
DRNO:gender | 21.578390 | 3 | 0.0000798 |
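The likelihood-ratio statistic for the DRNO:gender term can be checked by hand from the residual deviances reported for Model 5 (no interaction) and Model 7 (with interaction). This is a sketch of the arithmetic, not the internal procedure of the car package:

```r
# Residual deviances copied from the Model 5 and Model 7 outputs
dev_no_interaction <- 175465.67
dev_interaction    <- 175444.09

# The interaction adds three coefficients (one per non-White racial group)
df <- 3

lr_stat <- dev_no_interaction - dev_interaction
p_value <- pchisq(lr_stat, df = df, lower.tail = FALSE)

print(lr_stat)
print(p_value)
```

The difference in deviances, about 21.58 on 3 degrees of freedom, agrees with the DRNO:gender row of the table.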
Term | odds_ratio | 2.5 % | 97.5 % |
---|---|---|---|
DRNOBlack or African American | 0.7644976 | 0.7272039 | 0.8037186 |
DRNOAsian | 0.6024766 | 0.5556807 | 0.6537650 |
DRNOOthers | 0.6649813 | 0.6181954 | 0.7154407 |
gender1 | 0.9644978 | 0.9341877 | 0.9957393 |
ancestry | 0.7166932 | 0.6852720 | 0.7496071 |
disability | 0.9749451 | 0.9387744 | 1.0125808 |
above40 | 0.8029957 | 0.7748100 | 0.8321706 |
DFEDTENBetween 11 and 20 | 0.9275084 | 0.8978190 | 0.9581641 |
DFEDTENMore than 20 | 0.9245714 | 0.8904689 | 0.9599673 |
work_experience | 3.0393008 | 2.9515095 | 3.1299646 |
work_satisfaction | 2.1564357 | 2.1011115 | 2.2133266 |
DRNOBlack or African American:gender1 | 1.0011188 | 0.9270084 | 1.0805397 |
DRNOAsian:gender1 | 1.0829381 | 0.9698571 | 1.2089720 |
DRNOOthers:gender1 | 1.2683304 | 1.1433057 | 1.4070905 |
From the perspective of odds ratios, the interaction term for males in the Others racial category is 1.27: their male-to-female odds ratio of giving a higher rating for supervisor’s support for work-life balance is 1.27 times that of White employees. For White male employees, the estimated odds of giving a higher rating are about 4% lower than for White female employees. The interaction terms for Black/African American and Asian employees are not significant, as their confidence intervals include 1.
## Call:
## polr(formula = Q21_factor ~ DRNO + gender + DRNO * gender + ancestry +
## disability + above40 + DFEDTEN + work_experience + work_satisfaction,
## data = final_cleaned_data_sample, Hess = T)
##
## Coefficients:
## Value Std. Error t value
## DRNOBlack or African American -0.25351 0.02514 -10.084
## DRNOAsian -0.37884 0.04155 -9.117
## DRNOOthers -0.35914 0.03690 -9.733
## gender1 -0.02622 0.01585 -1.654
## ancestry -0.30599 0.02279 -13.424
## disability -0.05717 0.01910 -2.993
## above40 -0.26210 0.01781 -14.715
## DFEDTENBetween 11 and 20 -0.08664 0.01625 -5.331
## DFEDTENMore than 20 -0.06394 0.01877 -3.406
## work_experience 1.44973 0.01518 95.478
## work_satisfaction 0.94034 0.01327 70.870
## DRNOBlack or African American:gender1 0.10148 0.03886 2.611
## DRNOAsian:gender1 0.09479 0.05665 1.673
## DRNOOthers:gender1 0.22469 0.05221 4.304
##
## Intercepts:
## Value Std. Error t value
## 1|2 3.7823 0.0415 91.1879
## 2|3 5.1163 0.0416 123.0306
## 3|4 6.6078 0.0437 151.0664
## 4|5 9.2021 0.0488 188.6746
##
## Residual Deviance: 182588.40
## AIC: 182624.40
Term | Value | Std. Error | t value | p value |
---|---|---|---|---|
DRNOBlack or African American | -0.2535128 | 0.0251391 | -10.084405 | 0.0000000 |
DRNOAsian | -0.3788448 | 0.0415526 | -9.117225 | 0.0000000 |
DRNOOthers | -0.3591365 | 0.0368975 | -9.733346 | 0.0000000 |
gender1 | -0.0262172 | 0.0158480 | -1.654288 | 0.0981 |
ancestry | -0.3059855 | 0.0227941 | -13.423886 | 0.0000000 |
disability | -0.0571669 | 0.0191032 | -2.992524 | 0.0028 |
above40 | -0.2620973 | 0.0178111 | -14.715390 | 0.0000000 |
DFEDTENBetween 11 and 20 | -0.0866450 | 0.0162538 | -5.330745 | 0.0000 |
DFEDTENMore than 20 | -0.0639356 | 0.0187715 | -3.405994 | 0.0007 |
work_experience | 1.4497282 | 0.0151839 | 95.478066 | 0.0000000 |
work_satisfaction | 0.9403381 | 0.0132684 | 70.870309 | 0.0000000 |
DRNOBlack or African American:gender1 | 0.1014838 | 0.0388613 | 2.611435 | 0.0090 |
DRNOAsian:gender1 | 0.0947903 | 0.0566522 | 1.673198 | 0.0943 |
DRNOOthers:gender1 | 0.2246898 | 0.0522073 | 4.303803 | 0.0000 |
1|2 | 3.7822530 | 0.0414776 | 91.187946 | 0.0000000 |
2|3 | 5.1163236 | 0.0415858 | 123.030592 | 0.0000000 |
3|4 | 6.6077959 | 0.0437410 | 151.066402 | 0.0000000 |
4|5 | 9.2020683 | 0.0487722 | 188.674604 | 0.0000000 |
As evidenced by Model 8, the interaction coefficient for males in the Others race category is 0.22: relative to the gender gap among White employees, the log odds of giving a higher rating for supervisor’s support for employee’s development increase by 0.22 for males compared with females. The interaction for Black/African American employees is also positive and significant at 0.10 (t = 2.61). Among White employees, the log odds for males are very slightly lower, by 0.03, than for females, but this difference is not significant (t = -1.65). The interaction term for Asian employees is not significant either (t = 1.67).
As for Model 7, we use the Anova function from the car package. The result shows that the interaction between DRNO and gender is significant.
Term | LR Chisq | Df | Pr(>Chisq) |
---|---|---|---|
DRNO | 220.112758 | 3 | 0.0000000 |
gender | 2.738216 | 1 | 0.0979741 |
ancestry | 178.128484 | 1 | 0.0000000 |
disability | 8.936102 | 1 | 0.0027959 |
above40 | 217.784730 | 1 | 0.0000000 |
DFEDTEN | 28.895518 | 2 | 0.0000005 |
work_experience | 9696.510782 | 1 | 0.0000000 |
work_satisfaction | 5128.300414 | 1 | 0.0000000 |
DRNO:gender | 24.159746 | 3 | 0.0000231 |
Term | odds_ratio | 2.5 % | 97.5 % |
---|---|---|---|
DRNOBlack or African American | 0.7760698 | 0.7387783 | 0.8152724 |
DRNOAsian | 0.6846519 | 0.6312876 | 0.7426597 |
DRNOOthers | 0.6982790 | 0.6496450 | 0.7506941 |
gender1 | 0.9741235 | 0.9443234 | 1.0048510 |
ancestry | 0.7363973 | 0.7042754 | 0.7700431 |
disability | 0.9444364 | 0.9097996 | 0.9804568 |
above40 | 0.7694362 | 0.7430023 | 0.7967458 |
DFEDTENBetween 11 and 20 | 0.9170026 | 0.8882420 | 0.9466633 |
DFEDTENMore than 20 | 0.9380654 | 0.9041993 | 0.9731952 |
work_experience | 4.2619561 | 4.1371660 | 4.3908870 |
work_satisfaction | 2.5608470 | 2.4952205 | 2.6284359 |
DRNOBlack or African American:gender1 | 1.1068120 | 1.0258429 | 1.1943653 |
DRNOAsian:gender1 | 1.0994282 | 0.9839471 | 1.2285032 |
DRNOOthers:gender1 | 1.2519343 | 1.1302048 | 1.3867879 |
From the above odds ratio table for Model 8, the interaction odds ratios for male employees who identify as Black or African American and Others are about 1.11 and 1.25 respectively: their male-to-female odds ratios of giving a higher rating for supervisor’s support for employee’s development are 1.11 and 1.25 times that of White employees. The gender effect among White employees and the interaction term for Asian employees are not significant, as their confidence intervals include 1.
In Chapter 8, we calculate the predicted probabilities and odds ratios for various combinations of the focal predictors, ethnicity group and gender, while holding the other predictors at fixed values.
We first plot the predicted probabilities of Model 5 and Model 6 in sub-section 8A.
In sub-section 8B, we plot the predicted odds ratio across different ethnicity groups for males and females separately. The effect displays are created with the “latent” option activated. In these plots, the y axis is on the logit scale, which we interpret as a latent, or hidden, scale from which the ordered categories are derived.
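To make the link between the latent scale and the rating categories concrete, the sketch below converts a linear predictor into the five category probabilities by hand, using the Model 5 cutpoints and the `polr` convention logit P(Y ≤ k) = ζ_k − η. The helper `category_probs()` is hypothetical, and the profile (a White or Asian female with `work_experience` and `work_satisfaction` both set to 1 and every other predictor at 0) is an assumption made purely for illustration, not the fixed values used in the effect displays.

```r
# Cutpoints (zeta) copied from the Model 5 output
zeta <- c("1|2" = 2.2063853, "2|3" = 3.1859674,
          "3|4" = 4.2811604, "4|5" = 6.7203044)

# Hypothetical helper: category probabilities for a given linear predictor eta
category_probs <- function(eta, zeta) {
  cumulative <- c(plogis(zeta - eta), 1)  # P(Y <= 1), ..., P(Y <= 5)
  probs <- diff(c(0, cumulative))         # P(Y = 1), ..., P(Y = 5)
  names(probs) <- as.character(1:5)
  probs
}

# Assumed profile: work_experience = 1, work_satisfaction = 1, all else 0
eta_white <- 1.1111374 + 0.7683657
eta_asian <- eta_white - 0.4627097       # adds the Asian coefficient

p_white <- category_probs(eta_white, zeta)
p_asian <- category_probs(eta_asian, zeta)

print(round(rbind(White = p_white, Asian = p_asian), 3))
```

A lower linear predictor moves probability mass from the top category towards the lower ratings, which is the pattern the effect displays visualize.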
Based on the effect displays of Model 5 and Model 6, federal employees are generally highly likely to rate their supervisor’s support for work-life balance and development at 4 or higher.
White employees are very likely to give the highest rating of 5 for their supervisors’ support for work-life balance relative to other racial groups, while Black/African American and Asian employees are more likely to give a slightly lower rating of 4. As for supervisor’s support for employee development, all racial groups are almost equally likely to give a slightly lower rating of 4.
As seen from the effect displays of Model 7 and Model 8, the predicted rating of White female employees for their supervisors’ support for work-life balance is 5, which is higher than for females in the other three racial groups, whose predicted ratings range from 4 to borderline 4-5. The predicted ratings of females who identify as African American and Others are the lowest at 4.
For male employees, the predicted rating of White males for their supervisors’ support for work-life balance is also 5, but males who are Black or African American and Others show some uncertainty at borderline 4-5. Asian male employees have the lowest predicted rating of 4.
As for employees’ development, the predicted rating of all federal employees is 4. White female employees are far more likely to give a better rating for their supervisors than non-White female employees. A similar trend is observed for male employees, but their predicted outcomes are more similar because their confidence intervals overlap. Overall, Asian employees are the least likely among all racial groups to rate their supervisors’ support for work-life balance and employee’s development favorably.
Our data science project yields two main findings:
The relationships between ethnicity and supervisor’s support (binary) in terms of work-life balance and employee’s development differ by gender. In particular, male employees who identify as Black or African American are less likely than females to agree that their supervisors support their need to balance work and other life issues and their personal development. This correlation seems consistent with the observation that “Black males may face a different social reality (including interpersonal relationships at workplace) from their female counterparts”. However, no such gender disparity is observed among Asian federal employees. By contrast, White males are more likely than their White female colleagues to agree that their supervisors support their need for work-life balance and personal development, and this holds even more strongly for males in the Others racial category. One possible explanation is that respondents in the Others racial category, who form the smallest group in the OPM Federal Employee Viewpoint Survey, are better performers in their fields within the U.S. Federal Government and are thus more favored by, or receive better support from, their supervisors.
Taking into account the ordinal nature of the ratings, the relationship between ethnicity and the degree of supervisor’s support for work-life balance and employee’s development also differs by gender. Among those who identify in the Others racial category, the estimated odds of giving a higher rating (e.g. from 4 to 5) for their supervisor’s support for work-life balance and employee development are higher for males than for females. The opposite relationship is seen among White employees, although the difference in odds between genders is very small at about 4%. For Black/African American employees, we observe no significant gender difference in rating supervisor’s support for work-life balance, but the odds of Black male employees rating their supervisors more favorably (e.g. from 4 to 5) for employee development are about 11% higher. This seems to contradict the earlier observation from the binary outcome models for Black/African American employees and may require further investigation into our model assumptions for ordinal logistic regression. No gender differences are observed among Asian employees in rating their supervisor’s support for work-life balance and employee development.
Despite the anti-discrimination legislation and frameworks in place, our exploratory data analysis suggests a disconnect between the growing commitment to racial and gender equality and employees’ day-to-day experiences across color and gender. Non-White individuals remain more likely than Whites (White males in particular) to receive fewer promotional or development opportunities and less likely to get the support they need.
This is consistent with recent work by McKinsey & Company, which shows that women of color lose ground to White women and men of color in corporate America. Especially in times of disruption that fundamentally change the way people work, it is imperative for companies to commit fully to a culture that focuses on employee well-being and racial and gender equity, shaping an environment in which workers feel more engaged and valued. If managers are assessed and rewarded when such goals are met, this could help to close the discrimination gap and lead to better corporate performance and financial gains.
Most of our predictor variables are binary or categorical, which may capture less information than we would like. For age and years of federal work experience, the data would be more valuable if the survey allowed respondents to enter numeric values.
In addition, greater care is needed when working with observational data: our exploratory analysis indicates that these variables are correlated with each other, not that they are causally linked. This project could be expanded by combining datasets across years to form panel data, where mixed effects methods could be used to address within-subject and between-subject changes and to move closer to establishing causal relationships.
Lastly, we need to examine the underlying assumption of proportional odds using the Brant test (see Appendix) to check whether parallel regression holds. Given that most p-values, including that of the omnibus test, are below the alpha of 0.05, we would have to explore regression models that relax this assumption and describe the relationship between each pair of outcome groups separately, which is beyond the scope of this discussion.
In future, we can further improve the model with clustering, allowing the algorithm to search for the optimal number of hidden categories instead of comparing across the 4 racial categories predefined in the survey.
Philip N. Cohen, “The Persistence of Workplace Gender Segregation in the US,” Sociology Compass 7, no. 11 (2013): pp. 889-899, https://doi.org/10.1111/soc4.12083.
Richard V. Reeves, Sarah Nzau, and Ember Smith, “The Challenges Facing Black Men – and the Case for Action,” Brookings (Brookings Institution, November 19, 2020), https://www.brookings.edu/blog/up-front/2020/11/19/the-challenges-facing-black-men-and-the-case-for-action/.
“Women in the Workplace 2021,” McKinsey & Company (McKinsey & Company, September 27, 2021), https://www.mckinsey.com/featured-insights/diversity-and-inclusion/women-in-the-workplace.
1. Brant Test for Ordinal Logistic Regression
library(brant)
brant(polr_model13)
## --------------------------------------------------------------------
## Test for                                X2      df    probability
## --------------------------------------------------------------------
## Omnibus                                 745.28  33    0
## DRNOBlack or African American           28.65   3     0
## DRNOAsian                               13.53   3     0
## DRNOOthers                              27.6    3     0
## gender1                                 93.12   3     0
## ancestry                                35.73   3     0
## disability                              34.88   3     0
## above40                                 70.83   3     0
## DFEDTENBetween 11 and 20                1.08    3     0.78
## DFEDTENMore than 20                     8.23    3     0.04
## work_experience                         101.2   3     0
## work_satisfaction                       151.82  3     0
## --------------------------------------------------------------------
##
## H0: Parallel Regression Assumption holds
brant(polr_model15)
## --------------------------------------------------------------------
## Test for                                X2      df    probability
## --------------------------------------------------------------------
## Omnibus                                 579.87  33    0
## DRNOBlack or African American           26.16   3     0
## DRNOAsian                               23.98   3     0
## DRNOOthers                              19.87   3     0
## gender1                                 86.3    3     0
## ancestry                                35.75   3     0
## disability                              39.41   3     0
## above40                                 22.42   3     0
## DFEDTENBetween 11 and 20                11.09   3     0.01
## DFEDTENMore than 20                     52.23   3     0
## work_experience                         155.39  3     0
## work_satisfaction                       89.96   3     0
## --------------------------------------------------------------------
##
## H0: Parallel Regression Assumption holds
brant(polr_model16)
## ----------------------------------------------------------------------------
## Test for                                X2      df    probability
## ----------------------------------------------------------------------------
## Omnibus                                 754.74  42    0
## DRNOBlack or African American           8.58    3     0.04
## DRNOAsian                               10.13   3     0.02
## DRNOOthers                              12.29   3     0.01
## gender1                                 72.99   3     0
## ancestry                                35.51   3     0
## disability                              34.91   3     0
## above40                                 70.99   3     0
## DFEDTENBetween 11 and 20                1.13    3     0.77
## DFEDTENMore than 20                     8.06    3     0.04
## work_experience                         101.04  3     0
## work_satisfaction                       152.7   3     0
## DRNOBlack or African American:gender1   7.37    3     0.06
## DRNOAsian:gender1                       5.7     3     0.13
## DRNOOthers:gender1                      2.1     3     0.55
## ----------------------------------------------------------------------------
##
## H0: Parallel Regression Assumption holds
brant(polr_model18)
## ----------------------------------------------------------------------------
## Test for                                X2      df    probability
## ----------------------------------------------------------------------------
## Omnibus                                 583.64  42    0
## DRNOBlack or African American           15.8    3     0
## DRNOAsian                               16.31   3     0
## DRNOOthers                              9.65    3     0.02
## gender1                                 68.88   3     0
## ancestry                                36.03   3     0
## disability                              39.52   3     0
## above40                                 22.28   3     0
## DFEDTENBetween 11 and 20                11.1    3     0.01
## DFEDTENMore than 20                     52.34   3     0
## work_experience                         155.72  3     0
## work_satisfaction                       90.18   3     0
## DRNOBlack or African American:gender1   0.43    3     0.93
## DRNOAsian:gender1                       5.75    3     0.12
## DRNOOthers:gender1                      3.33    3     0.34
## ----------------------------------------------------------------------------
##
## H0: Parallel Regression Assumption holds
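As a rough illustration of what a rejected parallel regression assumption means, the sketch below simulates a 5-point ordinal response in which the effect of a binary predictor grows across the cumulative splits. It then fits one binary logistic regression per split alongside a single proportional odds model from MASS::polr. The data, cut-points and slopes are invented for illustration and are not taken from the survey; the name gender1 is reused only to echo the output above. Fitting separate cumulative logits is one simple way to relax the assumption and is not the approach used in this project.

```r
library(MASS)  # polr() for the proportional odds model

set.seed(1)
n <- 20000
gender1 <- rbinom(n, 1, 0.5)  # hypothetical binary predictor

# Assumed cut-points and split-specific slopes (illustrative values only).
theta <- c(-1.5, -0.5, 0.5, 1.5)
beta  <- c(0.1, 0.4, 0.7, 1.0)  # unequal slopes violate proportional odds

# Latent logistic draw; y counts how many shifted cut-points it exceeds.
u <- rlogis(n)
y <- 1 + rowSums(sapply(1:4, function(k) u > theta[k] - beta[k] * gender1))
sim <- data.frame(y = y, gender1 = gender1)

# One binary logit per cumulative split: log-odds of P(y > k).
split_slopes <- sapply(1:4, function(k) {
  fit <- glm(I(y > k) ~ gender1, family = binomial, data = sim)
  coef(fit)[["gender1"]]
})
names(split_slopes) <- paste0("y>", 1:4)

# Proportional odds model: a single slope shared by all splits.
po_fit <- polr(factor(y, ordered = TRUE) ~ gender1, data = sim)
pooled_slope <- coef(po_fit)[["gender1"]]

print(round(split_slopes, 2))
print(round(pooled_slope, 2))
```

When the split-specific slopes differ markedly, as they do here by construction, a single pooled slope hides how the predictor's effect changes across the response scale. That is the pattern a significant Brant statistic flags.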
2. Diagnostics of multicollinearity
The variance inflation factor (VIF) provides a formal diagnostic for multicollinearity. The correlation matrix shows that work_experience and work_satisfaction are highly correlated, with a coefficient of 0.817, which signals potential multicollinearity and the need for further investigation. Hence, we run a multiple linear regression with the same predictor and outcome variables as our logistic regression models and compute the VIF using the car package. Since the VIFs for all variables are below 4, there is no indication of a multicollinearity problem.
# Linear model with the same predictors, fitted only to obtain VIFs
model1_lm <- lm(Q19_binary ~ DRNO + gender + ancestry + disability + above40 +
                  DFEDTEN + work_experience + work_satisfaction, data = final_cleaned_data)
car::vif(model1_lm) %>%
  DT::datatable()
# Same check for the second outcome
model2_lm <- lm(Q21_binary ~ DRNO + gender + ancestry + disability + above40 +
                  DFEDTEN + work_experience + work_satisfaction, data = final_cleaned_data)
car::vif(model2_lm) %>%
  DT::datatable()
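For intuition about how a pairwise correlation of 0.817 can coexist with VIFs below 4, recall that the VIF of predictor j is 1 / (1 - R_j^2), where R_j^2 comes from regressing that predictor on the remaining predictors. With only two correlated predictors, R_j^2 equals the squared correlation, so 1 / (1 - 0.817^2) is roughly 3.0. The sketch below checks this on simulated data using base R only. The simulated columns merely borrow the names work_experience, work_satisfaction and above40 and are not the survey variables, and manual_vif is a hypothetical helper. For multi-level factors such as DRNO, car::vif reports generalized VIFs instead, which this sketch does not cover.

```r
set.seed(42)
n <- 5000
rho <- 0.817  # target correlation, matching the value reported above

# Two standard-normal predictors with population correlation rho,
# plus an independent binary predictor (all simulated).
work_experience   <- rnorm(n)
work_satisfaction <- rho * work_experience + sqrt(1 - rho^2) * rnorm(n)
above40           <- rbinom(n, 1, 0.5)
sim <- data.frame(work_experience, work_satisfaction, above40)

# VIF_j = 1 / (1 - R_j^2), with R_j^2 from regressing predictor j
# on all the other predictors. The outcome variable plays no role.
manual_vif <- function(data, predictors) {
  sapply(predictors, function(p) {
    others <- setdiff(predictors, p)
    r2 <- summary(lm(reformulate(others, response = p), data = data))$r.squared
    1 / (1 - r2)
  })
}

vifs <- manual_vif(sim, c("work_experience", "work_satisfaction", "above40"))
print(round(vifs, 2))
```

In this setup the two correlated predictors land near 3 and the independent one near 1, which is in line with the observation that a strong pairwise correlation alone does not push the VIF past the threshold of 4.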