
2019 IEEE International Symposium on Technology in Society (ISTAS) Proceedings

Miriam Cunningham and Paul Cunningham (Eds)


ISBN: 978-1-7281-5480-0

Ethical Considerations in AI-Based Recruitment


Dena F. Mujtaba and Nihar R. Mahapatra
Department of Electrical and Computer Engineering, Michigan State University, East Lansing, Michigan, USA
{mujtabad, nrm}@egr.msu.edu

Abstract - Over the past few years, machine learning and AI have become increasingly common in human resources (HR) applications, such as candidate screening, resume parsing, and employee attrition and turnover prediction. Though AI assists in making these tasks more efficient, and seemingly less biased through automation, it relies heavily on data created by humans, and consequently can have human biases carry over to decisions made by a model. Several studies have shown biases in machine learning applications such as facial recognition and candidate ranking. This has spurred active research on the topic of fairness in machine learning over the last five years. Several toolkits to mitigate biases and interpret black box models have been developed in an effort to promote fair algorithms. This paper presents an overview of fairness definitions, methods, and tools as they relate to recruitment and establishes ethical considerations in the use of machine learning in the hiring space.

Keywords - artificial intelligence; bias; discrimination; ethical recruitment; fairness; hiring.

I. INTRODUCTION

Advances in artificial intelligence (AI) and automation are expected to lead to widespread changes in the economy and society, with nearly 14% of the global workforce possibly needing to change occupational categories by 2030 due to technological advances [1]. Recently, AI has been adopted in the human resources (HR) industry, for purposes such as predicting employee attrition, chatbot systems for HR service delivery, and background verification for screening applicants' resumes [2]. Initially, HR technology was an effective approach for tasks such as applicant resume screening and candidate sourcing, but it has recently been recognized as a critical need for organizations to scale and expand their businesses [2]. Applications such as Arya, Google Hire, HireVue, and Plum use AI for applicant screening and management [3-6], allowing businesses to grow efficiently, without the need for human decision making in these hiring processes.

However, as AI adoption becomes more widespread, there is growing concern that the decisions made by such systems could be carrying over biases from people in the organization or the model developers, as evidenced by several recently reported episodes. In 2017, Amazon ended its AI-based candidate evaluation tool because it was shown to discriminate against female candidates, assigning lower scores to resumes of women when ranking applicants [7]. The model's bias was a result of underrepresentation of female applicants in the training dataset used to create the model. This is an example of how biases commonly found in the hiring process (e.g., hiring discrimination [8]) can easily carry over to AI-based approaches through the data used to train the algorithm.

Companies using AI-based approaches for recruitment expect more consistent and ethical decision making, and expect such systems to be able to remove biases, in contrast to a human decision maker [9]. However, a number of recent research studies have shown biases in machine learning and AI-based applications [7, 10-12]. This has contributed to an increase in interest in fairness, machine learning bias, and AI in recruitment, as depicted in the Google search trend graph in Figure 1.

Fig. 1. Google search trend graph from Aug. 2010 – Aug. 2019 for the phrases "machine learning bias" (blue), "HR AI" (red), and "ethical hiring" (yellow), showing relative interest over time, with 100 indicating peak popularity.

Past work and our contributions: We seek to present an overview of the work in fairness in machine learning and AI-based systems, with a view toward addressing the hiring space. Previous surveys [10, 13-18] and work in fairness in machine learning have not addressed the types of biases that past studies have shown occur in the recruitment process, such as confirmation bias (decisions made in the first few minutes of meeting a candidate) and expectation anchor bias (being influenced by only one piece of information) [9], and how these biases, once carried over to a model, can be mitigated.
Therefore, we present an overview of methods for measuring bias, tools available for bias mitigation, and future challenges for fairness in machine learning specific to recruitment applications.

The remainder of the paper is organized as follows. In Section II, we first identify the various causes of bias and provide five core definitions of fairness. Next, in Section III, we outline the recruitment process and discuss the ways in which bias may be introduced in AI-based recruitment processes. Section IV details the main categories of bias mitigation methods and the key fairness tools available to address bias in machine learning systems. Section V discusses limitations of the tools and methods for bias detection and mitigation, and outlines future challenges in the area of AI-based recruitment. Finally, concluding remarks and findings are given in Section VI.

II. BACKGROUND

Before measuring bias in algorithms, the concept of "fairness" needs to be defined to consistently compare and evaluate algorithms in AI-based recruitment. Several definitions of fairness have been proposed in the past, originating from anti-discrimination laws such as the Civil Rights Act of 1964, which prohibits unfair treatment of individuals based on their protected attributes (e.g., gender or race) [19, 20]. Furthermore, the US Equal Employment Opportunity Commission (EEOC) established a requirement with similar guidelines for employee selection procedures to ensure fair treatment of employees during the hiring process and to prevent adverse impact on individuals due to hiring decisions made [11, 19]. Adverse impact, per EEOC guidelines, is determined by applying the four-fifths or eighty percent rule, viz., a selection rate for a protected group which is less than four-fifths (or 80%) of the rate for the group with the highest rate [17, 19]. An example of adverse impact is illustrated in Table I, where the selection rate for black applicants is 50% of the white applicant selection rate (which is below the 80% threshold) [19].

TABLE I. SAMPLE APPLICATION SELECTION RATE

Applicants | Hired | Selection Rate | Percent Hired
80 White   | 48    | 48/80          | 60%
40 Black   | 12    | 12/40          | 30%

a. Example adopted from EEOC guidelines [19].
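The four-fifths check in Table I reduces to a few lines of arithmetic. The snippet below is a minimal sketch that reproduces the table's numbers; the group labels and counts come from Table I, while the variable names and output format are purely illustrative.

```python
# Minimal sketch of the EEOC four-fifths (80%) rule using the counts in Table I.
# Selection rate = hired / applicants per group; the impact ratio compares each
# group's rate with the highest group rate.

groups = {
    "White": {"applicants": 80, "hired": 48},
    "Black": {"applicants": 40, "hired": 12},
}

rates = {g: d["hired"] / d["applicants"] for g, d in groups.items()}
highest = max(rates.values())

for group, rate in rates.items():
    impact_ratio = rate / highest
    flag = "adverse impact" if impact_ratio < 0.8 else "ok"
    print(f"{group}: selection rate = {rate:.0%}, ratio to highest = {impact_ratio:.2f} ({flag})")

# Prints 60% (ratio 1.00, ok) for White applicants and 30% (ratio 0.50, adverse
# impact) for Black applicants, matching the 50%-of-the-highest-rate figure above.
```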
Two key definitions of discrimination were established based on these laws: disparate treatment, which is an intentionally discriminatory practice targeted at individuals based on their protected attributes; and disparate impact, which includes practices that result in a disproportionately adverse impact on protected groups [12, 19, 20]. While these standards would seem simple to meet by simply removing protected attributes from the training data, there are still many instances where this can lead to unfair decisions. For instance, when using natural language processing, a model can infer the gender of a candidate by name; this is known as indirect discrimination, or discrimination through use of features that are implicitly defined in the dataset [20]. Furthermore, depending on the application, certain protected attributes will be needed to prevent disparate impact. For instance, the decision support tool COMPAS (Correctional Offender Management Profiling for Alternative Sanctions), used to assess the recidivism rate of a defendant, was shown to discriminate against female defendants because the algorithms were gender-neutral, and women were shown to reoffend at lower rates than men with similar criminal histories [10]. Therefore, to define fairness for a decision-making system, the causes of bias in the data should be addressed. This is discussed next.

A. Causes of Bias

There are several ways in which human bias can be transferred to the dataset used to train a machine learning system. These include [11, 17, 18]:

1) Training data: If the training data is biased in some way, the machine learning system trained on it will learn the bias. This may be through a skewed sample, in which proportionately more records are present for a group achieving a particular outcome versus another. The dataset may also be tainted in the labeled outcomes it contains (e.g., if the dataset was manually labeled by a human and any human bias was transferred to the labels).

2) Label definitions: Depending on the problem being solved or the decision being made, the target label may contain a vague description of the outcome, and thus result in incorrect predictions and a larger disparate impact. For instance, in the example provided in [11], if a manager were to build a simple binary classification model to label a job candidate as a "good" hire (instead of modeling the different ways in which a candidate could be a "good" hire), many factors may get obscured by the model's prediction on the candidate. Employee motivation, person-job fit, and person-environment fit are a few factors that will typically determine how well a candidate fits an organization and how well they will perform once hired [16, 21, 22].

3) Feature selection: Features used to improve the model may result in an unfair prediction [23]. For instance, certain features may not be relevant to the real-world application of the model, resulting in bias against protected groups. In addition, certain features may be derived from unreliable or inaccurate data, which will then result in a lower prediction accuracy for certain groups.

4) Proxies: Even with protected attributes removed from the dataset, they may still be found in other attributes, and still result in biased decisions being made. For instance, Amazon's hiring application was biased even without using the gender attribute, because it inferred it from the educational institution listed on the resume of applicants (e.g., all-female or all-male colleges) [7]. (A simple diagnostic for such proxies is sketched after this list.)
5) Masking: To remove any protected attributes or proxies of these attributes that may lead to disparate impact, new features may be formed to replace these attributes and achieve a new representation of the data (masked features) [11]. However, this may also lead to new biases from the features selected by the human masking the protected features.
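One way to surface the proxy and masking issues described above is to test how well the remaining features predict the protected attribute after it has been dropped. The sketch below is purely illustrative: the data and feature names are hypothetical, and the check is a generic diagnostic rather than a method proposed in this paper.

```python
# Illustrative proxy check: if a classifier can recover the protected attribute
# from the remaining features, those features likely encode it indirectly.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 1000
gender = rng.integers(0, 2, n)                     # protected attribute (hypothetical)
college_code = gender * 2 + rng.integers(0, 2, n)  # proxy: strongly correlated with gender
years_exp = rng.normal(5, 2, n)                    # unrelated feature

X_without_protected = np.column_stack([college_code, years_exp])
auc = cross_val_score(LogisticRegression(), X_without_protected, gender,
                      cv=5, scoring="roc_auc").mean()
print(f"AUC for predicting the protected attribute from remaining features: {auc:.2f}")
# An AUC well above 0.5 suggests a proxy; here college_code leaks gender almost perfectly.
```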
B. Definitions of Fairness

Several definitions of fairness that can be used to assess machine learning systems have been proposed in past literature. Below we cover the five core ones as defined in [17, 18, 24]:

1) Demographic parity: Also known as statistical parity, this states that acceptance rates of two groups must be equal (e.g., the percent of people accepted for a position in one group must be close to the percent accepted for another). It aligns with laws requiring fair hiring processes (e.g., the four-fifths rule), and is formulated in (1) [18], where x represents the features of an individual, a represents the protected attributes, C is a classifier with a binary outcome C(x) in {0, 1}, and Y is the true outcome; P_a{.} denotes a probability taken over members of group a. Then, for groups a and b, the selection rate for both must be equal or within p percent of each other, where

min(P_a{C = 1} / P_b{C = 1}, P_b{C = 1} / P_a{C = 1}) >= p/100    (1)

2) Accuracy parity: Accuracy parity is the notion of being able to provide a more balanced version of the data, in which we can trade a false positive of one group for a false negative of another, so that it fits the definition in (2) [17]; this states we should hire (C = 1) an equal proportion of candidates from the qualified ones (Y = 1) in each group (also known as equality of opportunity).

P_a{C = 1 | Y = 1} = P_b{C = 1 | Y = 1}    (2)

3) Predictive rate parity: Also known as positive predictive parity and negative predictive parity, this requires that the credentials of a candidate be consistent with the model's prediction across different groups. Specifically, both conditions represented in (3) (positive predictive parity) and (4) (negative predictive parity) [12, 17] for the classifier C must be satisfied to have predictive rate parity.

P_a{Y = 1 | C = 1} = P_b{Y = 1 | C = 1}    (3)

P_a{Y = 0 | C = 0} = P_b{Y = 0 | C = 0}    (4)

4) Individual fairness (versus group fairness): Individual fairness is different from the above fairness definitions, in that fairness is established on a person-by-person basis instead of on a group basis. It states that similar individuals should have similar outcomes. Using a more fine-grained approach for fairness, this can address limitations of group fairness definitions [25].

5) Counterfactual fairness: Counterfactual fairness is motivated by the idea of transparency, where even if an algorithm makes a decision that may seem biased, an explanation for this decision is provided. This can assist in finding bias in algorithms, by replacing attributes that may seem protected, and observing the change in decisions.
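To make these group-level definitions concrete, the following sketch computes the quantities behind (1)-(4) on hypothetical screening data and ends with a naive counterfactual probe in the spirit of definition 5. It is an illustration of the definitions only, not an implementation used or evaluated in this paper; the data, threshold, and group labels are invented.

```python
# Illustrative computation of the group fairness quantities in (1)-(4),
# plus a naive counterfactual probe (definition 5), on hypothetical data.
import numpy as np

rng = np.random.default_rng(1)
n = 2000
group = rng.integers(0, 2, n)          # protected attribute: 0 = group a, 1 = group b
qualified = rng.integers(0, 2, n)      # true outcome Y
score = 0.5 * qualified + 0.1 * group + rng.normal(0, 0.3, n)
hired = (score > 0.5).astype(int)      # classifier decision C

def selection_rate(mask):
    return hired[mask].mean()

# (1) Demographic parity / p%-rule: compare the two groups' selection rates.
sel_a, sel_b = selection_rate(group == 0), selection_rate(group == 1)
p_rule = min(sel_a / sel_b, sel_b / sel_a)

# (2) Equality of opportunity: selection rate among the qualified (Y = 1) only.
eo_a = selection_rate((group == 0) & (qualified == 1))
eo_b = selection_rate((group == 1) & (qualified == 1))

# (3) Positive predictive parity: P(Y = 1 | C = 1) per group; (4) is analogous with C = 0.
ppv_a = qualified[(group == 0) & (hired == 1)].mean()
ppv_b = qualified[(group == 1) & (hired == 1)].mean()

print(f"selection rates: {sel_a:.2f} vs {sel_b:.2f} (p%-rule = {p_rule:.2f})")
print(f"equality of opportunity: {eo_a:.2f} vs {eo_b:.2f}")
print(f"positive predictive value: {ppv_a:.2f} vs {ppv_b:.2f}")

# Definition 5, naively: flip the protected attribute and count decisions that change.
noise = score - 0.5 * qualified - 0.1 * group
score_flipped = 0.5 * qualified + 0.1 * (1 - group) + noise
changed = ((score_flipped > 0.5).astype(int) != hired).mean()
print(f"decisions changed by flipping the protected attribute: {changed:.1%}")
```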
Fig. 2. An overview of the steps in a typical recruitment pipeline: (1) identifying and attracting candidates, (2) candidate screening or processing, and (3) communication and selection of candidates.

III. BIAS IN AI-BASED RECRUITMENT

Understanding the recruitment process and where biases can occur will assist employers in applying methods to mitigate bias. The recruitment process consists of the following steps, as identified in [26] and shown in Figure 2. These are further expanded on below.

A. Identifying and Attracting Candidates

After identifying the need for a position, and posting it, an employer may search for candidates or accept resumes sent as a result of the posting. Often, recruiters on job sites such as LinkedIn and Indeed will observe rankings of candidates according to job-fit/similarity, and the employer will have the resumes submitted to the posting ranked. There is much literature on selecting the best candidate from a pool of candidates to balance onboarding costs with a candidate's current experience, KSAOs (knowledge, skills, abilities, and other traits), and person-environment fit [16, 27, 28]. However, with AI implemented in several outsourced recruitment systems and applicant tracking systems (ATSs), the employer may play little to no part in selecting the list of top candidates. How an AI system will rank these candidates depends on past data used to train these ranking models. The biases present in the data may lead to invalid rankings, which can be costly for organizations, which will then need to spend more time and resources on training and onboarding, or will experience lower employee retention and job satisfaction rates [29].

B. Candidate Screening/Processing

After ranking candidates from a list, an initial pool of resumes or candidates is established, and employers start screening candidates to better assess fit. Methods for screening include resume parsing systems, job experience and knowledge assessments, personality assessments, and in-person phone calls and video streams [16, 26, 28]. As AI is further incorporated in the recruitment process, how assessments and resumes are parsed is of increasing concern, since they have been subject to bias in the past, as seen in Amazon's resume parsing/ranking application [7].
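Because AI-driven ranking and screening ultimately produce a shortlist, one simple diagnostic an employer can run on any ranked output is to compare how often each group reaches the top-k of the ranking, mirroring the selection-rate comparison of Section II. The sketch below uses hypothetical scores and group labels and is not tied to any particular applicant tracking system.

```python
# Illustrative check of group representation in the top-k of a ranked candidate list.
import numpy as np

rng = np.random.default_rng(2)
n, k = 500, 50
group = rng.integers(0, 2, n)              # protected attribute (hypothetical)
score = rng.normal(0, 1, n) + 0.3 * group  # ranking score with a built-in group skew

top_k = np.argsort(-score)[:k]             # indices of the k highest-ranked candidates
shortlist_rate = {g: np.isin(np.where(group == g)[0], top_k).mean() for g in (0, 1)}

ratio = min(shortlist_rate.values()) / max(shortlist_rate.values())
print(f"shortlist rates by group: {shortlist_rate}, ratio = {ratio:.2f}")
# A ratio well below 0.8 mirrors the four-fifths concern, applied to automated shortlists.
```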

C. Communication and Selection of Candidates

This step is often managed by the employer, and is the last stage of the recruitment process, wherein the employer meets the candidate, either in person, through a call, or via video stream. There are not many AI systems that replace the interview process, largely because research has shown in-person interviews have a better success rate than offline interviews [27]. However, as AI becomes more commonplace, there are several areas, such as image, audio, and video classification, where bias has been studied [10, 30], and methods for remedying such bias will need to be applied.

To assist in a more ethical hiring process, by providing a transparent and fair approach for candidates and employers, we discuss methods for mitigating bias in algorithms in each stage of the recruitment process.

IV. BIAS DETECTION & MITIGATION

Bias mitigation in hiring tools will need to be applied to each step of the recruitment process, depending on the model used. As described in [16], the hiring process should focus on achieving fairness, among other things, through: (1) accountability (the interviewer's obligation to make reasonable decisions and address any mistakes an AI system may make), (2) responsibility (focusing on developing a fair and reliable AI system that also adapts to changes), and (3) transparency (explanations of the decisions made by the AI system and employer).

A. Methods of Bias Mitigation

There are three key classes of bias mitigation algorithms/methods [31]:

1) Pre-processing: This involves modifying the dataset before it is used to train the model. Algorithms such as reweighting and optimized preprocessing are used to edit the features and labels in the data according to fairness criteria before classification [31]. This can be used to remove any obvious protected attributes unrelated to the position, or modify any features that would lead to bias. This also benefits instances where the model itself cannot be modified, and can be used across recruitment tasks if multiple models are used for multiple tasks. (A rough sketch of reweighting appears after this list.)

2) In-processing/optimization: In this approach, the model is optimized to meet any fairness definition described earlier through the setting of constraints on the classification objective (e.g., achieving accuracy parity by declaring that we should hire an equal proportion of candidates from the qualified individuals in different groups). The benefit of this method is the high performance achieved on fairness measures. However, it also requires modifying the classifier model, which may not be an option when outsourcing the recruitment process. Furthermore, it may modify the accuracy of the classifier, depending on the fairness definition used (e.g., accuracy parity conditions may result in unqualified applicants being hired, to meet equal outcomes).

3) Post-processing/counterfactuals: This is the process of meeting fairness constraints by modifying the outcome of the model, either by setting a threshold for certain classifications already made, or by providing transparency in the algorithm through counterfactuals. This allows fairness definitions to be met without directly modifying the classifier. In addition, providing counterfactuals will assist not only the employer, but also the candidate in understanding their weaknesses or areas of mismatch for the job, which can then be used to improve performance in the next interview. Furthermore, clear and understandable feedback from an interview or decision has been shown to improve perceived fairness of a system [32].
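As a rough illustration of how the pre- and post-processing stages operate in practice (the in-processing stage requires access to the training objective itself), the sketch below derives reweighting-style sample weights and then applies group-specific decision thresholds. It uses generic scikit-learn calls on hypothetical data and merely stands in for what toolkits such as AIF360 [31] or fairlearn [49] provide in packaged, tested form.

```python
# Illustrative pre-processing (reweighting-style sample weights) and
# post-processing (group-specific thresholds) on hypothetical screening data.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
n = 4000
group = rng.integers(0, 2, n)                        # protected attribute
x = rng.normal(0, 1, n).reshape(-1, 1)               # observable feature
y = (x[:, 0] + 0.5 * group + rng.normal(0, 1, n) > 0).astype(int)  # labels skewed by group

# Pre-processing: weight each (group, label) cell so group and label look independent,
# in the spirit of reweighting methods packaged in [31].
weights = np.empty(n)
for g in (0, 1):
    for lbl in (0, 1):
        cell = (group == g) & (y == lbl)
        weights[cell] = ((group == g).mean() * (y == lbl).mean()) / cell.mean()

clf = LogisticRegression().fit(x, y, sample_weight=weights)
scores = clf.predict_proba(x)[:, 1]

# Post-processing: pick per-group thresholds so that selection rates roughly match.
target_rate = 0.3
decisions = np.zeros(n, dtype=int)
for g in (0, 1):
    mask = group == g
    threshold = np.quantile(scores[mask], 1 - target_rate)
    decisions[mask] = (scores[mask] >= threshold).astype(int)

print("selection rate by group:",
      {g: decisions[group == g].mean().round(2) for g in (0, 1)})
```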
Fig. 3. Overview of user/developer engagement on fairness and interpretability repositories on GitHub, measuring the number of stars (i.e., marked as a favorite by a user), watch count (i.e., user notified of updates), and forks (i.e., user copies project code to contribute or customize).

B. Tools

Several toolkits and programs have been released to assist developers in embedding fairness algorithms in their machine learning pipeline [31, 33-49]. These toolkits focus on one or more algorithms from the above three categories and are open source for anyone to use and contribute to, leading to more engagement in the machine learning and AI space. In Figure 3, we present an overview of user engagement in popular fairness and interpretability resources on GitHub. This is measured using the number of stars (i.e., a user marks the project as a 'favorite'), forks (i.e., a user copied the project to either customize it for their own needs or contribute to the original project), and the watch count (i.e., the number of users who have notifications of updates on the project turned on).
Below, we describe some of the larger projects in the area of model fairness and interpretability:

- AIF360 [31]: AI Fairness 360 (AIF360) is a toolkit developed by IBM that packages bias mitigation algorithms and fairness metrics for developers to implement in their models. This toolkit, written in Python, focuses on industrial settings that would benefit from fairness algorithms, and allows for bias mitigation in all three of the stages described earlier.

- FairML [35]: FairML focuses on assisting in the post-processing stage of bias mitigation, by explaining the significance of a model's features.

- FairTest [34]: FairTest is a toolkit to assess the features with the highest impact on a classifier (using an unwarranted association framework). Decision trees are used to divide the dataset into subgroups and assess the strength of features on an outcome, allowing developers to check any protected attributes.

- InterpretML [37]: Though currently in alpha, this project has high engagement on GitHub. Its focus is to explain black box models and develop new, easily interpretable models.

- Lime [38]: Lime is a toolkit which focuses on understanding black box machine learning models, by identifying features with the highest contribution to model decisions.

- Lucid [33]: Lucid is a collection of tools and Jupyter notebooks to assist in neural network research, and for interpretability of networks.

- SHAP [39]: Like Lime, SHAP explains the features contributing to a model's decision, but uses game theory for user-understandable explanations and provides many visualization features of the data.

- Themis-ML [36]: Like AIF360, Themis-ML provides tools for all three classes of fairness algorithms, by providing pre-processing methods, pre-built classification models that meet fairness criteria, and post-processing awareness estimators.
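As an indication of how lightly such interpretability toolkits sit on top of an existing screening model, the sketch below shows a typical SHAP [39] call pattern for a tree-based classifier. The dataset and feature names are hypothetical, the call pattern follows shap's commonly documented usage rather than anything prescribed in this paper, and the exact return format of the explainer varies across shap versions.

```python
# Illustrative use of SHAP [39] to attribute a screening model's decisions to features.
# Hypothetical data and feature names; API details may vary by shap version.
import numpy as np
import shap
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(4)
n = 1000
X = np.column_stack([rng.normal(5, 2, n),    # years_experience (hypothetical)
                     rng.integers(0, 2, n),  # has_degree
                     rng.normal(0, 1, n)])   # assessment_score
y = (0.3 * X[:, 0] + X[:, 1] + X[:, 2] + rng.normal(0, 1, n) > 2.5).astype(int)

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X[:5])   # per-feature contributions for 5 candidates
# Depending on the shap version this is a list (one array per class) or a single array;
# either way, each row attributes one candidate's score to the three features above.
print(type(shap_values), np.shape(shap_values))
```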
V. LIMITATIONS AND FUTURE CHALLENGES

Though several approaches for bias mitigation have been provided in toolkits and papers in the field of machine learning fairness, there are problems specific to the hiring space which are still unaddressed, and are challenges for future work in fair AI-based recruitment:

1) Tool evaluation and selection: Organizations seeking to adopt one of the available fairness tools will need to assess which option best fits their hiring needs, since metrics vary highly depending on the occupation. In a past study comparing SHAP with Lime for model interpretability, results showed that no single interpretability tool achieved the best performance across different types of data [50]. However, one metric which can be used across these tools is the engagement and activity of the project repository, as past studies have shown that repositories with higher engagement and project discussion lead to more awareness of the project application and better maintainability over time [51].

2) Job-specific requirements: Requirements often vary across occupations, and the representation of candidates and the protected attributes will also need to change accordingly. Developing/training a single model to detect protected attributes in datasets across occupations will not work for occupations that require attributes normally considered protected (e.g., information about religion when applying to be a leader in a religious institute).

3) Fairness in job postings: Fairness applies not only to the candidate within the organization, but also outside it, as a worker/candidate searches for jobs. How job postings are recommended will vary based on a candidate's uploaded resume or past work experience listed on the job site. However, these recommendations may also be subject to unintentional biases, depending on how the employer wrote the job description, which will hinder organizations from finding the best candidates.

4) Hiring decision transparency: After applying, an applicant may not hear back from the employer, depending on the system in place that screens their profile. Often, these responses are not detailed enough to provide the candidate with information on the criteria they did not meet. Therefore, counterfactual fairness will be needed for explainability.

VI. CONCLUSION

Although several advances in AI applications for recruitment have been made and used in current recruitment systems, there has been a lack of fairness studies using these models. This paper presents an overview of the recruitment process, implications of bias in these models, and methods for bias mitigation which can be adopted. Much work has been done in model interpretability and post-processing methods for achieving fairness. However, there is scope for more work in several areas: optimization and pre-processing methods, models to meet occupation-specific requirements, fair job postings to reach a wider audience, and transparency in recruitment decisions. These challenges are an opportunity for organizations and machine learning researchers to transform the hiring process for applicants around the world for the better, leading to more effective workplaces and more motivated workers.

ACKNOWLEDGMENT

This material is based upon work supported by the U.S. National Science Foundation under Grant No. 1936857.

REFERENCES

[1] McKinsey Global Institute, "Jobs Lost, Jobs Gained: Workforce Transitions in a Time of Automation," McKinsey & Company, 2017.
[2] S. Biswas, "The Beginner's Guide to AI in HR," HR Technologist, 5 February 2019. [Online]. Available: https://www.hrtechnologist.com/articles/digital-transformation/the-beginners-guide-to-ai-in-hr/. [Accessed August 2019].
[3] Arya, "GoArya," Arya, 2019. [Online]. Available: https://goarya.com/. [Accessed August 2019].

[4] Google, "Google Hire," Google, 2017. [Online]. Available: https://hire.google.com/. [Accessed August 2019].
[5] HireVue, "HireVue About Us," HireVue. [Online]. Available: https://www.hirevue.com/company/about-us. [Accessed August 2019].
[6] Plum.io, "Plum," Plum. [Online]. Available: https://www.plum.io/how-it-works. [Accessed August 2019].
[7] D. Meyer, "Amazon Reportedly Killed an AI Recruitment System Because It Couldn't Stop the Tool from Discriminating Against Women," Fortune, 11 October 2018. [Online]. Available: https://fortune.com/2018/10/10/amazon-ai-recruitment-bias-women-sexist/. [Accessed August 2019].
[8] N. Lewis, "Will AI Remove Hiring Bias?," SHRM, 12 November 2018. [Online]. Available: https://www.shrm.org/resourcesandtools/hr-topics/talent-acquisition/pages/will-ai-remove-hiring-bias-hr-technology.aspx. [Accessed August 2019].
[9] A. Johnson, "13 Common Hiring Biases To Watch Out For," Harver, November 2018. [Online]. Available: https://harver.com/blog/hiring-biases/. [Accessed September 2019].
[10] S. Corbett-Davies and S. Goel, "The Measure and Mismeasure of Fairness: A Critical Review of Fair Machine Learning," arXiv preprint arXiv:1808.00023, 2018.
[11] S. Barocas and A. D. Selbst, "Big Data's Disparate Impact," California Law Review, vol. 104, no. 3, pp. 1-57, 2014.
[12] M. B. Zafar, I. Valera, M. Gomez Rodriguez and K. P. Gummadi, "Fairness Beyond Disparate Treatment & Disparate Impact: Learning Classification without Disparate Mistreatment," in Proceedings of the 26th International Conference on World Wide Web, International World Wide Web Conferences Steering Committee, 2017, pp. 1171-1180.
[13] K. Holstein, J. Wortman Vaughan, H. Daume III, M. Dudik and H. Wallach, "Improving Fairness in Machine Learning Systems: What Do Industry Practitioners Need?," in Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, 2019.
[14] N. Grgic-Hlaca, M. B. Zafar, K. P. Gummadi and A. Weller, "The Case for Process Fairness in Learning: Feature Selection for Fair Decision Making," in NIPS Symposium on Machine Learning and the Law, 2016.
[15] I. Zliobaite, "A Survey on Measuring Indirect Discrimination in Machine Learning," arXiv preprint arXiv:1511.00148, 2015.
[16] A. Krishnakumar, "Assessing the Fairness of AI Recruitment Systems," 2019.
[17] Z. Zhong, "A Tutorial on Fairness in Machine Learning," Medium, 21 October 2018. [Online]. Available: https://towardsdatascience.com/a-tutorial-on-fairness-in-machine-learning-3ff8ba1040cb.
[18] M. Hardt, "CS 294: Fairness in Machine Learning," August 2017. [Online]. Available: https://fairmlclass.github.io/. [Accessed August 2019].
[19] EEOC, "Adoption of Questions and Answers To Clarify and Provide a Common Interpretation of the Uniform Guidelines on Employee Selection Procedures," Eeoc.gov, 1979. [Online]. Available: https://www.eeoc.gov/policy/docs/qanda_clarify_procedures.html. [Accessed September 2019].
[20] M. B. Zafar, I. Valera, M. Gomez-Rodriguez and K. P. Gummadi, "Fairness Constraints: A Flexible Approach for Fair Classification," Journal of Machine Learning Research, vol. 20, no. 75, pp. 1-42, 2019.
[21] J. E. Hunter, "Cognitive Ability, Cognitive Aptitudes, Job Knowledge, and Job Performance," Journal of Vocational Behavior, vol. 29, no. 3, pp. 340-362, 1986.
[22] M. R. Barrick and M. K. Mount, "The Big Five Personality Dimensions and Job Performance: A Meta-Analysis," Personnel Psychology, vol. 44, no. 1, pp. 1-26, 1991.
[23] N. Grgić-Hlača, M. Zafar, K. Gummadi and A. Weller, "Beyond Distributive Fairness in Algorithmic Decision Making: Feature Selection for Procedurally Fair Learning," in Thirty-Second AAAI Conference on Artificial Intelligence, 2018.
[24] S. Verma and J. Rubin, "Fairness Definitions Explained," in 2018 IEEE/ACM International Workshop on Software Fairness (FairWare), 2018.
[25] C. Dwork, M. Hardt, T. Pitassi, O. Reingold and R. Zemel, "Fairness Through Awareness," in Proceedings of the 3rd Innovations in Theoretical Computer Science Conference, 2012.
[26] A. B. Holm, "E-recruitment: Towards an Ubiquitous Recruitment Process and Candidate Relationship Management," German Journal of Human Resource Management, vol. 26, no. 3, pp. 241-259, 2012.
[27] J. D. Shapka, J. F. Domene, S. Khan and L. M. Yang, "Online Versus In-Person Interviews with Adolescents: An Exploration of Data Equivalence," Computers in Human Behavior, vol. 58, pp. 361-367, 2016.
[28] D. S. Chapman and J. Webster, "The Use of Technologies in the Recruiting, Screening, and Selection Processes for Job Candidates," International Journal of Selection and Assessment, vol. 11, pp. 113-120, 2003.
[29] A. Snell, "Researching Onboarding Best Practice: Using Research to Connect Onboarding Processes with Employee Satisfaction," Strategic HR Review, vol. 5, no. 6, pp. 32-35, 2006.
[30] T. Wang, J. Zhao, K.-W. Chang, M. Yatskar and V. Ordonez, "Adversarial Removal of Gender from Deep Image Representations," arXiv preprint arXiv:1811.08489, 2018.
[31] R. K. Bellamy, K. Dey, M. Hind, S. C. Hoffman, K. Kannan, P. Lohia, J. Martino, S. Mehta, A. Mojsilovic et al., "AI Fairness 360: An Extensible Toolkit for Detecting, Understanding, and Mitigating Unwanted Algorithmic Bias," arXiv preprint arXiv:1810.01943, 2018.
[32] S. W. Gilliland, "The Perceived Fairness of Selection Systems: An Organizational Justice Perspective," Academy of Management Review, vol. 18, no. 4, pp. 694-734, 1993.
[33] Tensorflow, "Lucid," 2018. [Online]. Available: https://github.com/tensorflow/lucid. [Accessed August 2019].
[34] F. Tramer, V. Atlidakis, R. Geambasu and D. Hsu, "FairTest: Discovering Unwarranted Associations in Data-Driven Applications," 2015. [Online]. Available: https://github.com/columbia/fairtest. [Accessed August 2019].
[35] J. A. Adebayo, "FairML: ToolBox for Diagnosing Bias in Predictive Modeling," 2016. [Online]. Available: https://github.com/adebayoj/fairml. [Accessed August 2019].
[36] N. Bantilan, "Themis-ml: A Fairness-aware Machine Learning Interface for End-to-end Discrimination Discovery and Mitigation," 2017. [Online]. Available: https://github.com/cosmicBboy/themis-ml. [Accessed August 2019].
[37] H. Nori, S. Jenkins, P. Koch and R. Caruana, "InterpretML: A Unified Framework for Machine Learning Interpretability," 2019. [Online]. Available: https://github.com/microsoft/interpret. [Accessed August 2019].
[38] M. T. Ribeiro, S. Singh and C. Guestrin, "Why Should I Trust You?: Explaining the Predictions of Any Classifier," 2016. [Online]. Available: https://github.com/marcotcr/lime. [Accessed August 2019].
[39] S. M. Lundberg and S.-I. Lee, "SHAP," 2017. [Online]. Available: https://github.com/slundberg/shap. [Accessed August 2019].
[40] B. Bengfort, R. Bilbro, N. Danielsen, L. Gray, K. McIntyre, P. Roman, Z. Poh et al., "Yellowbrick," 14 November 2018. [Online]. Available: http://www.scikit-yb.org/en/latest/. [Accessed August 2019].
[41] TeamHG-Memex, "ELI5," 2016. [Online]. Available: https://github.com/TeamHG-Memex/eli5. [Accessed August 2019].
[42] Oracle, "Skater," 2019. [Online]. Available: https://github.com/oracle/Skater. [Accessed August 2019].
[43] CSIRO's Data61, "StellarGraph Machine Learning Library," 2018. [Online]. Available: https://github.com/stellargraph/stellargraph. [Accessed August 2019].
[44] M. T. Ribeiro, S. Singh and C. Guestrin, "Anchors: High-Precision Model-Agnostic Explanations," 2018. [Online]. Available: https://github.com/marcotcr/anchor. [Accessed August 2019].
[45] ModelOriented, "DALEX," 2019. [Online]. Available: https://github.com/ModelOriented/DALEX. [Accessed August 2019].
[46] H2O.ai, "Machine Learning Interpretability (MLI)," 2017. [Online]. Available: https://github.com/h2oai/mli-resources. [Accessed August 2019].
[47] P. Saleiro, B. Kuester, A. Stevens, A. Anisfeld, L. Hinkson, J. London and R. Ghani, "Aequitas: A Bias and Fairness Audit Toolkit," 2018. [Online]. Available: https://github.com/dssg/aequitas. [Accessed August 2019].

[48] P. Adler, C. Falk, S. A. Friedler, T. Nix, G. Rybeck, C. Scheidegger, B. Smith and S. Venkatasubramanian, "Auditing Black-box Models for Indirect Influence," 2018. [Online]. Available: https://github.com/algofairness/BlackBoxAuditing. [Accessed August 2019].
[49] Microsoft, "Reductions for Fair Machine Learning," 2016. [Online]. Available: https://github.com/microsoft/fairlearn. [Accessed August 2019].
[50] R. Elshawi, Y. Sherif, M. Al-Mallah and S. Sakr, "Interpretability in HealthCare: A Comparative Study of Local Machine Learning Interpretability Techniques," in 2019 IEEE 32nd International Symposium on Computer-Based Medical Systems (CBMS), 2019.
[51] A. Zagalsky, J. Feliciano, M.-A. Storey, Y. Zhao and W. Wang, "The Emergence of GitHub as a Collaborative Platform for Education," in Proceedings of the 18th ACM Conference on Computer Supported Cooperative Work & Social Computing, 2015.
