In 1986, when Elie Wiesel, the Holocaust survivor, accepted the Nobel Peace Prize, he told the assembled audience, ‘We must take sides. Neutrality helps the oppressor, never the victim. Silence encourages the tormentor, never the tormented.’
This could be a lesson for all of those involved in data science and machine learning (ML). The tools being built are not neutral. They can be, and often are, unfair in all sorts of ways. ML automates the process of discovering patterns and regularities in data. It makes predictions about an outcome of interest based on structures within data sets. Outcomes are treated as a function of a set of observed characteristics. What makes ML different from human decision-making is the complexity of the functions and the enormous number of characteristics those functions take into account.
According to one view, all ML is discrimination. The purpose of training models on enormous data sets is to identify statistical bases on which to distinguish between outcomes of interest. ML is best thought of not as a single technology, but more as a set of techniques and methods for discrimination – a way of solving problems of prediction or choice.
Thinking in terms of techniques draws attention to the fact that ML is a process. As Cynthia Dwork puts it, ‘both the data and the algorithms’ reflect choices: ‘about data, connections, inferences, interpretations, and thresholds’. These choices can be broken down into the following categories:
Defining the Target Variable
ML is about turning a vague problem in the real world into a specific question about the value of a target variable. To take an obvious example, suppose we are building a model to evaluate an individual’s creditworthiness. What makes someone creditworthy? Creditworthiness is a contextual concept, defined by the risk appetite of various financial institutions.
We cannot decide which interpretation is correct simply by looking at the data. What matters is domain knowledge. We have to understand the task at hand to ask the difficult questions: Why are we seeking to identify creditworthiness? What do we mean by it? How does the computer operationalize this definition? In doing so, how might it build in unfairness?
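To make the operationalization point concrete, here is a minimal sketch (all field names, records, and thresholds are hypothetical) showing how two different definitions of the “creditworthy” target variable assign opposite labels to the very same applicants:

```python
# Hypothetical sketch: the same applicant records yield different
# "creditworthy" labels depending on how the target variable is defined.
# All field names, records, and thresholds are illustrative assumptions.

applicants = [
    {"id": 1, "days_late_max": 0,   "lender_profit": -50},   # repaid promptly, unprofitable
    {"id": 2, "days_late_max": 120, "lender_profit": 400},   # paid late, but profitable
]

def label_repayment(a):
    # Definition A: creditworthy = never more than 90 days late.
    return a["days_late_max"] <= 90

def label_profit(a):
    # Definition B: creditworthy = profitable for the lender.
    return a["lender_profit"] > 0

for a in applicants:
    print(a["id"], label_repayment(a), label_profit(a))
# Applicant 1 is creditworthy under A but not B; applicant 2 is the reverse.
```

The choice between definitions cannot be read off the data; it is a design decision, and each choice builds a different notion of fairness into the model.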
Data Generation Process
Training data can itself be generated by an unfair process. When a model is trained on this data, it will likely learn to propagate this unfairness. For instance, comments which use the N-word may have incorrectly been labelled as hate speech, whereas in fact, the presence of the N-word is not always an indicator of hate speech. Whether the use of the N-word constitutes hate speech depends on the context within which it is used.
Representativeness of Data
Another way training data can generate unfairness in a model is when one group is underrepresented in the data relative to another. A model trained on such data learns the patterns of the well-represented group and tends to perform worse for the underrepresented one.
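A small simulation can illustrate the point. In this sketch (the group sizes and score distributions are invented for illustration), a single decision threshold is chosen to minimize overall error on pooled data dominated by the majority group, and the minority group ends up with a markedly higher error rate:

```python
import numpy as np

rng = np.random.default_rng(0)

# Majority group A (900 samples) and minority group B (100 samples).
# The score that best separates classes sits at a different point for
# each group; all distributions here are illustrative assumptions.
xA = np.concatenate([rng.normal(0, 1, 450), rng.normal(2, 1, 450)])
yA = np.concatenate([np.zeros(450), np.ones(450)])
xB = np.concatenate([rng.normal(1, 1, 50), rng.normal(3, 1, 50)])
yB = np.concatenate([np.zeros(50), np.ones(50)])

x = np.concatenate([xA, xB])
y = np.concatenate([yA, yB])

# Pick the single threshold that minimizes overall error on the pooled data.
candidates = np.linspace(-2, 4, 601)
errors = [np.mean((x > t) != y) for t in candidates]
t_best = candidates[int(np.argmin(errors))]

err_A = np.mean((xA > t_best) != yA)   # majority error rate
err_B = np.mean((xB > t_best) != yB)   # minority error rate (higher)
print(f"threshold={t_best:.2f}  error A={err_A:.2%}  error B={err_B:.2%}")
```

Because group A dominates the pooled objective, the fitted threshold sits near A’s optimum, and B pays the price.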
Insufficient Features in Data
If models are trained on data sets which have limited features, these features can carry more meaning for one group than another. For instance, consider a hiring algorithm that was trained on data which included information about whether employees took a leave of absence, but not whether that leave was voluntary (such as parental leave) or involuntary (for underperforming employees). The resulting model may unfairly discriminate against parents who have recently taken a leave of absence.
Model Includes Membership of Sensitive Groups
If a model includes membership of sensitive groups, it may learn to treat individuals from those groups differently. If an individual would have received a more favorable classification or treatment had they belonged to a different sensitive group, that is straightforward discrimination.
Much more common – and more complex – is when a model is trained on data that includes proxies for sensitive group membership. Information about an individual’s race or class or gender can be, and often is, redundantly encoded in a data set.
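A minimal sketch of redundant encoding, using an invented zip-code proxy: even when the sensitive attribute itself is dropped from the data, a strongly correlated feature lets it be recovered almost exactly.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative assumption: a "zip_code" feature is highly correlated with
# group membership, so dropping the sensitive attribute does not remove
# the information from the data set.
n = 10_000
group = rng.integers(0, 2, n)  # sensitive attribute (0 or 1), later dropped

# 90% of each group lives in that group's predominant zip code.
zip_code = np.where(rng.random(n) < 0.9, group, 1 - group)

# A trivial "model" that never sees `group` can still recover it
# from the proxy feature alone.
recovered = zip_code
accuracy = np.mean(recovered == group)
print(f"group recovered from zip code alone: {accuracy:.1%}")
```

This is why simply deleting the sensitive column (“fairness as blindness”) is rarely enough: the model can rediscover the group through its proxies.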
Data can, and often does, reflect underlying injustices in society. When that happens, ML models can unwittingly reflect the injustice that exists out there in the world. But as soon as those models make choices (predictions, classifications, or rankings), they reinforce that existing injustice. To put this in Elie Wiesel’s terms: in an unfair world, neutrality is an illusion. Neutrality reinforces unfairness.
What matters is to be clear about the choices made at each stage of the ML process: to be clear when choices are made, to know which choices are made, and to know why they are made.
The pie chart of terms shown below refers to several different things. The warm colors represent terms that are focused on outcomes, that is, on the consequences of certain decision processes or choices (consequentialist concepts). The cool colors represent terms that are focused on the decision process itself, on whether it follows certain desired rules or constraints (deontological concepts).
Based on this chart, here are some over-arching, general, and entirely fallible definitions:
Equality: Ensuring that all individuals or groups have access to the same resources.
Equity: Ensuring that all individuals or groups have the same opportunity to achieve the same outcomes. This might require giving unequal resources to individuals or groups who start from a disadvantaged position.
Fairness: Treating individuals or groups with an even hand. Similarly situated individuals or groups should be treated similarly.
– Procedural Fairness: The decision process itself must treat individuals or groups with an even hand, applying the same rules and criteria to everyone similarly situated.
– Substantive Fairness: Decisions or choices must actively promote – not undermine – equal opportunities for similarly situated individuals or groups.
Justice: An individual’s success and welfare in life should follow from their own efforts and choices, not morally arbitrary features of their environment (birth, gender, race, neighborhood, etc).
Given the technology’s role in society, fairness raises the following open question: Can we focus on fairness in ML without also focusing on justice?
Here are four definitions of fairness drawn from the computer science literature, each grounded in a common and intuitive philosophical notion of fairness.
Fairness as Blindness
No decision should be made on the basis of a morally irrelevant trait. Fairness should be blind with respect to protected traits. (See: Nina Grgic-Hlaca et al., “The Case for Process Fairness in Learning: Feature Selection for Fair Decision Making,” Symposium on Machine Learning and the Law (2016).)
Counterfactual Fairness
A decision is fair to an individual if it is the same in (a) the actual world and (b) a counterfactual world where the individual belonged to a different demographic group. (See: Matt J. Kusner, Joshua R. Loftus, Chris Russell, and Ricardo Silva, “Counterfactual Fairness” (2017).)
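The counterfactual test can be sketched as follows. This toy check simply flips the sensitive attribute and compares outputs; the full definition in Kusner et al. additionally requires a causal model, so that features causally downstream of the attribute change too. The scoring function here is a hypothetical stand-in:

```python
# Toy counterfactual check: flip the sensitive attribute, hold everything
# else fixed, and compare the model's outputs. The scoring rule below is
# a hypothetical stand-in, not any real lender's model.

def score(income, sensitive):
    # Hypothetical model with an explicit penalty on the sensitive attribute.
    return income * 0.001 - (5 if sensitive == 1 else 0)

def counterfactually_fair(model, features, sensitive):
    """True if the prediction is unchanged when the sensitive attribute flips."""
    return model(features, sensitive) == model(features, 1 - sensitive)

print(counterfactually_fair(score, 50_000, sensitive=1))  # False: the flip changes the score
```

A model that ignores the attribute (and all of its causal descendants) passes the check; the penalized model above fails it.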
Equality of Opportunity
One’s membership of a social group – e.g. race or gender – shouldn’t affect the outcome of the decision. (See: Moritz Hardt, Eric Price, and Nathan Srebro, “Equality of Opportunity in Supervised Learning” (2016).)
Fairness through Awareness
A decision should give similar predictions to similar individuals. (See: Cynthia Dwork et al., “Fairness through Awareness” (2012).)
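Dwork et al. formalize “similar predictions for similar individuals” as a Lipschitz condition: the distance between two individuals’ predictions should be bounded by the distance between the individuals themselves under a task-specific metric. A toy check, with an invented metric and model:

```python
import math

# Sketch of the Lipschitz condition behind "fairness through awareness":
# |model(a) - model(b)| <= L * d(a, b). The metric and the models below
# are illustrative assumptions, not part of the original paper.

def distance(a, b):
    # Task-specific similarity metric over feature vectors (here: Euclidean).
    return math.dist(a, b)

def lipschitz_fair(model, a, b, L=1.0):
    """True if predictions for a and b differ by at most L times their distance."""
    return abs(model(a) - model(b)) <= L * distance(a, b)

smooth = lambda x: 0.5 * x[0] + 0.2 * x[1]   # small weights: satisfies the bound
jumpy = lambda x: 10.0 * x[0]                # large weight: violates it

print(lipschitz_fair(smooth, (1.0, 2.0), (1.1, 2.0)))  # True
print(lipschitz_fair(jumpy, (0.0, 0.0), (1.0, 0.0)))   # False
```

The hard part in practice is not the inequality but agreeing on the metric: what makes two individuals “similar” is itself a normative choice.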
Not all these fairness conditions can be satisfied at once. For instance, the prevalence of recidivism differs across racial groups. One cannot therefore satisfy both the equality of opportunity constraint and predictive parity across those groups.
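The arithmetic behind this tension is short. If two groups share the same true- and false-positive rates but differ in prevalence (base rate), Bayes’ rule forces their positive predictive values apart, so predictive parity must fail. The rates below are illustrative:

```python
# With equal true- and false-positive rates across groups but different
# base rates (prevalence), predictive parity (equal PPV) cannot hold.
# All rates below are illustrative assumptions.

def ppv(tpr, fpr, prevalence):
    # Positive predictive value via Bayes' rule: P(actual+ | predicted+).
    return (tpr * prevalence) / (tpr * prevalence + fpr * (1 - prevalence))

tpr, fpr = 0.8, 0.2           # identical error rates for both groups
print(ppv(tpr, fpr, 0.5))     # group with 50% prevalence -> 0.8
print(ppv(tpr, fpr, 0.2))     # group with 20% prevalence -> 0.5
```

Equalizing the error rates made the predictions mean different things for the two groups, which is exactly the trade-off the impossibility results describe.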
This suggests a question: Should you aim at equal false positive and false negative rates, but abandon predictive parity? The answer, of course, is that it depends on context.
Individual life chances are profoundly shaped by a whole range of factors beyond their control, fixed by virtue of a person’s circumstances and not by their choices. The tools do not exist in a vacuum: they influence the world they are a part of. Another way of putting this, in terms of the definitions above, is that it might not be possible to be at once procedurally and substantively fair.
Given this, there are two central questions to think about when evaluating the fairness of a model:
- Which fairness constraint should I favor?
  a. What does this algorithm classify/rank/allocate?
  b. Who does this algorithm affect? Is race likely to matter here? Or gender? Or political ideology?
  c. How do the choices this algorithm will make shape the world?
- Do I want to achieve fairness only? Or do I also need to think about justice?
  a. Am I satisfied if the model is procedurally fair? Will procedural fairness be substantively unfair?
  b. Do I want to combat rather than reflect injustice?
  c. Do I want to reflect an unfair world or change it?
“…and let not hatred of a people incite you not to act equitably; act equitably, that is nearer to piety, and be careful of (your duty to) God; surely God is aware of what you do.” (Qur’an, 5:8)