These issues arise from a constellation of causes. Some are social at root: if you train a machine learning algorithm on data created by biased humans, you’ll get a biased algorithm. Some are simply statistical artifacts: if you train a machine learning algorithm to find the best fit for the overall population, then to the extent that minorities differ in a relevant way, their classifications or recommendations will necessarily have poorer fit. And some are a combination of the two: biased humans lead to biased algorithms that make recommendations which reinforce unjustified stereotypes. For instance, harsher policing of poorer neighborhoods leads to more crime reports in those neighborhoods. More crime reports trigger policing analytics to recommend deploying more cops to those neighborhoods, and voila! You have a nasty feedback loop. The trouble is that it’s not at all clear how to make algorithms fair. And in this regard, conversations about algorithmic fairness have been a magnifying mirror on society’s ethics. Debates over how to define and measure fairness reflect broader ethical conversations taking place today.
Three conceptualizations of fairness
Certain group labels should be off limits. This mode of thought maintains that algorithms should not be allowed to take certain protected categories into account when making predictions. For instance, in this view, algorithms used to predict loan qualifications or recidivism should not be allowed to base predictions on race or gender. This approach to achieving fairness is straightforward and easy to understand. But there are two main problems:
1. Distinguishing between acceptable and unacceptable proxies of protected categories. Even when such categories are eliminated from an algorithm, the statistical variance explained by these protected categories tends to slip into other available variables. For instance, while race might be excluded from loan applications, zip code, which tends to be highly correlated with race, can take on a higher predictive weight in the model and mask discrimination. For all intents and purposes, zip code becomes the new race variable. It’s challenging and debatable which proxies are illegitimate substitutes for protected categories, and which are acceptable, distinct variables. This fuzzy line brings us to the other problem with making certain labels “off-limits.”
2. The societal (and sometimes personal) costs are high. Protected categories often have a meaningful relationship to the behaviors that the algorithms are designed to predict. For instance, it is commonly known that insurance premiums are higher for male drivers, because male drivers really do account for more of the total insurance payouts. Eliminating gender from these algorithms would cause car insurance premiums to decrease for men, but it would increase the rates for women. Whether women should be required to pay for more than their share of risk so that gender can be eliminated from risk algorithms is debatable. In short, while this may create exact equality, it seems to miss the mark of what is proportionally equitable. Some would argue this approach is actually unfair.
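The proxy problem from point 1 can be made concrete with a small sketch. Everything here is hypothetical: the two groups, the two zip codes, the repayment counts, and the 0.75 approval rule are invented purely for illustration, and the "model" is just a per-zip repayment rate rather than anything a real lender would use. The point is only that a rule which never reads the group label can still split applicants along group lines.

```python
from collections import defaultdict

# Hypothetical toy data: (group, zip_code, repaid_loan). Group A lives mostly
# in zip 11111 and group B mostly in zip 22222.
applicants = (
    [("A", "11111", True)] * 40 + [("A", "11111", False)] * 10
    + [("A", "22222", True)] * 8 + [("A", "22222", False)] * 2
    + [("B", "11111", True)] * 8 + [("B", "11111", False)] * 2
    + [("B", "22222", True)] * 24 + [("B", "22222", False)] * 16
)


def repayment_rate_by_zip(rows):
    """Historical repayment rate per zip code; the group label is never read."""
    totals, repaid = defaultdict(int), defaultdict(int)
    for _group, zip_code, did_repay in rows:
        totals[zip_code] += 1
        repaid[zip_code] += int(did_repay)
    return {z: repaid[z] / totals[z] for z in totals}


def approval_rate_by_group(rows, zip_rates, min_rate):
    """Share of each group approved by a rule that only looks at zip code."""
    totals, approved = defaultdict(int), defaultdict(int)
    for group, zip_code, _did_repay in rows:
        totals[group] += 1
        approved[group] += int(zip_rates[zip_code] >= min_rate)
    return {g: approved[g] / totals[g] for g in totals}


zip_rates = repayment_rate_by_zip(applicants)
# Assumed rule: approve anyone whose zip code repays at least 75% of the time.
approval = approval_rate_by_group(applicants, zip_rates, min_rate=0.75)
print(zip_rates)
print(approval)
```

In this toy data the "blind" rule approves roughly 83% of group A and 20% of group B, and it turns away the group A applicants in zip 22222 even though 8 of those 10 repaid their loans.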
Most people, and the law, agree that basing decisions on protected categories when there is no tangible justification is morally reprehensible. The tough part is when using these protected categories appears to efficiently cut down harmful outcomes. This trade-off has led some to take alternative approaches to defining fairness algorithmically. Is there a way to maximize predictive accuracy (allowing inclusion of meaningful protected categories), while still being fair?
Algorithmic performance should work equally well across certain groups. As opposed to ignoring protected categories like race and gender (i.e. being color or gender blind), this approach to fairness instead argues that indicators of an algorithm’s performance should be equivalent across the protected categories. For example, an algorithm which classifies criminals as either high or low risk of re-offending should make prediction errors at equal rates for white and black criminals. This approach is less intuitive than the color-blind approach, but at least theoretically allows the algorithms to be more efficient in their predictions, and has the added perk of avoiding tricky judgment calls about which proxies (e.g. zip code as a crude substitute for race) are and aren’t acceptable for inclusion in algorithms.
Still, this approach is imperfect. To see why, it’s important to understand that different groups of people will represent distinct populations, each with its own average score, deviation, skew, kurtosis, etc. (see image above, and imagine trying to get one algorithm to perform equally for each group curve using the same cutoff threshold). Generally, when we speak about fairness, we want all people, regardless of their group membership, to be held to the same standards. But if the same cutoff thresholds are used for different populations, predictive ability and error rates are more than likely to differ across groups; this is simply the natural result of how statistics works. Equalizing performance therefore tends to mean treating groups differently, for example by applying a different cutoff to each one. If government regulation compels corporations to turn out algorithms that maintain the same performance across protected groups, corporations and institutions are incentivized to discriminate under the obscuring power of statistical wizardry and employee NDAs.
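To make the statistical point tangible, here is a minimal sketch of one cutoff applied to two populations. The scores, the outcomes, and the 0.5 cutoff are all made up for illustration, and `error_rates` is a hypothetical helper, not part of any particular risk tool.

```python
def error_rates(rows, cutoff):
    """Return (false_positive_rate, false_negative_rate) at a given cutoff.

    rows holds (risk_score, reoffended) pairs; a score >= cutoff is flagged
    as high risk.
    """
    false_pos = sum(1 for score, y in rows if score >= cutoff and not y)
    false_neg = sum(1 for score, y in rows if score < cutoff and y)
    negatives = sum(1 for _score, y in rows if not y)
    positives = sum(1 for _score, y in rows if y)
    return false_pos / negatives, false_neg / positives


# Hypothetical (risk_score, reoffended) pairs for two groups whose score
# distributions differ.
group_a = [(0.2, False), (0.3, False), (0.4, False),
           (0.6, False), (0.7, True), (0.8, True)]
group_b = [(0.3, False), (0.4, True), (0.6, False),
           (0.6, False), (0.7, True), (0.9, True)]

CUTOFF = 0.5  # the same standard applied to everyone
rates_a = error_rates(group_a, CUTOFF)
rates_b = error_rates(group_b, CUTOFF)
print("group A (FPR, FNR):", rates_a)
print("group B (FPR, FNR):", rates_b)
```

With these invented numbers the single cutoff wrongly flags a quarter of group A’s non-reoffenders but two thirds of group B’s, and it misses none of group A’s reoffenders but a third of group B’s. Closing that gap would mean moving one group’s cutoff, which is exactly the different-standards problem.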
Algorithmic scores should represent the same things across members of different groups. A third approach to achieving fairness in algorithms is to ensure that an algorithm’s scores mean equivalent things across protected categories (for instance, a woman who receives a risk score of X on her insurance application should generate similar insurance payouts to a man who also receives a risk score of X on his). On the surface, it would seem that this approach is getting at what we want; it seems fair. The problem is that it cannot guarantee fairness in the presence of intentionally discriminatory action, and thus regulation of algorithms on the basis of this definition of fairness will still leave room for obscured discriminatory treatment. There are at least two ways this can happen:
1. Proxies (like zip code for race) can still be used to gerrymander population scores above or below an algorithm’s cutoff thresholds. For example, individuals at a higher risk of loan defaulting can be paired with individuals at a lower risk of loan defaulting, such that a protected category’s risk scores can be pushed above or below a cutoff threshold at will. This essentially boils down to algorithmic redlining.
2. As discussed above, different groups will have different statistical risk curves. If quantitative scores are discretized (for instance, substituting “high,” “medium,” or “low” labels in place of an individual’s exact score) within groups, these differences in the real risk curves can mask different group cutoffs while maintaining the veneer that individuals labeled “high” risk re-offend, default, and get in car crashes at similar rates across protected (race, gender, etc.) categories. For example, in the image above, assigning a person a “high,” “medium,” or “low” risk label on the basis of their within-group percentile will effectively yield different group cutoff thresholds, while potentially maintaining the same algorithmic performance across those labeled “high” risk for each protected group.
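The second mechanism is easy to see in a few lines. This is only a sketch under stated assumptions: the raw scores are invented, each group’s scores are assumed to be distinct, and "low," "medium," and "high" are assigned by simple terciles within each group. `within_group_labels` and `high_cutoff` are hypothetical helpers.

```python
def within_group_labels(scores):
    """Label each score low/medium/high by its tercile within its own group.

    Assumes the scores in a group are distinct.
    """
    ranked = sorted(scores)
    n = len(ranked)
    labels = {}
    for rank, score in enumerate(ranked):
        if rank < n / 3:
            labels[score] = "low"
        elif rank < 2 * n / 3:
            labels[score] = "medium"
        else:
            labels[score] = "high"
    return labels


def high_cutoff(scores):
    """Smallest raw score that still earns a 'high' label in this group."""
    labels = within_group_labels(scores)
    return min(score for score, label in labels.items() if label == "high")


# Hypothetical raw risk scores; group B's curve sits higher than group A's.
group_scores = {
    "A": [10, 20, 30, 40, 50, 60],
    "B": [40, 50, 60, 70, 80, 90],
}

cutoffs = {group: high_cutoff(scores) for group, scores in group_scores.items()}
label_of_60 = {group: within_group_labels(scores)[60]
               for group, scores in group_scores.items()}
print(cutoffs)
print(label_of_60)
```

Here the same raw score of 60 is labeled “high” in group A and “medium” in group B, because the implied cutoff for “high” is 50 in one group and 80 in the other. Anyone who sees only the labels has no way to notice that the groups are being held to different thresholds.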
Each approach to algorithmically defining fairness has its strengths and weaknesses. I think what’s most troubling is not so much the weaknesses that each approach faces, but instead that these approaches are fundamentally incompatible with one another. We cannot ignore protected categories while using protected categories as the baseline to detect fairness. And we can’t demand similar algorithmic error rates while demanding that similar risk scores actually do entail similar outcomes among groups. The race is still on to define fairness algorithmically. But my background in moral psychology also gives me pause. Democrats, Republicans, and Libertarians can’t agree on what’s fair, and I think it’s too optimistic to treat algorithmic fairness as a purely mathematical or computer science problem. The trouble isn’t solving some complicated statistical Rubik’s Cube, so much as it is trying to manifest Plato’s perfect form of fairness on a cave wall that’s only capable of capturing shadows. It’s hard to predict which solutions we’ll embrace, and what the costs will be when those solutions interact with regulatory and economic incentives. Algorithmic fairness is, at its heart, a socio-moral problem.