More and more companies are automating processes with the help of ML. This has tremendous advantages since an algorithm can scale in an almost unlimited fashion and in case rich datasets are available, can often detect patterns in data that humans cannot discover. As a well-known illustration, think about how Deepmind used a ML algorithm to optimize energy consumption of Google's data centers. When algorithms are used to make important decisions that impact people's lives, such as deciding on medical treatment, granting a loan, or performing risk assessments in parole hearings, it is of paramount importance that the algorithm is fair. Because the models become ever more complicated, this is however not easy to assess. As a consequence, both the public, legislators and regulators are aware of this issue; see e.g. a report on algorithmic systems, opportunities and civil rights by the Obama administration and (in Europe) recital 71 of GDPR.
In order to detect bias/unfairness it is important to come with a proper (mathematical) definition to make sure that we can measure deviations. Below we describe a few alternatives.
Let us assume we are building a model to determine whether somebody may receive a loan. Typically such model will use information (so-called attributes) such as your credit history, marital status, education, profession, etc. in order to estimate the probability that you will be able to pay off your debts. The dataset will also include historical data on defaults (i.e. people who were not able to repay their loan). Now, given the above, the financial institution wants to make sure that the algorithm is fair with respect to gender, skin colour, etc. These are called protected attributes.
A traditional solution is to simply remove these protected attributes from the dataset altogether. Of course, when testing such model, simply changing the value of any of the protected attributes is not going to impact to output of the model. However, through so-called redundant encodings an algorithm might be able to guess the value of the protected attributes from other information. As an example, let us assume we train our credit model on a dataset representing the people of Verona. The dataset has a protected attribute called ancestor that can take two values (Capulet and Montague). If people whose ancestor attribute equals Capulet live primarily in the Eastern part of the city while the Montagues live in the Western half, then our algorithm can infer the ancestor attribute (statistically) by looking at the address. Hence, simply removing the ancestor attribute from the dataset does not help.
Another approach is so-called demographic parity. In that case one requires that the membership of a protected attribute is uncorrelated with the output of the algorithm. Let us assume that granting a loan is indicated by a target binary variable Y = 1 (the ground truth) and that the protected (binary) attribute is called A. The forecast of the model is called Z. Demographic parity then means that
Pr[ Z = 1 | A = 0] = Pr[ Z=1 | A=1]
This notion however is also flawed. A key issue with the approach is that if the ground truth (i.e. default in our example) does depend on the protected attribute, the perfect predictor (i.e. Y=Z) cannot be reached and therefore the utility and predictive power of the model reduce. Moreover, by requiring demographic parity, the model has to yield (on average) the same outcome for the different values of the protected attribute. In our example, demographic parity would imply that the model would have to refuse good candidates from one category and accept bad candidates from the other category in order to reach the same average level. Concretely, assuming that the Montagues historically have defaulted on 3% of their loans while the Capulets only on 1%, demographic parity would typically be in the disadvantage of the Capulets.
A more subtle suggestion for fairness was proposed by Hardt et al:
Pr[Z=1 | Y=y, A=0] = Pr[Z=1 | Y=y, A=1]
for y=0,1. In other words, relatively speaking the model has to be right (or wrong) as often for either value of the protected attribute. This definition incentivises more accurate models and the property is called oblivious as it depends on the joint distribution of A,Y and Z.
In this short post, we have reviewed a few different notions of fairness, focussing mostly on the simplest case where the protected attribute and the ground truth are both binary variables. The oblivious property introduced above is an interesting approach that can be easily computed. The simple (binary) set-up we discussed can be extended to the multinomial and continuous case and as such can serve as an accurate test to detect fairness in algorithms.