Teaching Machines to Play Fair

Machine Learning

By Daniel McNamara, 2016 Fulbright Postgraduate Scholar in Computer Science

I had the good fortune to visit the Machine Learning Department at Carnegie Mellon University in Pittsburgh as a Fulbright Postgraduate Scholar in 2016-17. During my eight months there I had the opportunity to learn from a technically brilliant group of academics and students who are designing algorithms that will shape the future.

As part of my Fulbright program, I participated in an Enrichment Seminar on Civil Rights in the United States. The seminar was held in Atlanta – birthplace of the Civil Rights movement and to this day an important centre of African-American political activity. More than 100 Fulbrighters from around the world heard from African-American leaders and activists, visited the Civil Rights Museum and Martin Luther King’s tomb, and ran a story-telling workshop at a local school.

Algorithmic decision-making and fairness

Joining the dots between seemingly disparate subjects – the latest technical developments in machine learning and the political struggle for racial equality in the US – I became interested in the challenge of ensuring that decisions made by machines are fair. This has become an important and rapidly emerging field of research, attracting attention from scholars in machine learning, law, philosophy and other disciplines. And everyone else – whose lives are increasingly influenced by decisions made by algorithms – is keenly awaiting solutions.

Machine learning systems are now widely used to make all kinds of decisions about people’s lives – such as whether to grant someone a loan, interview someone for a job or provide someone with insurance. There is a risk that these algorithms may be unfair in some way, for example by discriminating against particular groups. Even if the algorithm designer never intended it, discrimination is possible because the reasoning behind the algorithm’s decisions is often difficult for humans to interpret. Furthermore, artefacts of previous discrimination present in the data used to train the algorithm may be reproduced, or even amplified, in the algorithm’s decisions.

An interesting example is recidivism risk scores, which are widely used in criminal sentencing in the United States. The news organisation ProPublica investigated one such system, COMPAS. The investigation showed that black defendants received higher risk scores than white defendants. Furthermore, the risk scores made different kinds of errors for defendants of different races. Black defendants were overrepresented among the false positives: people who were given a high risk score but did not re-offend within the following two years. Conversely, white defendants were overrepresented among the false negatives: people who were given a low risk score but did in fact re-offend.
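
These error rates are simple to compute from the underlying data. As a rough illustration, the short Python sketch below calculates a false positive rate and a false negative rate separately for each group; the function and the small example numbers are my own invented placeholders, not ProPublica’s analysis or the COMPAS data.

import numpy as np

def error_rates(high_risk, reoffended):
    # False positive rate: share of people who did not re-offend
    # but were nonetheless given a high risk score.
    # False negative rate: share of people who did re-offend
    # but were given a low risk score.
    high_risk = np.asarray(high_risk)
    reoffended = np.asarray(reoffended)
    fpr = np.mean(high_risk[reoffended == 0])
    fnr = np.mean(1 - high_risk[reoffended == 1])
    return fpr, fnr

# Invented example data: one entry per defendant in each group
print(error_rates(high_risk=[1, 1, 0, 1, 0, 0], reoffended=[1, 0, 0, 1, 1, 0]))
print(error_rates(high_risk=[0, 0, 1, 0, 1, 0], reoffended=[1, 0, 1, 0, 1, 0]))

Comparing these rates between racial groups is the kind of check ProPublica performed, on real court records rather than toy numbers.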

Baking in fairness

To remove or minimise discrimination caused by the use of machine learning systems, fairness may be ‘baked in’ to algorithm design. This approach benefits users of such algorithms, particularly those in social groups that are potential targets of discrimination. Moreover, as rapid technological progress drives disruptive social change and, in turn, resistance to that change, such design decisions will be required if companies using the algorithms are to maintain their ‘social licence to operate’.

Incorporating fairness into algorithm design also has a role to play in the effective regulation of algorithms. For example, fairness is considered in the European Union’s General Data Protection Regulation, which comes into force this year. Paragraph 71 of the preamble states:

In order to ensure fair and transparent processing … the controller should…  secure personal data in a manner … that prevents, inter alia, discriminatory effects on natural persons on the basis of racial or ethnic origin, political opinion, religion or beliefs, trade union membership, genetic or health status or sexual orientation.

The thorny question remains: how do we define fairness? Machine learning researchers are now joining a discussion previously led by philosophers and legal scholars. A seminal study led by the Harvard computer scientist Cynthia Dwork proposed two notions of fairness. Group fairness means making similar decisions for one group compared to another, while individual fairness means making similar decisions for individuals who are similar. The two notions of fairness are potentially in tension: group fairness promotes equal outcomes for each group regardless of the characteristics of the individuals that make up the groups, while individual fairness provides individuals who are similar with equal treatment regardless of their group membership.
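
To make the contrast concrete, here is one simple way the two notions are often turned into measurements – group fairness as the gap in positive-decision rates between two groups, and individual fairness as the number of pairs of similar individuals who received different decisions. The function names and the similarity threshold below are my own illustrative choices, not the formal definitions in Dwork’s paper.

import numpy as np

def group_fairness_gap(decisions, group):
    # Difference in the rate of positive decisions between the two groups;
    # zero means the groups are treated identically on average.
    decisions, group = np.asarray(decisions), np.asarray(group)
    return abs(decisions[group == 0].mean() - decisions[group == 1].mean())

def individual_fairness_violations(features, decisions, similarity_threshold=0.5):
    # Count pairs of individuals who are close together in feature space
    # yet received different decisions.
    features = np.asarray(features, dtype=float)
    decisions = np.asarray(decisions)
    violations = 0
    for i in range(len(decisions)):
        for j in range(i + 1, len(decisions)):
            similar = np.linalg.norm(features[i] - features[j]) < similarity_threshold
            if similar and decisions[i] != decisions[j]:
                violations += 1
    return violations

The tension described above shows up directly in these measurements: pushing the group gap to zero may force different decisions for similar individuals in different groups, and vice versa.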

Transforming data to achieve fairness

Machine learning algorithms learn from previous examples. One approach to ensuring fairness is to transform the data accessed by the algorithm. The idea is to remove information about group membership (e.g. gender, race) from the data to protect particular groups from discrimination. However, it is typically not as simple as removing a single column in the data, since it may be possible to infer group membership from the other columns. A classic example is the historic practice of ‘redlining’, where decision-making based on loan applicants’ neighbourhood was used as a proxy for racial discrimination.
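
A small experiment makes the proxy problem concrete. In the Python sketch below – which uses synthetic data I have made up purely for illustration – the protected column is excluded from the features, yet a standard classifier still recovers group membership almost perfectly from a correlated ‘neighbourhood’ variable.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 2000

# Synthetic population: 'group' is the protected attribute;
# 'neighbourhood' is strongly correlated with it (a proxy), 'income' only weakly.
group = rng.integers(0, 2, size=n)
neighbourhood = group * 10 + rng.normal(0, 2, size=n)
income = 50 + 5 * group + rng.normal(0, 20, size=n)

# The protected column itself is excluded from the features...
X = np.column_stack([neighbourhood, income])
X_train, X_test, g_train, g_test = train_test_split(X, group, random_state=0)

# ...yet group membership can still be predicted from what remains.
clf = LogisticRegression().fit(X_train, g_train)
print('Accuracy recovering the protected attribute:', clf.score(X_test, g_test))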

Machine learning methods such as neural networks may be used to transform data so that membership of a particular protected group can no longer be inferred. For a given data transformation, it is possible to quantify the extent to which it improves group fairness, as well as its impact on individual fairness and the usefulness of the transformed data. This kind of guarantee can allow a regulator to approve the transformed data prior to its use in decision-making. Even if the decision-maker attempts to discriminate against a particular protected group, their ability to do so will be limited because the required information has been removed by the data transformation.
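
As a toy illustration of this idea – a simple linear stand-in for the neural network methods, using the same kind of synthetic data as above and my own invented function names – the sketch below transforms the data by centring each feature within each group, then quantifies group fairness by how poorly the protected attribute can be predicted from the transformed data, and usefulness by how well the original target can still be predicted.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 2000
group = rng.integers(0, 2, size=n)
X = np.column_stack([group * 10 + rng.normal(0, 2, size=n),   # proxy for the protected group
                     rng.normal(0, 1, size=n)])                # feature unrelated to the group
y = (X[:, 1] + rng.normal(0, 0.5, size=n) > 0).astype(int)     # target independent of the group

# Toy 'fair' transformation: centre each feature within each group,
# so group membership is no longer encoded in the feature values.
X_fair = X.copy()
for g in (0, 1):
    X_fair[group == g] -= X[group == g].mean(axis=0)

def predictability(features, labels):
    # Accuracy of a simple classifier; close to 0.5 means 'hard to infer'.
    return LogisticRegression().fit(features, labels).score(features, labels)

print('Protected attribute from original data:   ', predictability(X, group))
print('Protected attribute from transformed data:', predictability(X_fair, group))
print('Target from transformed data:             ', predictability(X_fair, y))

A regulator could run this kind of audit on a proposed transformation: if the protected attribute is easy to predict from the original data but only predictable at close to chance level from the transformed data, most of the group information has been removed, while the final check shows how much of the data’s usefulness has been preserved.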

Code-ready fairness

Traditionally, fairness has been ‘codified’ in rather general terms in our legal system, and enforcement has relied upon a common understanding of fairness across society. But computers lag behind us on qualitative reasoning abilities. Our new challenge is to provide definitions of fairness that are precise enough to be embedded in computer code.