Regulated Algorithms - Do We Need Them?

Have you heard of an algorithm called PredPol? COMPAS? VALCRI? They are all used in the criminal justice system. COMPAS predicts the likelihood of a criminal reoffending and is used by judges in US courts to help decide on a sentence. PredPol and VALCRI are machine learning programs used to help predict and solve crimes. Given a million past police reports, PredPol will tell you where and when the next crime is likely to take place. Given a million case histories and the evidence in your particular case, VALCRI will, in effect, tell you whodunnit.

Read the headlines and you will see that algorithmic decision-making systems like these are becoming increasingly common across society, not just in crime fighting. And this may well be a good thing. At the very least, we might see some innovation in the plotting of murder mysteries. But it's not an open-and-shut case.

In 2016, investigative journalists at ProPublica found that COMPAS's predictions could be right as little as 20% of the time for certain categories of crime. More worrying still, there were racial discrepancies. Black defendants were incorrectly flagged as future offenders almost twice as often as white defendants, while white defendants who did go on to reoffend were flagged as low risk more often than black defendants. The implication is that people have been given punitive sentences based on an algorithm that, held up against a human idea of 'fairness', does not judge fairly.
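
To make that kind of disparity concrete, here is a minimal sketch, on entirely made-up data, of the sort of check ProPublica performed: comparing false positive rates (people flagged as high risk who never reoffended) across groups. The pandas usage, column names, and numbers are illustrative assumptions, not COMPAS's actual inputs or outputs.

```python
# A minimal sketch of a group fairness check: compare false positive rates
# (flagged high risk but did not reoffend) across groups.
# The data below is entirely made up; column names are illustrative only.
import pandas as pd

df = pd.DataFrame({
    "group":      ["black", "black", "black", "black", "white", "white", "white", "white"],
    "high_risk":  [1, 1, 0, 0, 1, 0, 0, 0],   # the algorithm's prediction
    "reoffended": [0, 1, 0, 1, 0, 0, 1, 0],   # what actually happened
})

for group, rows in df.groupby("group"):
    did_not_reoffend = rows[rows["reoffended"] == 0]
    fpr = did_not_reoffend["high_risk"].mean()   # share wrongly flagged
    print(f"{group}: false positive rate = {fpr:.2f}")
```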

The plot thickens. A study found that using predictive policing systems such as PredPol can lead to discriminatory policing. Because of the data used to train them - police reports - programs like these are more likely to predict crime in the areas that appear most often in past reports. If the catalog of police reports were a full record of every crime that had taken place, this would be fine. But it is not. Because of bias in the reporting of incidents (the causes of which are another discussion entirely), results from simulated scenarios suggested that the end result could be higher levels of policing in minority communities, with discriminatory consequences for the people who live in them.
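
To see why biased reports can snowball, here is a toy simulation - my own illustration, not the study's actual model. Two areas have identical underlying crime rates, but one starts with more recorded incidents; because patrols follow the records and patrols generate new records, the initial skew only grows.

```python
# Toy feedback-loop sketch: equal true crime rates, unequal historical records.
# Each round, the patrol goes where the records point, and only patrolled
# crime gets logged, so the initial reporting bias reinforces itself.
import random

random.seed(0)
true_crime_rate = {"A": 0.3, "B": 0.3}   # identical in both areas
recorded = {"A": 60, "B": 40}            # historical reporting bias

for day in range(100):
    patrolled = max(recorded, key=recorded.get)      # follow the records
    if random.random() < true_crime_rate[patrolled]:
        recorded[patrolled] += 1                      # only patrolled crime is logged

print(recorded)   # area A's lead grows even though crime rates are equal
```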

The list goes on, and it extends beyond injustice in the legal system. Take facial recognition software, used in everything from the latest iPhone to security at the next Olympics. A study from MIT found that three commercially available facial recognition tools - from IBM, Microsoft, and Megvii - struggled to accurately determine the gender of dark-skinned women. Rates of incorrect classification were as high as 46.8% in one system - little better than guessing. The flipside, which by now you might be able to guess, is that in every system the error rate for light-skinned men was no higher than 0.8%.

So as the prevalence of machine learning algorithms in our lives grows, it would be nice to have a system in place that doesn’t rely on truth-seeking journalists and academics. But how to start?

The word transparency comes to mind: "show us your code and we'll trust it!". But with so much competition in the industry, this is unlikely to gain much traction with businesses. However, with GDPR now in full force, companies using personal data have to allow people their 'right to explanation'. What this means for machine learning is a difficult question. For instance, how do you explain how a deep learning algorithm has come to a particular decision when it has been trained on a huge mass of data and then left to its own devices? Would such an explanation even be useful?
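
To give a flavour of what an 'explanation' might look like in practice, here is one common (and admittedly crude) approach, sketched with scikit-learn on synthetic data: shuffle each input feature and see how much the model's performance drops. It tells you which features mattered, not why - which hints at how thin such explanations can be.

```python
# A sketch of permutation importance: measure how much the model degrades
# when each feature is shuffled. Synthetic data; illustrative only.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

X, y = make_classification(n_samples=500, n_features=4, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X, y)

result = permutation_importance(model, X, y, n_repeats=10, random_state=0)
for i, score in enumerate(result.importances_mean):
    print(f"feature {i}: importance = {score:.3f}")
```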

Another avenue to explore is the idea of a regulatory body that would somehow audit these programs. In the same way we trust the Financial Conduct Authority to watch over our financial system, we could trust this body to watch over the algorithms shaping our lives. It would mean more than just checking for malevolent lines of code. As the examples in this article show, many issues arise from the data sets used to train an algorithm, so these would need to be audited as well. We might have to wait a while before the ICO has time to look into the feasibility of such an organisation.
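
As a flavour of what such a data-set audit could involve, here is a minimal sketch that simply counts how well each demographic group is represented in a hypothetical training set. The group labels and the 10% threshold are illustrative assumptions on my part, not any real regulatory standard.

```python
# A sketch of the most basic data-set audit a regulator might run:
# check group representation before a model is ever trained.
# Labels and the 10% threshold are illustrative, not a real standard.
from collections import Counter

training_labels = (["light_male"] * 800 + ["light_female"] * 120 +
                   ["dark_male"] * 60 + ["dark_female"] * 20)

counts = Counter(training_labels)
total = sum(counts.values())
for group, n in counts.items():
    share = n / total
    flag = "  <-- under-represented" if share < 0.10 else ""
    print(f"{group}: {share:.1%}{flag}")
```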

So far, no one has tried to end the world with a doomsday deep learning algorithm. But some programs, created with good intentions, have ended up doing unintended damage. As more of these programs are born into society, encompassing everything from the phones in our pockets to the security cameras watching us in the street, it would be nice to know that someone is keeping them in check.

by Arthur Duffen

Posted on October 02, 2018