
 

Professor Nigel Harvey (Professor of Judgment and Decision Research at University College London) and Mr Tobias Harvey (Student of Law at King's College London) Written evidence (NTL0025)
 

1. Witness Background

1.1. Nigel Harvey is Professor of Judgment and Decision Research in the Department of Experimental Psychology at University College London. He is a past-president of the European Association for Decision Making (EADM) and has served on the editorial boards of the International Journal of Forecasting, the Journal of Behavioral Decision Making and Futures and Foresight Science. Much of his work has dealt with the use of judgment in forecasting. His current research focusses on how people combine their judgment with forecasts produced algorithmically, on the effectiveness of the resulting forecasts, and on ways of improving that effectiveness. He acted as an advisor for the New Zealand Law Foundation’s 2017-2019 project on Artificial Intelligence and Law in New Zealand. It is his knowledge of this area that prompted this submission of written evidence to the inquiry.

1.2. Tobias Harvey is a student of law at King’s College London.

2. Executive summary

There are problems associated with applying advanced algorithmic tools based on machine learning approaches to legal judgment and decision making. They include promulgation of existing biases and lack of transparency. These and other problems, together with some potential solutions to them, are considered by the New Zealand Law Foundation’s project Artificial Intelligence and Law in New Zealand. We refer the Committee to their reports and associated publications and briefly summarize some of their main points.

3. Types of algorithm: An important distinction

3.1. Algorithms are sets of rules that are designed to produce desired outcomes. They are typically implemented by computers and, broadly speaking, are of two types. In conventional (non-advanced) algorithms, the rules are explicit within the computer program. As a result, the rationale for decisions made by such algorithms is transparent.

3.2. More recently, computers have been programmed with more advanced algorithmic tools. These include learning algorithms that use machine learning techniques, often regarded as a type of artificial intelligence (AI). They use information extracted from large historical data sets to identify relations between input variables (e.g., features of a crime and of an individual convicted of it) and output variables (e.g., the sentence given). These learning algorithms (sometimes termed neural networks) use parallel distributed processing techniques that are elaborations of those originally developed during the 1980s. They make decisions in accordance with statistical relations within the historical data used to train them. However, it is difficult to provide reasons why they treat a particular case in a particular way: currently, all but the simplest algorithms of this type lack the transparency that is seen as important by those who are considering using them.
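For illustration only, the following schematic Python sketch shows how such a learning algorithm is fitted to historical cases. The data, feature names, and model choice are invented for the purpose of the sketch and do not correspond to any real system.

import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

# Input variables: invented features of a case (e.g., offence severity,
# number of prior convictions, age at conviction).
X_hist = rng.normal(size=(500, 3))

# Output variable: the historical decision (here, 1 = custodial sentence),
# generated synthetically for the purpose of this sketch.
y_hist = (X_hist[:, 0] + 0.5 * X_hist[:, 1]
          + rng.normal(scale=0.5, size=500) > 0).astype(int)

# A small neural network learns statistical relations in the historical data.
model = MLPClassifier(hidden_layer_sizes=(16, 16), max_iter=2000, random_state=0)
model.fit(X_hist, y_hist)

# The trained model can score a new case, but its fitted weights do not yield
# a human-readable rationale for the prediction.
new_case = np.array([[1.2, -0.3, 0.8]])
print(model.predict_proba(new_case))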

3.3. Explicit algorithms may be difficult to formulate because people vary in their opinions about the rules that should be included in them. However, once those rules are agreed, the decisions those algorithms make are transparent. In contrast, learning algorithms develop their own rules for decision making by analysing historical data derived from decision-makers’ past behaviour: there is little room for debate about what those rules should be. However, the decision-making rules extracted by learning algorithms will rarely be transparent to those using them. They may also promulgate undesirable features of past decision making (e.g., certain types of prejudice) that are present in the historical data.

3.4. This call for evidence focusses on the second class of algorithms, advanced approaches that include learning algorithms based on machine learning and related AI techniques. However, in discussing them, it will occasionally be helpful to compare their features with those that characterize explicit algorithms.

4. Algorithms in legal settings

4.1. Like all judgments, those made within legal contexts suffer from two undesirable features: bias and inconsistency. Bias refers to systematic differences between judgments that are made and those that should be made according to recognised criteria or guidelines (e.g., based on fairness). Inconsistency refers to non-systematic differences in judgment: for example, apparently random differences in judgments made in response to the same case by different people or by the same person at different points in time.

4.2. Empirical investigation of these problems within legal contexts has focussed primarily on bail and sentencing decisions. For example, within the English legal system, magistrates’ bail decisions are governed and guided by the Bail Act 1976 (Cavadino & Gibson, 1993) and its subsequent revisions. However, Dhami & Ayton (2001) consider the law vague and ill-defined because, though it lists factors that magistrates should ‘have regard to’ when assessing whether a defendant will abscond, offend, or obstruct justice, it also indicates that they may take into account any other factors that ‘appear to be relevant’. Given this, it is perhaps unsurprising that, in their study, lay magistrates making bail decisions for the same case on two occasions were often inconsistent with themselves, and that different magistrates disagreed on 25 of the 41 cases they examined. Furthermore, the importance that magistrates thought various factors should have in their decision making differed from the weight they actually gave those factors when making their decisions. For example, magistrates placed much more emphasis on the race and gender of the defendant when actually making bail decisions than they thought was warranted when asked to assess the importance of those factors.

4.3. Elimination of inconsistency in judgments provides a strong argument in favour of use of algorithms. But what about their effects on bias? Learning algorithms based on historical data would preserve bias: in the above example, they would place just as much emphasis on race and gender as real magistrates do when actually making bail decisions. As a result, the amount of emphasis that they place on those factors would be greater than those same magistrates would consider to be appropriate. (In contrast, explicit algorithms could be engineered to ensure that the extent to which those factors are taken into account when bail decisions are made corresponds to the degree to which lawyers consider they should be taken into account.)
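The point can be illustrated schematically. In the following Python sketch (all data and variable names are invented for illustration), historical decisions are generated so that they partly depend on a protected attribute; a model fitted to those decisions learns to weight that attribute and so carries the bias forward.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 2000

risk = rng.normal(size=n)            # a legitimate factor (e.g., offence seriousness)
group = rng.integers(0, 2, size=n)   # a protected attribute (e.g., demographic group)

# Historical decisions that were partly driven by the protected attribute.
p_refuse = 1 / (1 + np.exp(-(1.0 * risk + 0.8 * group - 0.4)))
refused_bail = rng.binomial(1, p_refuse)

# A model fitted to those decisions learns to weight the protected attribute
# too, so the historical bias is reproduced in every decision it informs.
model = LogisticRegression().fit(np.column_stack([risk, group]), refused_bail)
print(dict(zip(["risk", "group"], model.coef_[0].round(2))))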

5. The New Zealand Law Foundation’s project on Artificial Intelligence and Law in New Zealand

As far as we are aware, New Zealand is the first country to consider potential problems associated with introducing advanced algorithmic techniques into legal judgment and decision processes and to suggest some possible approaches to tackling them. The similarity of New Zealand’s legal system to the English one suggests that their reports are relevant to the issues that the use of algorithms raises in this country. For fuller descriptions of their work, we refer the Committee to their reports and other publications arising from them. In particular:

Gavaghan et al. (2019). Government use of artificial intelligence in New Zealand. New Zealand Law Foundation report (data.govt.nz)

New Zealand Government (2020). Algorithm Charter 2020 (data.govt.nz)

Stats NZ (2018). Algorithm assessment report (data.govt.nz)

Liddicoat et al. (2020). The use of algorithms in the New Zealand public sector (otago.ac.nz)

Relevant publications by John Zerilli, one of the co-ordinators and main researchers on the New Zealand Law Foundation project (John Zerilli | Oxford Law Faculty)

In what follows, we briefly summarize the main issues identified by the NZ Law Foundation’s project.

5.1. NZ algorithmic stocktake

In 2018, the NZ government documented the use of 32 different algorithms in various departments and agencies. Those employed in the criminal justice system included the Youth Offending Screening Tool (YORST), the Risk of Re-conviction x Risk of Re-imprisonment (RoC*RoI) predictive model, and the revised Automated Sexual Recidivism Scale (ASRS-R). These can be characterized as conventional (non-advanced) algorithms. At the time of the stocktake, advanced algorithms were not used within the NZ criminal justice system but some were being used in other countries (including the UK and USA). These fell into four broad categories: predictive policing (e.g., PredPol), crime detection (e.g., VALCRI), prosecution decisions (e.g., HART), and post-conviction decisions, including sentencing, parole, and post-sentence detention (e.g., COMPAS). Many of these advanced systems have been developed by private companies. As a result, though they are easily available (to agencies with the required funds), commercial sensitivity compounds their complexity and opacity with secrecy: their code cannot be scrutinised by those who use the systems within the criminal justice system or by those affected by them (e.g., prisoners and their lawyers).

5.2 Features of algorithms

5.2.1. Accuracy Algorithms can take more factors into account in a more systematic way than human judges can. They are less distracted by irrelevant factors. However, because of idiosyncratic correlational features of datasets, actual levels of accuracy may not be high (Larson et al., 2016) and may be only somewhat better than human judgment. As a result, the NZ Algorithm Assessment report states: “Almost all participating agencies use operational algorithms to inform human decision making, rather than to automate significant decisions. Where decisions are automated, these usually relate to automatic approvals or opportunities for people. None of the participating agencies described a circumstance where a significant decision about an individual that was negative, or impacted entitlement, freedom or access to a service was made automatically and without human oversight”.

5.2.2. Objectivity Advanced algorithms are scientifically validated tools that are objective in a manner that unaided human judgment is not. However, the designers of algorithms are human: it is at that stage that human judgment enters into algorithmic decision making. For example, in developing an algorithm to aid the amelioration of homelessness, the data scientist has to decide how best to characterize homelessness in order to measure it. Also, the output of an algorithm may be a probability (e.g., of specified adverse consequences). This probability may be ‘objective’ but a human judge has to decide how to make use of it in making a decision.

5.2.3. Efficiency Algorithms enable decisions to be made much faster than they could be made by human judgment alone. However, Gavaghan et al. (2019) emphasise that this does not merely allow cost-cutting. Routine cases are dealt with more rapidly but this means that human experts can spend more time on difficult borderline cases less suitable for algorithmic treatment – and, indeed, this may increase their job satisfaction.

5.2.4. Transparency Traditional (non-advanced) algorithms are transparent: it is clear to those who have used and been affected by them why they have produced a particular outcome. Except in cases where it may allow an offender to ‘game’ the system, transparency is seen to be an ethically important part of the justice system. Unfortunately, as we have seen, advanced algorithmic tools, especially those based on machine learning, are rarely transparent. They are complex and it is difficult for their users (or their designers) to provide reasons why they have come to particular decisions.

5.2.5. Fairness In advanced algorithms based on machine learning and allied approaches, systems are trained on historical data sets. These data sets may, for example, comprise records of experts’ judgments on many cases. However, if those experts have exhibited biases in the past with respect to any demographic or other variable, those biases will be encapsulated in the resulting algorithm and promulgated anew each time it is used. This may be less of a problem if society’s views about what is fair and what is not remain unchanged over time. If, on the other hand, elements within society are becoming increasingly sensitive to biases, algorithms dependent on use of historical databases may be seen to be unfair. This will be especially so when those biases are seen as unfounded and pernicious (i.e., prejudices).

5.3. Use of algorithms

Gavaghan et al. (2019) identify measures that could be taken to ensure that use of algorithms is effective, ethical, and legal.

5.3.1. Monitoring When a tool is automated, it is no longer directly under human control but it still needs to be monitored. Nuclear power stations are no longer actively controlled but they still need to be monitored. Advanced algorithms can be considered to be automated judgment systems. Most of the time they work well but, occasionally, they will come across a case that they do not deal with fairly or effectively. Thus the introduction of advanced algorithms will mean that experts who previously made judgments will be required to monitor the judgments made by a machine. The skills required for supervision will not be exactly the same as those needed for operation. Experts may acquire new monitoring skills but gradually lose their operating skills (Bainbridge, 1983). Over time, they may become complacent and place too much trust in automated systems (Pazouki et al., 2018).

5.3.1.1. Gavaghan et al. (2019, p. 40) draw attention to specifically legal problems associated with managing automation in the presence of this sort of complacency: ‘Power conferred upon a minister of the Crown, for example, not only must be exercised, it must be exercised by the minister. Public law principles effectively prohibit the delegation of statutory powers to third parties without express or implied authorisation in the decision-maker’s enabling legislation. Likewise, these principles inhibit the authorised decision-maker from “fettering” their discretion, for instance, by blindly following company “policy” or other organizational protocols. … Therefore, to ward off future legal challenges against the use of algorithmic tools that are at risk of inducing complacency, it may be necessary to obtain express statutory authorisation for the “delegation”’.

5.3.2. The right to explanations If governments are to be answerable and accountable to their citizens, they must be able to provide reasons for the decisions that they make. We have seen that, though we can say how an algorithm learns from a large historical data set, it is difficult to say why the resulting trained algorithm reaches a particular decision in a particular case. There are, however, developments within AI that aim to provide explanations for why a machine learning system makes decisions in the way that it does: a second system is trained to reproduce the performance of the first one; this second model can be much simpler than the original one and can be optimised to ensure the explanations that it provides are useful. However, this is work in progress (Edwards & Veale, 2017).
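For illustration only, the following Python sketch shows the general surrogate-model idea under invented data: a much simpler, interpretable model is fitted to reproduce the outputs of a complex one, and its explicit rules are then read as an approximate explanation. It is a schematic sketch, not a description of any particular deployed system.

import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(2)
X = rng.normal(size=(1000, 4))
y = ((X[:, 0] > 0) & (X[:, 2] < 0.5)).astype(int)

# The "opaque" system: a neural network trained on (synthetic) historical data.
opaque = MLPClassifier(hidden_layer_sizes=(32, 32), max_iter=2000,
                       random_state=0).fit(X, y)

# A second, much simpler model is trained to reproduce the opaque model's
# outputs; its explicit rules then serve as an approximate explanation.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(X, opaque.predict(X))
print(export_text(surrogate, feature_names=["f0", "f1", "f2", "f3"]))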

5.3.3. Dealing with bias When a system is trained on a data set comprising records of the way in which many experts have dealt in the past with many cases, there are two clear potential sources of bias. First, the set of historical cases may not be representative of the cases on which the algorithm will be used in the future: for example, Griffiths (2016) reports that an automatic face recognition system trained on Caucasian faces rejected passport application photos from Asian individuals (apparently because it treated them as if their eyes were closed). Second, the experts whose judgments make up the training set should themselves be unbiased. Gavaghan et al. (2019) suggest that filtering using certain criteria can help to ensure that this is so. One possible criterion is ‘classification parity’: for example, a risk assessment algorithm designed to predict loan defaults should produce similar false negative rates for different demographic groups (Corbett-Davies & Goel, 2018).
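For illustration only, the following Python sketch (with invented outcomes and predictions) computes the false negative rate separately for two demographic groups; under the ‘classification parity’ criterion just mentioned, these rates should be similar.

import numpy as np

def false_negative_rate(y_true, y_pred):
    """Proportion of genuinely positive cases that the algorithm missed."""
    positives = y_true == 1
    return float(np.mean(y_pred[positives] == 0)) if positives.any() else float("nan")

y_true = np.array([1, 1, 0, 1, 0, 1, 1, 0, 1, 0])   # actual outcomes (e.g., loan default)
y_pred = np.array([1, 0, 0, 1, 0, 1, 0, 0, 1, 1])   # the algorithm's predictions
group = np.array(["A", "A", "A", "A", "A", "B", "B", "B", "B", "B"])

# Classification parity on this criterion requires these rates to be similar.
for g in np.unique(group):
    mask = group == g
    print(g, round(false_negative_rate(y_true[mask], y_pred[mask]), 2))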

5.3.4. Informational privacy Edwards & Veale (2017) point out that “Machine learning and big data analytics in general are fundamentally based around the idea of repurposing data, which is in principle contrary to the data protection principle that data should be collected for named and specific purposes”. Although data are often anonymised, there are concerns about re-identification (Ohm, 2010). There are still a number of unsettled issues. For example, systems can be used to make inferences from data that have been provided with consent; do people have any rights of access to data that has been inferred about them? Or should consent be obtained from someone before the system uses their data to make such inferences?

5.3.5. Legal issues Advanced algorithms can learn to do things by themselves (including making mistakes). Thus they may leave people open to dangers for which no person can be identified as responsible. One can argue that, if manufacturers are responsible for products they make, then software developers should be responsible for algorithms. But the principles of product liability do not apply to services and it is, apparently, not yet clear whether software is a product or a service. Gavaghan et al. (2019) consider whether recognising algorithms as juridical persons, somewhat akin to the earlier decision to restrict the liability of investors by recognising a company of people as a type of person, would be an acceptable alternative to recognizing that nobody is responsible for algorithmic errors (because “the machine did it”).

5.4. Regulating algorithms

Gavaghan et al. (2019) outline three general methods by which algorithms in the public sector could be regulated: through the establishment of a regulatory agency; through “hard law” (i.e., statutes passed by Parliament and decisions made in courts); and through self-regulation.

5.4.1. A regulatory agency Gavaghan et al. (2019) favour a regulatory agency. An ‘algorithms regulator’ would work with existing agencies to produce guidelines for the use of algorithms in the public sector, would monitor application of those algorithms, and, possibly, would maintain a register of their use.

5.4.1.1. Current examples of such regulators in other jurisdictions are few, but are increasing annually. Some have been set up specifically to deal with the ethics of algorithm (or, more specifically, artificial intelligence) use. They include Singapore’s Advisory Council on the Ethical Use of AI and Data. Where specific regulators have not been established, general AI ‘strategies’ have been published. In the UK there already exists an Office for Artificial Intelligence (OAI), but this does not have a general focus on algorithms; rather, it exists principally to promote the adoption of AI in the public sector.

5.4.1.2. One could envisage a future algorithm regulatory agency taking on the functions of the OAI and the recently established Centre for Data Ethics and Innovation (CDEI), along with other duties.

5.4.2. Rights through statute Regulation through statute law would still serve a purpose alongside a regulatory agency. Statute law allows for the establishment of individual rights with regard to algorithm use. Such provisions may include rights not dissimilar to those offered by the European Union’s General Data Protection Regulation (GDPR) and now incorporated into UK law. Most notable among the rights included in the GDPR is the one specified in Article 22: “The data subject shall have the right not to be subject to a decision based solely on automated processing”.

5.4.3. A combination of techniques Ultimately it seems likely that a combination of regulatory devices would function best to ensure that the use of algorithms in the law, and the public sector in general, is accurate, objective, efficient, transparent, and fair.

6. Conclusion

We have submitted evidence on previous work concerned with the use of advanced algorithmic tools in legal contexts. We have omitted mention of algorithm aversion, people’s reluctance to use algorithms rather than their own judgment (e.g., Dietvorst et al., 2015). Though relevant to the use of algorithms in legal contexts (e.g., sentencing guidelines), that work has been concerned with relatively simple algorithms rather than the advanced algorithmic tools that have been our focus here.

 

4 September 2021

 

7. References

Bainbridge, L. (1983). Ironies of automation. Automatica, 19, 775-779.

Cavadino, P. & Gibson, P. (1993). Bail: The Law, Best Practice and the Debate. Winchester: Waterside Press.

Dhami, M. K. & Ayton, P. (2001). Bailing and jailing the fast and frugal way. Journal of Behavioral Decision Making, 14, 141-168.

Dietvorst, B. J., Simmons, J. P. & Massey, C. (2015). Algorithm aversion: People erroneously avoid algorithms after seeing them err. Journal of Experimental Psychology: General, 144, 114-126.

Edwards, L. & Veale, M. (2017). Slave to the algorithm? Why a “right to an explanation” is probably not the remedy you are looking for. Duke Law and Technology Review, 16, 18-84.

Griffiths, J. (2016). New Zealand passport robot thinks this Asian man’s eyes are closed. CNN.com December 9, 2016.

Larson, J., Mattu, S., Kirchner, L. & Angwin, J. (2016). How we analyzed the COMPAS recidivism algorithm. ProPublica.org, May 23, 2016.

Ohm, P. (2010). Broken promises of privacy: Responding to the surprising failure of anonymization. UCLA Law Review, 57, 1701-1777.

Pazouki, K., Forbes, N., Norman, R. A. & Woodward, M. D. (2018). Investigation on the impact of human-automation interaction in maritime operations. Ocean Engineering, 153, 297-304.