Written Evidence Submitted by Loughborough University

(GAI0070)

 

SUMMARY

This document draws on several research projects at Loughborough University, focusing on the application of Artificial Intelligence (AI) to cultural heritage. We make recommendations for incorporating collaborative action and user-based assessments in the regulation of AI. In order to improve transparency and accountability, the development of explainable AI (XAI) needs to be encouraged. Ensuring transparency and reducing bias need to be central to AI governance. But without access to data, it is impossible to train AI systems. Building on the National Data Strategy, we therefore recommend making data more accessible and usable by a wide range of users – including researchers in the Humanities and Social Sciences. 

 

Problems Addressed:

  1. Transparency in the application of AI.
  2. Lack of accessibility regarding public records.
  3. Inherent biases within AI development and their real-world implications.

Recommendations:

  1. Instituting standards for mitigating AI bias and improving transparency.
  2. Fostering collaborative and international approaches to the regulation of AI across all sectors and involving users.
  3. Encouraging the development of sensitivity review for unlocking data.
  4. Implementing comprehensive and internationally recognised certificates for developers and implementers of AI tools.
  5. Improving copyright legislation to unlock data for AI development.
  6. Making a distinction between AI and explainable AI (XAI) in legislation.

WHO WE ARE

We are writing as a team of academics from Loughborough University with expertise in the multi-disciplinary applications of AI. This report has been written by:

Dr Lise Jaillant, Senior Lecturer in Digital Humanities and current Principal Investigator on several AHRC-funded Digital Humanities projects: AEOLIAN (Artificial Intelligence for Cultural Organisations); EyCon (Early Conflict Photography and AI); and LUSTRE (Unlocking our Digital Past with Artificial Intelligence). Dr Jaillant has written and edited articles and collections on AI in the cultural heritage sector, including Archives, Access, and Artificial Intelligence (2022), and special issues on born-digital archives with AI & Society (2022) and Archival Science (2022).

Dr Katherine Aske, a Postdoctoral Research Associate for the Digital Humanities projects AEOLIAN and EyCon, addressing the uses of AI across the cultural heritage sector.

With contributions from:

Dr Karen Blay, Senior Lecturer in Digital Construction and Quantity Surveying and currently a co-investigator on a RAAC project employing Artificial Intelligence (computer vision, image recognition) to mitigate socio-technical issues during a RAAC condition survey funded by the NHS. Blay recently led a project on Information Resilience, funded by the Centre for Digital Built Britain (CDBB), Cambridge, which identified and modelled the socio-technical requirements and capabilities, and informed standards for optimising data and information from AI, Internet of Things (IoT), big data and collaborative processes in a digital environment. 

Dr Maribel Hidalgo-Urbaneja, a Postdoctoral Research Associate for the Digital Humanities project LUSTRE, working with the Cabinet Office to address the potential uses of AI in government records.

Dr Martin Maguire, Lecturer in the School of Design and Creative Arts. His background is in computing and ergonomics, and his main teaching and research interest is in human-computer interaction. He has worked on several EU projects to develop Human Factors tools, methods and guidelines for user-centred design. He is interested in the design and ethics of AI systems, and in making such systems understandable and acceptable to end-users.

Prof. Sergey Saveliev, Professor of Theoretical Physics and Principal Investigator of a major EPSRC project, “Neuromorphic memristive circuits to simulate inhibitory and excitatory dynamics of neuron networks: from physiological similarities to deep learning”.

 

How effective is current governance of AI in the UK?

  1. Private companies such as Amazon, Google and Facebook are furthering AI-driven technological advances at an astonishing pace. In contrast, the use of AI in the public sector lags behind, and the governance of AI (i.e., the regulation of AI uses and applications) is not keeping pace with technological advances.
  2. An example where the need for good governance is apparent is the Facebook–Cambridge Analytica data scandal. In its investigation, the ICO found that Facebook had breached data protection laws by failing to keep users’ personal information secure, allowing Cambridge Analytica to harvest the data of up to 87 million people worldwide without their consent (Criddle, 2020). With this scandal setting a precedent for the unethical use of personal data, there have been calls for Britain to abolish data monopolies and improve the regulation of AI (House of Lords Artificial Intelligence Select Committee, 2018). The risks presented by AI technologies need to be balanced against the many advantages that can be gained from developing AI applications, including improving public services; making information more accessible (Jaillant, 2022a/b; Jaillant & Caputo, 2022; Berryhill et al., 2019); and even strengthening management and policy-making (Noordt & Misuraca, 2022; Höchtl et al., 2016). But without effective governance, institutions, particularly those in the public sector, will be unable to make informed decisions on AI usage.
  3. In our research into the uses of AI in the cultural heritage sector, we have found that many organisations that would benefit from AI technologies to assist with information management are restricted by inappropriate legislation. Archival institutions in particular are struggling to preserve and make accessible the records of our recent history, which mostly exist in born-digital format, such as emails, PDFs, reports and other digital files, including government records that contain important public information. The sheer volume of these records makes it extremely complicated to search, categorise or review them manually, particularly when they are in different formats and spread across multiple devices and systems. Without government legislation on the use of AI to automate record processing, there are only data protection laws and in-house policies to limit the risks of accidentally making confidential or sensitive materials accessible to the public. As a result, many of these documents remain unprocessed, as fears of breaching data protection laws are compounded by challenges in staffing, workloads and funding. Existing AI tools have the potential to make such data more accessible, for example by automatically identifying sensitive information and making non-sensitive data available to users (a minimal sketch of this triage step follows this list). This process, known as sensitivity review, could potentially unlock a huge number of records for research purposes – by historians, social scientists, journalists and third-sector professionals. However, without effective policies to facilitate the use of automated sensitivity review, many of these born-digital records will remain inaccessible.
  4. The inaccessibility of public information, and the negative societal impact caused by withholding important data, are major concerns outlined in the UK’s National Data Strategy (2020). To address these concerns, the effective governance of AI will require policymakers, and those implementing policy, to engage with AI algorithms (i.e., the methods by which AI is trained) by defining, structuring, and recommending standards for the processing of data, rather than treating AI as an impenetrable “black box”.
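To make the sensitivity review described in point 3 above more concrete, the sketch below shows a deliberately minimal triage step: records matching simple indicators of sensitive content are withheld for human review, while the rest become candidates for release. The patterns and document identifiers are invented for illustration; a production system would combine trained classifiers with rules defined by records specialists rather than a handful of regular expressions.

```python
# A minimal, illustrative sketch of automated sensitivity review (an assumed
# workflow, not any project's actual pipeline): records matching simple
# indicators of sensitive content are withheld for human review; the rest
# become candidates for public release.
import re

# Hypothetical indicators of sensitivity, invented for this example.
SENSITIVE_PATTERNS = [
    re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),               # email addresses
    re.compile(r"\b[A-Z]{2}\d{6}[A-Z]\b"),                # NI-number-like strings
    re.compile(r"\b(confidential|in confidence)\b", re.IGNORECASE),
]

def triage(documents):
    """Split records into 'release' and 'review' piles."""
    release, review = [], []
    for doc_id, text in documents.items():
        if any(pattern.search(text) for pattern in SENSITIVE_PATTERNS):
            review.append(doc_id)   # withhold pending human sensitivity review
        else:
            release.append(doc_id)  # candidate for public access
    return release, review

docs = {
    "memo-001": "Minutes of the estates planning meeting, 3 March 1998.",
    "memo-002": "Please contact j.smith@example.gov.uk - shared in confidence.",
}
print(triage(docs))  # (['memo-001'], ['memo-002'])
```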

 

What measures could make the use of AI more transparent and explainable to the public?

  1. The European Commission (EC) has attempted to tackle the issue of transparency by balancing technological advances with risk mitigation measures (2021; see also Sioli, 2021). According to the EC report, the definition of AI should be as broad as possible to cover future developments. It also suggests that policies should include transparency obligations for certain AI systems. For example, humans should be notified when interacting with such systems (Art. 52), and voluntary codes of conduct should be implemented for low-risk situations (Art. 69). Similar categorised approaches could help to address issues of transparency and to implement the aims outlined in the UK government’s Roadmap to an Effective AI Assurance Ecosystem (2021). According to this roadmap, “AI assurance will be crucial to building an agile, world-leading approach to AI governance in the UK”.
  2. Existing policies, such as data protection laws, are not enough to cover ever-evolving AI technologies or to mitigate the risks they pose to the public, particularly when AI makes invisible decisions that affect users. Future measures need to give more consideration to public users, ensuring that they understand the implications of AI algorithms and have ways to challenge AI-driven decisions that could unduly affect their lives. An increasing body of research demonstrates that users tend to trust algorithms only when their rationale has been made clear. They want to know what the algorithm is doing and why, and to be assured that the results are as unaffected as possible by biases (Shin & Park, 2019). If users perceive the results of algorithmic methods to be unfair or unethical, they are unlikely to support the use of such systems, even if they are technically more accurate than human processing of the same data (Kieslich et al., 2022). This tendency was demonstrated by the controversy over UK A-Level results in the summer of 2020. Such was the public discontent about the use of algorithmic methods to allocate grades that, under massive political pressure, the government reverted to teacher-allocated grades (Kolkman, 2020). Regulations therefore need to demonstrate extensive risk assessments, identifying what AI is capable of, where the risks and biases lie, and how the decisions made by AI are subsequently employed. We suggest that a distinction within government legislation between the use of AI and explainable AI (XAI, i.e. AI tools that show their decision-making) could be key to ensuring that the public are more aware of what AI technologies are being used, and why and how (Rai, 2020; Linardatos et al., 2021; Markus et al., 2021; Meske et al., 2022). The sketch below gives a toy illustration of this distinction.
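The sketch contrasts the same simple linear scorer in “black box” form, which returns only a decision, with an explainable form that also reports how much each input contributed to that decision. The feature names, weights and threshold are invented purely for illustration and do not describe any deployed system.

```python
# A toy contrast between an opaque ('black box') decision and an explainable
# (XAI) one. All features, weights and the threshold are invented.
FEATURES = ["contains_personal_data", "age_of_record_years", "foi_exemption_cited"]
WEIGHTS = {
    "contains_personal_data": 2.0,
    "age_of_record_years": -0.05,
    "foi_exemption_cited": 1.5,
}
THRESHOLD = 1.0

def classify(record):
    """Black-box style: the decision is returned with no rationale."""
    score = sum(WEIGHTS[f] * record[f] for f in FEATURES)
    return "withhold" if score > THRESHOLD else "release"

def classify_explained(record):
    """XAI style: the decision is returned with each input's contribution."""
    contributions = {f: WEIGHTS[f] * record[f] for f in FEATURES}
    decision = "withhold" if sum(contributions.values()) > THRESHOLD else "release"
    return decision, contributions

record = {"contains_personal_data": 1, "age_of_record_years": 12, "foi_exemption_cited": 0}
print(classify(record))            # 'withhold', with no explanation offered
print(classify_explained(record))  # 'withhold', plus per-feature contributions
```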

 

How should decisions involving AI be reviewed and scrutinised in both public and private sectors?

  1. Improving the UK’s approach to the governance of AI can only be achieved through collaboration across sectors, including public and private sector professionals, as well as users of AI-driven technologies (Lauterbach, 2019). We suggest that users should be at the forefront of decisions involving AI, especially where personal data is concerned. Under current data protection law, individuals have a number of rights relating to their personal data. Within AI, these rights apply wherever personal data is used at any point in the development and deployment lifecycle of an AI system. However, if users are directly impacted by the results of AI decisions that they do not understand, then current UK regulations will not be effective at anticipating and addressing the risks and challenges brought about by AI technologies. Both public and private sectors will need to be able to evaluate these potential issues in their implementation of AI technologies.
  2. We recommend that decisions regarding the review and scrutiny of AI in both the public and private sectors be underpinned by comprehensive, internationally recognised certificates or licences for developers and implementers of AI tools, clearly explaining what each tool is capable of, how it has been trained, and how its outputs are to be used.
  3. It is important to allow institutions and sectors to take charge of their implementation and review of AI technologies. A UK legal framework of this nature would simultaneously allow more freedom for development and innovation, particularly in the public sector, while increasing transparency and addressing issues of bias as well as risk management. It would also allow for increased regulation of privacy issues and data laundering (i.e., the illegal fabrication of stolen data so that it may be used for lawful purposes), which span international boundaries.

 

How should the use of AI be regulated, and which body or bodies should provide regulatory oversight?

  1. The ongoing LUSTRE project, led by Lise Jaillant, which investigates the role of AI in the cultural heritage sector and government archives, has confirmed that information management professionals, computer scientists, scholars and other experts see a need to make AI more transparent and trustworthy through government regulation. The project consulted experts who proposed solutions in terms of regulatory actions and legal frameworks, such as licensing. However, alongside the challenges in developing AI technologies, it was clear that there are also ethical challenges that are not being addressed by current government legislation, such as data bias and data protection.
  2. One solution is to improve the licensing of AI software. Although licences for free or open-source software have made such tools more trustworthy, licences must also be able to protect citizens against data laundering or privacy breaches. Responsible AI Licences, or RAILs, are being developed by a number of actors in Europe (Muñoz Ferrandis, 2022). RAILs are in line with the AI Act, the proposed European law on AI. As the findings of the LUSTRE project demonstrate, the AI Act has been identified as a piece of legislation that, once implemented, will change the AI regulation landscape. The UK government and the current Digital Regulation Cooperation Forum (DRCF) can follow similar initiatives to help safeguard the future of AI innovation by aligning policies with other, international regulatory bodies.

 

To what extent is the legal framework for the use of AI, especially in making decisions, fit for purpose? Is more legislation or better guidance required?

  1. There is currently no overall legal framework for the use of AI. While some aspects are covered, there are many risks in the uses of AI that existing regulations do not address. More legislation and guidance, as well as thorough risk assessments of what is expected from AI, are required.
  2. Current UK policy plans suggest that there must be a more granular approach to AI regulation than the EU’s AI Act currently proposes. Approaching AI through its uses, rather than defining the remits of the technologies themselves, is a way forward, but could equally cause a lack of consistency. Moreover, definitions of AI uses may offer policy documents flexibility for innovation, but they do not necessarily allow for the regulation of how technologies are trained and what is done with the results of AI algorithms, which are the areas where the most serious risks are present.
  3. For example, our research through the AEOLIAN and EyCon projects has found that the availability of datasets (i.e., curated collections of information used to train AI algorithms, such as images of faces or statistical data) drastically shapes the outcomes of AI algorithms, often to the detriment of marginalised groups. This issue has implications in the cultural sector (for example, when AI is trained using problematic historical metadata containing racist language). Such results, as Jaillant and others have found, are often caused by a lack of diversity in the datasets used to train AI algorithms (Manjavacas & Fonteyn, 2022). This issue needs to be addressed by AI governance. Particularly for technologies trained on images, such as in the healthcare and security sectors, diverse datasets are often lacking, leading to an overreliance on ‘available’ data, which often perpetuates social inequalities regarding race and gender and therefore affects the reliability of such tools in real-life practice. To avoid this, datasets used to train technologies that will affect public users should be carefully prepared and regulated by an appropriate legal framework, so that results and applications can be properly evaluated as fit for purpose (a minimal sketch of the kind of subgroup evaluation we have in mind follows this list).
  4. One solution is to address the inappropriate copyright legislation for AI, which is hindering access to data and therefore the training of AI systems. Many applications of machine learning depend on copyrighted data (Sobel, 2021). The Intellectual Property Office’s press release ‘Artificial Intelligence and IP: Copyright and Patents’ (2022) proposed a new copyright and database exception, meaning that “anyone with lawful access to material protected by copyright” should be able to use data analysis technologies (e.g., text and data mining) “without further permissions”. Although steps must be taken to ensure that existing copyrights are not infringed, our research has shown that improving access to digital data to train AI will encourage technological innovation, while also increasing transparency and reducing biases caused by inadequate access to data.
  5. Regarding better guidance, we recommend that consideration be given to public users, as users currently have no formal power or role in AI governance. As our research has shown, the end-user is often considered in the development of AI technologies but can be quickly forgotten upon their implementation. More must be done to assess how a diverse range of users could be affected by AI technologies, and to pre-empt where they might face unseen and unacknowledged challenges. However, it should also be acknowledged that AI systems are inherently complex, and so regulation is also likely to be complex. The people assessing a system to check legal compliance will therefore need appropriate knowledge and training, and would benefit from shared workflows and documented case studies.
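The subgroup evaluation referred to in point 3 above can be sketched very simply: per-group accuracy is reported alongside the overall figure, so that performance gaps affecting under-represented groups become visible rather than being averaged away. The labels, groups and results below are invented for illustration only.

```python
# A minimal sketch of disaggregated (per-subgroup) evaluation: an overall
# accuracy figure can look acceptable while masking much poorer performance
# for an under-represented group. All data here is invented.
from collections import defaultdict

# (predicted_label, true_label, subgroup) triples, e.g. from an image-tagging tool
results = [
    ("person", "person", "group_a"), ("person", "person", "group_a"),
    ("person", "person", "group_a"), ("person", "person", "group_a"),
    ("person", "person", "group_b"), ("object", "person", "group_b"),
    ("object", "person", "group_b"), ("object", "person", "group_b"),
]

totals, correct = defaultdict(int), defaultdict(int)
for predicted, actual, group in results:
    totals[group] += 1
    correct[group] += int(predicted == actual)

overall = sum(correct.values()) / sum(totals.values())
print(f"overall accuracy: {overall:.0%}")  # looks acceptable in aggregate
for group in sorted(totals):
    # group_a: 100%, group_b: 25% -- the gap only appears when disaggregated
    print(f"{group}: {correct[group] / totals[group]:.0%}")
```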

 

What lessons, if any, can the UK learn from other countries on AI governance?

  1. The EU’s AI Act aims to regulate AI in Europe. However, recent criticisms have highlighted the lack of consumer and user rights outlined in the act, and the ineffective terminology surrounding accountability, considering that most AI technologies are not produced by a single organisation (Edwards, 2022). The US’s National Artificial Intelligence (AI) Initiative outlines many of the same core principles as the UK’s and EU’s, suggesting regulations for innovation, trustworthiness and improvements to existing infrastructure.
  2. Issues of trustworthiness, transparency and innovation through AI are of global concern. In recent interviews for the AEOLIAN project, digital humanists, computer scientists and archivists have all stressed the necessity for cross-sectoral collaboration in the development and implementation of AI technologies and regulations within the cultural heritage and archival sectors. National policies for the governance of AI should, ideally, be compatible with other global policies, as digital technologies, by their very nature, cross international borders. Open-source AI tools, for example, can be developed and employed anywhere in the world, and so regulations regarding accountability and ownership become impossible to enforce without international co-operation. We recommend that regulatory policies need to focus on the intended uses of AI technologies, international licensing, and the impact of these technologies on the user.

 

Works Cited

Berryhill, Jamie, Kévin Kok Heang, Rob Clogher, and Keegan McBride, ‘Hello, World: Artificial Intelligence and its use in the Public Sector’, OECD Working Papers on Public Governance, 36 (2019). Available online: https://www.oecd.org/governance/innovative-government/working-paper-hello-world-artificial-intelligence-and-its-use-in-the-public-sector.htm [accessed 22 November 2022].

Burke, Mary, Oksana L. Zavalina, Shobhana L. Chelliah, and Mark E. Phillips, ‘User Needs in Language Archives: Findings from Interviews with Language Archive Managers, Depositors, and End-Users’, Language Documentation and Conservation, 16 (2022): 1–24.

Butcher, James, and Irakli Beridze, ‘What is the State of Artificial Intelligence Governance Globally?’, The RUSI Journal, 164 (2019): 88–96. DOI: 10.1080/03071847.2019.1694260.

Criddle, Christina, ‘Facebook sued over Cambridge Analytica data scandal’, BBC News, 28 October 2020. Available online: https://www.bbc.co.uk/news/technology-54722362 [accessed 24 November 2022].

Dafoe, Allan, AI Governance: A Research Agenda, Centre for the Governance of AI Program, Future of Humanity Institute (Oxford: University of Oxford, 2018).

Daly, Angela, et al., ‘Artificial Intelligence Governance and Ethics: Global Perspectives’, Social Science Research Network, University of Hong Kong Faculty of Law Research Paper No. 2019/033 (2019).

Edwards, Lilian, ‘The EU AI Act: A Summary of its Significance and Scope’, Ada Lovelace Institute, 11 April 2022. Available online: https://www.adalovelaceinstitute.org/wp-content/uploads/2022/04/Expert-explainer-The-EU-AI-Act-11-April-2022.pdf [accessed 22 November 2022].

European Commission, Fostering a European approach to Artificial Intelligence, COM(2021) 205. Available online: https://ec.europa.eu/transparency/documents-register/detail?ref=COM(2021)205&lang=en [accessed 22 November 2022].

House of Lords Select Committee on Artificial Intelligence, ‘AI in the UK: Ready, Willing and Able?’, Report of Session 2017–19 (2018). Available online: https://publications.parliament.uk/pa/ld201719/ldselect/ldai/100/100.pdf [accessed 22 November 2022].

Höchtl, Johann, Peter Parycek, and Ralph Schöllhammer, ‘Big data in the Policy Cycle: Policy Decision Making in the Digital Era’, Journal of Organizational Computing and Electronic Commerce, 26, 1–2 (2016): 147–69. DOI: 10.1080/10919392.2015.1125187.

Intellectual Property Office, ‘Artificial Intelligence and IP: Copyright and Patents’, Press Release, 28 June 2022. Available online: https://www.gov.uk/government/news/artificial-intelligence-and-ip-copyright-and-patents [accessed 22 November 2022].

Jaillant, Lise, ed. Archives, Access and Artificial Intelligence (Bielefeld: Transcript Verlag, 2022a).

Jaillant, Lise, ‘How Can We Make Born-Digital and Digitised Archives More Accessible? Identifying Obstacles and Solutions’, Archival Science, 22 (2022b): 417–36. DOI: 10.1007/s10502-022-09390-7.

Jaillant, Lise, and Annalina Caputo, ‘Unlocking Digital Archives: Cross-Disciplinary Perspectives on AI and Born-Digital Data’, AI & Society, 37 (2022): 823–35. DOI: 10.1007/s00146-021-01367-x.

Kieslich, Kimon, Birte Keller, and Christopher Starke, ‘Artificial Intelligence Ethics by Design. Evaluating Public Perception on the Importance of Ethical Design Principles of Artificial Intelligence’, Big Data & Society, 9, 1 (2022). DOI: 10.1177/20539517221092956.

Kolkman, Daan, ‘“F**k the Algorithm”?: What the World Can Learn from the UK’s A-Level Grading Fiasco’, Impact of Social Sciences Blog, 26 August 2020. Available online: https://blogs.lse.ac.uk/impactofsocialsciences/2020/08/26/fk-the-algorithm-what-the-world-can-learn-from-the-uks-a-level-grading-fiasco/ [accessed 20 November 2022].

Lauterbach, Anastassia, ‘Artificial Intelligence and Policy: quo vadis?’, Digital Policy, Regulation and Governance, 21, 3 (2019): 238–63. DOI: 10.1108/DPRG-09-2018-0054.

Linardatos, Pantelis, Vasilis Papastefanopoulos, and Sotiris Kotsiantis, ‘Explainable AI: A Review of Machine Learning Interpretability Methods’, Entropy (Basel), 23, 1 (2021): 18. DOI: 10.3390/e23010018.

Manjavacas, Enrique, and Lauren Fonteyn, ‘Adapting vs. Pre-Training Language Models for Historical Languages’, Journal of Data Mining & Digital Humanities (2022): 1–19. DOI: 10.46298/jdmdh.9152.

Markus, Aniek F., Jan A. Kors, and Peter R. Rijnbeek, ‘The Role of Explainability in Creating Trustworthy Artificial Intelligence for Health Care: A Comprehensive Survey of the Terminology, Design Choices, and Evaluation Strategies’, Journal of Biomedical Informatics, 113 (2021): 1–11. DOI: 10.1016/j.jbi.2020.103655.

Meske, Christian, Enrico Bunde, Johannes Schneider, and Martin Gersch, ‘Explainable Artificial Intelligence: Objectives, Stakeholders, and Future Research Opportunities’, Information Systems Management, 39, 1 (2022): 53–63. DOI: 10.1080/10580530.2020.1849465.

Muñoz Ferrandis, Carlos, ‘Responsible AI licenses: a practical tool for implementing the OECD Principles for Trustworthy AI’, OECD (2022). Available online: https://oecd.ai/en/wonk/rails-licenses-trustworthy-ai [accessed 22 November 2022].

Noordt, Colin van, and Gianluca Misuraca, ‘Artificial intelligence for the public sector: results of landscaping the use of AI in government across the European Union’, Government Information Quarterly, 39, 3 (2022): 101714. DOI: 10.1016/j.giq.2022.101714

Rai, Arun, ‘Explainable AI: from Black Box to Glass Box’, Journal of the Academy of Marketing Science, 48, 1 (2020): 137–41. DOI: 10.1007/s11747-019-00710-5.

Shin, Donghee, and Yong Jin Park, ‘Role of Fairness, Accountability, and Transparency in Algorithmic Affordance’, Computers in Human Behavior, 98 (2019): 277–84. DOI: 10.1016/j.chb.2019.04.019.

Sioli, Lucilla, ‘A European Approach to Artificial Intelligence’, online presentation, 23 April 2021. Available online: https://www.ceps.eu/wp-content/uploads/2021/04/AI-Presentation-CEPS-Webinar-L.-Sioli-23.4.21.pdf [accessed 22 November 2022].

Sobel, Benjamin, ‘A Taxonomy of Training Data: Disentangling the Mismatched Rights, Remedies, and Rationales for Restricting Machine Learning’, Artificial Intelligence & Intellectual Property, ed. by Jyh-An Lee, Reto Hilty and Kung-Chung Liu (Oxford: Oxford University Press, 2021), pp. 221–42.

 

 

(November 2022)