AGENCY—written evidence (LLM0028)

 

House of Lords Communications and Digital Select Committee inquiry: Large language models

 

 

A.              INTRODUCTION

 

AGENCY is a multidisciplinary research team of academics with expertise in computer science (natural language processing, cybersecurity, artificial intelligence, human-computer interaction), law, business, economics, social sciences and media studies. Members of AGENCY are academics at prestigious institutions such as Newcastle University, Durham University, University of Birmingham, King's College London, Royal Holloway University of London, and University of Surrey. UK Research and Innovation supports our research through the Strategic Priority Fund as part of the Protecting Citizens Online programme. Grant title: AGENCY: Assuring Citizen Agency in a World with Complex Online Harms. Grant reference: EP/W032481/2.

 

This call for evidence on the future of large language models (LLMs) and their regulation coincides with the work, expertise, and concerns of the AGENCY project, which focuses on assuring citizen agency in a world with complex online harms. We use citizen agency to mean the ability of people and society to be empowered through technology and tools that give them a sense of control and security online. We therefore propose that people and society should be at the forefront of regulation, and that regulation should aim to anticipate, mitigate, and respond to complex online harms in a way that empowers people, balances that empowerment with societal concerns (such as public health, safety and security), and ensures respect for principles such as freedom of expression. Our team possesses specialised expertise in LLMs, law, emerging technologies, and non-regulatory solutions, and is therefore well positioned to make a useful contribution to this call for evidence.

 

This is a submission from AGENCY. Specifically, the following researchers contributed to the formulation of this response: Dr Shrikant Malviya, Rebecca Owens, Dr Jehana Copilah-Ali, Prof Karen Elliott, Prof Ben Farrand, Dr Cristina Neesham, Dr Lei Shi, Dr Vasilis Vlachokyriakos, Dr Stamos Katsigiannis and Prof Aad van Moorsel.

 

 

B.              CAPABILITIES AND TRENDS

 

  1. How will large language models develop over the next three years?

 

There is the potential for rapid and unexpected change in LLMs over the next three years, and we expect them to develop in the following ways:

 

  1. Increased proliferation of smaller, domain-specific models: The proliferation of smaller, domain-specific models is a notable trend in generative AI. These models are designed to cater to specific tasks, industries, or domains, and they contrast with large, general-purpose models like GPT-3. They offer several benefits, including tailored domain-specific expertise, faster training and deployment, reduced bias and ethical concerns, improved privacy and security, and customisation and personalisation.

 

  2. Convergence of LLMs and Multimodal Interactions: The capabilities of LLMs will be expanded by enabling them to interact with users through various modalities beyond just text. These modalities include images, videos, audio, augmented reality (AR), virtual reality (VR), and even robotics. This expansion will enhance the depth and richness of user interactions with LLMs, making them more versatile and capable of understanding and generating content across different formats.

 

  3. Advancing LLM Performance through Reinforcement Learning and Human Feedback Enhancements: A key strategy for improving the capabilities and effectiveness of LLMs is incorporating reinforcement learning (RL) and human feedback (HF) enhancements. This is critical for enhancing users' agency by addressing bias mitigation, contextual understanding, ethical responsiveness, and user-centricity issues (a minimal illustration of the human-feedback component is given after this list).

 

  4. Multilingual Capabilities: LLMs will become more proficient in handling multiple languages and understanding context and nuances across different languages.

 

  5. Self-improving LLMs: Drawing inspiration from the mechanisms of human learning, next-generation artificial intelligence systems may possess the ability to self-train, opening up new uses for LLMs.

 

  6. Fact-checking themselves: Current LLMs suffer from factual unreliability and static knowledge limitations. 'Hallucinations' are one of their critical issues: for example, models recommend books that do not exist or confidently forecast the weather for a fictional city. Capabilities such as providing valid citations and references for the answers they give could be crucial prerequisites for widespread real-world deployment (see the retrieval-grounding sketch after this list). Considerable improvement and innovation are needed in this area to overcome LLMs' unreliability and their stubborn tendency to provide inaccurate information confidently.
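
To make point 3 above more concrete, the following is a minimal, illustrative Python sketch of the human-feedback component of such enhancements: reviewers express pairwise preferences between candidate model responses, and a Bradley-Terry-style score is estimated for each response to serve as a training signal. The function names and the example feedback data are our own hypothetical illustrations; production reinforcement learning from human feedback pipelines additionally train a neural reward model and optimise the LLM against it.

```python
from collections import defaultdict

def bradley_terry_scores(comparisons, iterations=100):
    """Estimate a preference score for each candidate response from
    pairwise human judgements, using standard Bradley-Terry updates.

    comparisons: list of (winner, loser) pairs, where each element
    identifies a candidate model response.
    """
    wins = defaultdict(int)         # times each response was preferred
    pair_counts = defaultdict(int)  # comparisons per unordered pair
    items = set()
    for winner, loser in comparisons:
        wins[winner] += 1
        pair_counts[frozenset((winner, loser))] += 1
        items.update((winner, loser))

    scores = {item: 1.0 for item in items}
    for _ in range(iterations):
        new_scores = {}
        for i in items:
            denom = 0.0
            for j in items:
                if i == j:
                    continue
                n_ij = pair_counts.get(frozenset((i, j)), 0)
                if n_ij:
                    denom += n_ij / (scores[i] + scores[j])
            new_scores[i] = wins[i] / denom if denom else scores[i]
        total = sum(new_scores.values())
        scores = {k: v / total for k, v in new_scores.items()}  # normalise
    return scores

# Hypothetical human judgements: reviewers preferred response "A" over "B", etc.
feedback = [("A", "B"), ("A", "C"), ("B", "C"), ("A", "B"), ("C", "B")]
print(bradley_terry_scores(feedback))
```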

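Similarly, the citation requirement in point 6 can be illustrated with a retrieval-grounding sketch: rather than answering purely from its parametric memory, the system first retrieves passages from a vetted corpus and returns its answer together with references to the passages it drew on. The corpus, the word-overlap retriever, and the draft_answer() stub below are hypothetical placeholders, not a description of any deployed system.

```python
# Illustrative sketch: grounding answers in retrievable sources so that every
# response can carry verifiable citations. The corpus and draft_answer() are
# hypothetical stand-ins for a real document store and a real LLM call.

CORPUS = {
    "doc-1": "The AGENCY project studies citizen agency and complex online harms.",
    "doc-2": "Large language models can generate fluent but factually wrong text.",
    "doc-3": "Retrieval over vetted sources lets a system cite where answers come from.",
}

def retrieve(question, corpus, top_k=2):
    """Rank documents by simple word overlap with the question (a stand-in
    for a proper retriever such as BM25 or dense embeddings)."""
    question_words = set(question.lower().split())
    ranked = sorted(
        corpus.items(),
        key=lambda item: len(question_words & set(item[1].lower().split())),
        reverse=True,
    )
    return ranked[:top_k]

def draft_answer(question, passages):
    """Stub for an LLM call: here it simply stitches the passages together."""
    return " ".join(text for _, text in passages)

def answer_with_citations(question, corpus):
    passages = retrieve(question, corpus)
    return {
        "answer": draft_answer(question, passages),
        "citations": [doc_id for doc_id, _ in passages],
    }

print(answer_with_citations("Why do large language models need citations?", CORPUS))
```
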

Future trends relating to complex online harms,

 

 

 

  2. How should we think about risk in this context?

 

In our opinion, risk should be considered through five key issues:

 

  1. Access: We should limit who has access to powerful AI systems, and structure the proper protocols, duties, oversight, and incentives for those with access to act safely.

 

  2. Alignment: Ensuring that the AI system will act as intended, in agreement with socialised human ethical sensibility (i.e., values and norms).

 

  3. Raw intellectual power: Grade generative AI systems on raw intellectual/processing power, which depends on the level of sophistication of the algorithms and the scale of computing resources and datasets (source).

 

  4. Scope of actions: Identify the potential for harm in AI systems based on the scope of actions they enable, which can be indirect, for example through human actions (misinformation, data privacy or cybersecurity risks), or direct, through the system itself or other AI agents (stereotypes, unfair discrimination, exclusionary norms, toxic language, etc.), such as intentionally training LLMs to be biased against specific groups of people, or training them with specific misinformation (sources).

 

  5. Unlicensed Text Usage: One significant concern and potential legal problem is the unauthorised use of extensive text for training LLMs. Many websites, digitised books, and magazines prohibit such usage, or allow reuse only once the source is properly acknowledged or referenced. When an LLM generates output containing verbatim text from such sources, it typically does not meet these requirements, potentially leading to legal issues (sources).

 

 

C.              NON-REGULATORY AND REGULATORY OPTIONS

 

We are calling for a human-centred, responsible innovation approach towards developing and regulating LLMs.

 

 

From a regulatory standpoint,

 

 

 

 

Non-regulatory options

 

D.              DOMESTIC REGULATION

 

  1. How adequately do the AI White Paper and other Government policies deal with LLMs? Is a tailored regulatory approach needed?

 

We believe that the sector-specific regulation proposed in the UK Government White Paper 'A pro-innovation approach to AI regulation' is ill-equipped to protect users from the complex harms stemming from LLMs, for several reasons.

 

 

 

 

Therefore, to promote consumer trust in the technology, an integrated, cross-sectoral regulatory strategy is better suited to LLMs, as it would enable regulators to pool expertise and resources, promoting a more efficient approach to regulation.

 

 

  2. Do the UK’s regulators have sufficient expertise and resources to respond to large language models? If not, what should be done to address this?

 

Existing UK regulators do not have the expertise, resources and powers required to comprehend the complex technical and ethical aspects of LLMs effectively and to improve public trust in the use of AI models.

 

In particular, we would like to draw attention to the fact that:

 

 

 

 

 

 

E.              INTERNATIONAL CONTEXT

 

  1. How does the UK’s approach compare with that of other jurisdictions, notably the EU, the US and China?

 

In our opinion, the UK’s regulatory approach differs significantly from that of other jurisdictions, which opt for more stringent regulation. We call on the Government to learn from and adopt certain regulatory features of these jurisdictions to ensure the creation of a human-centred, responsible innovation approach to regulation.

 

EU

 

 

 

US

 

China

 

 

4 September 2023
