Oxford Internet Institute, University of Oxford—written evidence (LLM0074)

 

House of Lords Communications and Digital Select Committee inquiry: Large language models

 

 

Submitted by Andrew Bean, Hannah Rose Kirk, Jakob Mökander, Cailean Osborne, Huw Roberts, and Marta Ziosi

 

 

This written evidence has been prepared in response to the Communications and Digital Committee’s Call for Evidence on Large Language Models. The evidence has been compiled by researchers at the Oxford Internet Institute (listed below in alphabetical order). We are responding to the call due to the relevance of our research into both governance and technical aspects of large language models, as well as artificial intelligence (AI) more generally. The responses in this report do not necessarily represent an over-arching position of the OII. We thank the committee for inviting submissions on this topic and are available for any further enquiries.

 

The Oxford Internet Institute (OII) is a research and teaching department at the University of Oxford, dedicated to studying the societal implications of digital technologies. Its research covers a wide range of topics, including internet governance, digital privacy, online behaviour, data inequalities, social networks, AI ethics and sociotechnical systems. The OII brings together scholars from diverse fields such as sociology, computer science, economics, law, politics, and more to foster interdisciplinary research.

 

Andrew Bean is a DPhil researcher in Social Data Science at the OII and Clarendon Scholar. Andrew’s research focuses on the challenges of forming effective human-AI teams, and especially collaborative uses of large language models. His work includes both technical and philosophical approaches to AI development. Andrew previously worked for Bridgewater Associates, and holds degrees from the University of Oxford and Yale University.

 

Hannah Rose Kirk is a DPhil researcher in Social Data Science at the OII and a part-time Data Scientist at The Alan Turing Institute. Hannah's research centres on the role of human feedback in value alignment for large language models. She publishes technical research in computer science and computational linguistics, as well as governance and ethics research. Hannah holds degrees from the University of Oxford, the University of Cambridge and Peking University.

 

Jakob Mökander is a Specialist in Risk & Resilience at McKinsey, London. Jakob holds a DPhil from the OII, where his research focused on digital governance and AI auditing. Jakob has also been a visiting scholar at Princeton University’s Center for Information Technology Policy and a research fellow at the Center for the Governance of AI. Prior to Oxford, Jakob completed an MSc in engineering at Linköping University and worked for four years at the Swedish Ministry for Foreign Affairs.

 

Cailean Osborne is a DPhil researcher in Social Data Science at the OII and a researcher at the Linux Foundation. His research concerns the political economy of open source AI. He is currently a visiting researcher at Peking University’s Open Source Software Data Analytics Lab. Prior to his DPhil, Cailean was the International Policy Lead at the UK Government’s Centre for Data Ethics and Innovation and a UK Delegate at the Council of Europe’s Ad Hoc Committee on AI and the Global Partnership on AI.

 

Huw Roberts is a DPhil researcher at the OII. His research focuses on international AI governance and comparative AI policy. Huw previously worked for the UK Government’s Centre for Data Ethics and Innovation, where he was involved in developing key AI policy documents. He has also worked on AI policy as a Research Fellow at the University of Oxford's Saïd Business School and the Tony Blair Institute. Huw holds graduate degrees from the University of Oxford and Peking University.

 

Marta Ziosi is a DPhil researcher at the OII. She researches bias and fairness in AI, as well as AI policy and governance initiatives. Marta is also the co-founder and head of AI for People, a non-profit focused on the responsible use of AI. Marta previously worked for non-profits and institutions such as The Future Society, the Berkman Klein Centre for Internet & Society at Harvard and DG CNECT at the European Commission. Marta holds degrees from the University of Maastricht, the London School of Economics and the University of Oxford.

 

 

 


Executive Summary

        LLM development pathways: Over the next three years, Large Language Models (LLMs) will see continued steady advancement in diverse research areas such as knowledge grounding, autonomous task completion and multimodal integration. We expect increased attention towards challenges such as alignment, safety, cost reduction, and human-AI cooperation. The development trajectory of LLMs, and the exact instantiation of risks and benefits associated with this trajectory, heavily relies on normative value decisions about desirable LLM behaviours for given contexts and users. The lack of standardised values or objectively desirable behaviours intensifies the need for balanced governance between top-down regulation, democratic representation, and personalised user control, to mitigate potential risks and ensure models align with societal and/or individual users’ values.

 

        Reducing uncertainty in future trajectories: Research direction plays a crucial role in shaping technological advancements, but government interventions via targeted funding schemes can stimulate specific research areas, as seen with the DARPA XAI initiative. Continuous benchmarking is essential to monitor LLM progress, but narrow and static benchmarks should be avoided in favour of more holistic, dynamic, and adversarial benchmarks across a wider range of downstream use cases and potential harms. It is difficult to avoid unintended consequences when establishing safety interventions, as nuance is inevitably lost in converting conceptual views about safety into specific metrics and practices. Therefore, care must be taken to avoid bias and to ensure that diverse human feedback is solicited for LLM development and evaluation, pre- and post-deployment, at scale.

 

        Greatest opportunities and risks: LLMs have great potential for changing the way that UK citizens access, create and share content. LLMs assimilate vast knowledge bases and make them more accessible through natural dialogue. LLM-human collaboration also presents opportunities for enhanced decision-making and productivity gains. Conversely, the wider adoption of LLMs comes with significant risks, including an erosion of trust when AI-generated outputs become indistinguishable from human-generated ones, and an erosion of shared understanding from increasingly sophisticated and personalised language filter bubbles. Without careful safeguards on “who decides how LLMs should behave”, there is a threat of cultural hegemony and value misalignment, especially if the development of these models is influenced by a non-representative and non-democratically accountable subset of people. As with all technologies, LLMs have the potential to be dual use, and their misuse by malicious entities poses security and safety concerns. To navigate this, a comprehensive understanding of risks, from a model’s inception to its deployment and end use, is crucial. For regulatory purposes, it will be necessary to develop a risk framework that takes into account which types of harm can materialise from LLM risks, which actors are at risk of harm, and which actors are partly or fully responsible for the risk.

 

        Adequacy of the White Paper: The UK takes a “vertical” approach to governing LLMs, delegating responsibility to existing regulators. However, the White Paper also states that the Government will introduce central functions designed to support regulator coordination and monitor AI developments. Since the publication of the White Paper, little detail has been provided on what these central functions will look like in practice, making it difficult to assess the adequacy of the UK’s vertical approach. In theory, a vertical approach is preferable because LLM harms are myriad and the technologies are fast-evolving, making it difficult to develop a single robust and encompassing regulation. Nonetheless, there is little to indicate that the Government has seriously considered how to develop an effective vertical approach, nor where targeted regulation may be needed to empower regulators or address specific harms. Furthermore, the proliferation of “open source” models, including foundation models, introduces a host of benefits (including for research and innovation) and risks (including misuse and malicious use) which the current approach does not account for. The UK Government should consider proportionate interventions that ensure accountability in the development and diffusion of “open source” foundation models, such as requirements for model and dataset documentation.

 

        Regulator resource and expertise: A report from the Alan Turing Institute indicates that regulators are currently ill-equipped to manage the risks of AI and face coordination challenges in aligning responses or developing synergies. Little detail has been provided as to how the proposed central functions will address this, nor how they will work alongside existing initiatives, like the regulator-led Digital Regulation Cooperation Forum (DRCF). A key drawback of a vertical approach is the differing levels of regulator capacity and readiness, which may lead to disjointed protections across sectors and barriers to regulatory cooperation. Providing the DRCF with statutory powers to resolve disputes between regulators, and introducing an AI fee to fund regulator work related to AI, similar to the data protection fee used to partially fund the ICO, could help mitigate these issues.
 

        Potential regulatory and non-regulatory tools: Because LLMs create a variety of upstream and downstream harms, appropriate regulatory interventions will depend on the type of harm being discussed. A first priority is for UK regulators to clarify how LLMs are governed by existing regulation. On top of this, specific interventions could include watermarking, sandboxing, and third-party audit markets, including auditing for bias and robustness checks.

 

        International comparisons: Similar to the US, the UK’s approach entails a “vertical” component as it largely relies on existing sectoral regulation, and it takes a soft approach mostly relying on guidance and self-regulation rather than hard law. While China has introduced several new AI laws, these are secondary legislation based on earlier primary data protection laws. In this sense, China’s approach is similar to the UK’s as it is foregrounding flexible governance. While the UK has not taken a horizontal approach, its new central risk function to manage AI risks is designed to serve a horizontal purpose. In this international context, the UK may look at geopolitical pressure, data privacy and security, research and development, and international cooperation as areas where adopting a strategic outlook could help it stay at the forefront of LLM regulation. Finally, to avoid the dangers of regulatory divergence, the UK should foster international agreements and standards, while being aware of the potential impact of the extraterritorial influence of hard regulatory initiatives from single jurisdictions (e.g. potential Brussels effect of the EU AI Act) as well as “East vs West” tensions that, if not successfully addressed, may result in additional costs for businesses and consumers and ineffective regulation.

 

1.              How will large language models develop over the next three years?

 

1.1.              Definitions and Context: LLMs have reached unprecedented audiences and public visibility in the past year. In general terms, LLMs are a form of generative AI system that synthesises text by generating the most likely sequence of words; this core capability is surprisingly flexible across a wide array of language-based tasks, lending LLMs general purpose abilities. These capabilities are built on large pre-trained models trained over enormous corpora of internet-scraped text, which form the “foundation” for the more specific, tailored, and instruction-tuned LLMs that most citizens interact with directly via interfaces or applications. In technical terms, we consider LLMs to be any decoder-only, encoder-only, or encoder-decoder model pre-trained with self-supervised learning over large internet corpora (i.e., a statistical method of processing and generating text based on patterns imitated from internet text datasets). Under this definition, LLMs have been around since ~2018 with the release of BERT and ELMo. However, their capabilities, scale, steerability, and usability have increased dramatically since then. We distinguish LLMs from foundation models (a term popularised by Stanford researchers), which can refer to large pre-trained and general AI models in other modalities, such as vision or audio, and from generative AI, a general term describing any AI system that can produce new content, in contrast to AI systems that can only process, classify, or structure existing data. We consider some core features of modern LLMs as key in distinguishing them from previous task-specific language technologies or natural language processing (NLP) systems (a brief illustrative sketch of the core text-generation capability follows the list below):[1]

 

          Generality: LLMs are general purpose models that can be applied to a wide range of NLP tasks such as conversation, summarisation, or question-answering.

 

          Unpredictability: LLMs display unpredictable behaviours when the number of parameters and the amount of training data are scaled up, which may include emergent abilities and/or novel risks.

 

          Steerability: Base LLMs or foundation pre-trained language models are brittle to prompt format, designed only for the task of text sequence continuation. In contrast, modern LLMs, tuned on instruction data and human feedback, can much more accurately predict user intent and are more robust to prompt variations.

 

          Access & Diffusion: The proliferation of LLMs with user-friendly chat interfaces or APIs has made many powerful LLMs available for public use, bringing a range of benefits (e.g., for research and innovation) and potential risks (e.g., harms and malicious use). The additional shift towards “open source AI”, especially the popularity of models on public repositories such as Hugging Face, departs from the previous paradigm of proprietary development by a few companies in the name of responsible development.
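As referenced above, the following is a minimal, purely illustrative sketch of the core text-generation capability, using the openly available Hugging Face transformers library and the small GPT-2 model as a stand-in for far larger modern LLMs; it is not a depiction of how any particular proprietary system is built.

```python
# Minimal illustration of the core capability described in 1.1: given a prompt,
# a language model proposes a continuation by repeatedly predicting likely next
# tokens. GPT-2 is used here purely as a small, publicly available stand-in.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

prompt = "Large language models are"
outputs = generator(prompt, max_new_tokens=30, num_return_sequences=1)
print(outputs[0]["generated_text"])
```

Modern instruction-tuned LLMs build additional fine-tuning and human feedback on top of this same next-token prediction machinery.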

 

1.2.              Research Areas: How LLMs develop depends on where the development communities (academic, commercial, and open source) are targeting their efforts. We identify the following areas of ongoing research and development, alongside the outstanding issues most likely to be advanced in the short term and key papers advancing these goals.

 

 

| Research Area | Core Question | Ongoing Developments | Examples |
| --- | --- | --- | --- |
| Alignment and Safety | How can we ensure that LLMs do what we want them to do? | Improved clarity of goals; methods for selecting preferred behaviours through human feedback | [2204.05862] Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback (arxiv.org); [2203.02155] Training language models to follow instructions with human feedback (arxiv.org) |
| Grounding | How can we connect generated language to real-world concepts? | Integration of reference sources into language generation | [2112.09332] WebGPT: Browser-assisted question-answering with human feedback (arxiv.org); [2002.08909] REALM: Retrieval-Augmented Language Model Pre-Training (arxiv.org); [2110.06674] Truthful AI: Developing and governing AI that does not lie (arxiv.org) |
| Cost Reduction | How can the training and ongoing use of LLMs be made cheaper? | Reduced reliance on sheer size; optimisation for specific use cases | [2304.15004] Are Emergent Abilities of Large Language Models a Mirage? (arxiv.org); [2305.05176] FrugalGPT: How to Use Large Language Models While Reducing Cost and Improving Performance (arxiv.org) |
| Autonomous Action Taking | Can LLMs be used as independent actors in the world? | Integration of external tools and sources | [2308.08155] AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation Framework (arxiv.org) |
| Calibration | How can we predict the reliability of LLMs in advance? | Measurement and assessment of existing confidence metrics | [2207.05221] Language Models (Mostly) Know What They Know (arxiv.org); [2210.04714] Uncertainty Quantification with Pre-trained Language Models: A Large-Scale Empirical Analysis (arxiv.org) |
| Personalisation | How can LLMs be tailored to specific users? | Learning from individual preferences | [2303.05453] Personalisation within bounds: A risk taxonomy and policy framework for the alignment of large language models with personalised feedback (arxiv.org); [2005.02431v2] Automated Personalized Feedback Improves Learning Gains in an Intelligent Tutoring System (arxiv.org) |
| Multi-modality | How can LLMs be connected to data in forms other than text? | Integration of vision, language, and audio into existing training | [2303.03378] PaLM-E: An Embodied Multimodal Language Model (arxiv.org); [2102.12092] Zero-Shot Text-to-Image Generation (arxiv.org) |
| Exploitability | How can we ensure LLMs aren’t used maliciously? | Adversarial stress testing | [2306.17194] On the Exploitability of Instruction Tuning (arxiv.org); Cybercrime and Privacy Threats of Large Language Models (IEEE Xplore) |
| Human-AI System Integration | How can we design LLMs for human interaction? | Understanding of human workflows and behaviours | [2211.03622] Do Users Write More Insecure Code with AI Assistants? (arxiv.org); Evaluating large language models on medical evidence summarization (npj Digital Medicine, nature.com) |
| Downstream Deployment | What are the best use cases for LLMs? | Contextual understanding and data infrastructure development | Assessing the usefulness of a large language model to query and summarize unstructured medical notes in intensive care (SpringerLink); 14 Examples of How LLMs Can Transform Materials Science and Chemistry: A Reflection on a Large Language Model Hackathon (Digital Discovery, RSC Publishing, DOI: 10.1039/D3DD00113J) |

 

1.3.              Analysis: The research areas and ongoing developments in this table reflect a relatively steady pace of progress as known problems are gradually solved. The majority of these research questions relate to issues which have emerged through the testing and use of language models. In narrowly-tailored applications, solutions are likely to exist for many or all of these issues, but the general-purpose approach currently taken in building LLMs requires similarly general solutions, which may be more difficult to achieve. Many of the more difficult issues pertain to the normative direction of research development, where it is particularly important for the UK Government to represent the national interest, as described below. That said, three years is a relatively long period of time considering that the paper describing the underlying technology was first published only six years ago, in 2017,[2] and similarly dramatic breakthroughs cannot be categorically ruled out.

 

1.4.              Dependence on Normative Value Priorities: Ultimately, how LLMs develop and what risks or benefits materialise depends on normative decisions made during their development. The key question thus becomes “who decides how LLMs behave”. We see this as one of the most significant decision points: how to embed a whole spectrum of irreconcilable human perspectives into general purpose and general use technologies. Legal systems have long had to reconcile such differences; however, they make use of deep-rooted institutional infrastructure, and the representatives who pass laws are democratically elected. At present, private companies dominate LLM development, acting in effect as quasi-governments with their own incentives.[3] We see three options, all of which are likely necessary in part:

 

1.4.1.              Top-Down Regulation: The UK government could decide top-down what values it wants LLMs to develop with, advocating for those values which are believed to be best rather than merely most popular. Such an approach may be more necessary when protecting the interests of minorities.

 

1.4.2.              Popular Representation: Values could be sourced through broad democratic input from LLM users and affected citizens. However, handling democratic input from millions of global citizens is a serious challenge, particularly when only a small subset will have a technical understanding of the risks and opportunities.

 

1.4.3.              Values-based Personalisation: One path forward may be to give back more control to individuals via greater personalisation. However, the bounds of appropriate levels of personalisation need to be carefully navigated.[4] While it may be acceptable that a user wishes to interact with a rude or sarcastic personalised LLM, permitting users to create racist or extremist models risks significant interpersonal and societal harms.

 

1a)              Given the inherent uncertainty of forecasts in this area, what can be done to improve understanding of and confidence in future trajectories?

 

1a.1.              Directing Research: At its core, research is speculative, and efforts to forecast the pace of progress are likely to fail. However, over shorter time horizons, prioritisation can be a deciding factor in which technologies are developed first. The Government has an opportunity to direct those priorities, thereby influencing which trajectories are most likely to transpire.

 

1a.2.              Funding Guidelines: Especially in the case of academia, directed funding can be highly effective in creating and cultivating desirable research areas. The DARPA XAI (eXplainable AI) program was a highly successful demonstration of this effect, where a subfield of AI research of key importance to defence applications was rapidly accelerated by the US Department of Defense.[5]

 

1a.3.              Benchmarking: To keep track of progress, benchmarks should be continually updated and created to cover new capabilities, use cases or identified harms. Rigorous and robust evaluation should be incentivised as a priority on par with training new powerful models. However, benchmarking efforts need to be carefully tailored to avoid testing for narrow capabilities that do not generalise to challenging real-world examples. Many of these narrow benchmarks saturate quickly and give a static and over-confident estimate of LLM capabilities. This is particularly true when there is a risk of contamination, where larger LLMs’ training data subsumes existing benchmarks. In these cases, when the LLMs are tested for specific capabilities or risks, they are simply regurgitating learnt associations or examples from their training data, rather than making true inferences on unseen test cases (a simple illustrative contamination check is sketched after the list below). We recommend focusing on:

 

        Holistic Benchmarks: Developing and applying wide suites of benchmarks that cover many holistic capabilities at once, such as the HELM benchmark[6] or BIG Bench.[7]

 

        Dynamic and Adversarial Benchmarks: Ensuring benchmarks and evaluation are dynamic (updated frequently with new data) and contain a balance of in-domain realistic examples for how the model will be used with some adversarial and challenging out-of-domain examples for anticipating how the model may be stress tested or used in edge cases. Dynamic and adversarial benchmarking is an active research field.[8]

        Bidirectional Benchmarking: Identifying both false positives and false negatives that arise from LLMs’ safety adaptations. By false negatives, we mean harmful content that is incorrectly judged safe and therefore reproduced to users (i.e., missed by safety filters or features). By false positives, we mean genuinely safe requests made by users that are erroneously blocked or censored.[9] Overbearing safety restrictions for LLMs may pose a threat to the free speech of citizens.

 

        Scalable Oversight: Benchmarking scalable oversight in tasks that are currently too complex for humans to fully understand, and where human-AI collaboration may be beneficial or suffer from dangerous information and skill asymmetries.[10]

 

        Synthetic Benchmarks: Leveraging synthetic data to benchmark many model behaviours at once or to rapidly augment the scale of benchmarks using test cases generated by LLMs themselves. Using models to evaluate model behaviours may increase benchmarking capacity.[11]
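To make the contamination concern in 1a.3 concrete, the following is a deliberately crude, illustrative sketch of an n-gram overlap check between benchmark items and a pre-training corpus; the corpus and benchmark strings are placeholders, and real contamination audits use far larger corpora and more sophisticated matching.

```python
# Illustrative contamination check: flag benchmark items whose long n-grams
# also appear in the pre-training corpus, since a model may then be
# regurgitating memorised text rather than demonstrating a capability.
def ngrams(text: str, n: int = 13) -> set:
    tokens = text.lower().split()
    return {" ".join(tokens[i:i + n]) for i in range(max(0, len(tokens) - n + 1))}

def is_contaminated(test_item: str, corpus_ngrams: set, n: int = 13) -> bool:
    return bool(ngrams(test_item, n) & corpus_ngrams)

# Placeholder data standing in for a scraped corpus and a benchmark test set.
corpus_ngrams = ngrams("an example pre-training corpus would be indexed here ...")
benchmark_items = ["an example benchmark question goes here", "another test item"]

flagged = [item for item in benchmark_items if is_contaminated(item, corpus_ngrams)]
print(f"{len(flagged)} of {len(benchmark_items)} items flagged as possibly contaminated")
```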

 

1a.4. Unintended consequences of safety interventions: Assess the effects and unintended consequences of safety interventions. For example, reinforcement learning from human feedback (RLHF) has been shown to steer models towards greater power-seeking behaviours, introduce religious and political bias, and entrench sycophancy, where models parrot back the worldviews of their users.[12] In the majority of papers we have read applying human feedback learning to LLMs, the workforce is overwhelmingly made up of masters-educated, US-based Mechanical Turk workers.[13] For example, in a recent Anthropic paper, 20 humans contributed 80% of the feedback data.[14] In a recent OpenAI paper, the top 5 humans contributed 50% of the feedback data.[15] How these interventions are designed and carried out seriously affects their impact on model behaviours, and more care should be taken to mitigate the new biases they introduce.
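As an illustration of the concentration concern above, the following short sketch computes the cumulative share of feedback contributed by the most prolific annotators in a feedback dataset; the annotator IDs are hypothetical placeholders rather than data from the papers cited.

```python
# Measure how concentrated a human feedback dataset is among a few annotators.
from collections import Counter

# Hypothetical annotator IDs, one entry per feedback comparison collected.
feedback_annotators = ["a1", "a1", "a2", "a1", "a3", "a2", "a1", "a4", "a1", "a2"]

counts = Counter(feedback_annotators)
total = sum(counts.values())

cumulative = 0.0
for k, (annotator, n) in enumerate(counts.most_common(), start=1):
    cumulative += n / total
    print(f"top {k} annotator(s) contribute {cumulative:.0%} of the feedback")
```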


2.              What are the greatest opportunities and risks over the next three years?

 

2.1.              Opportunities: We discuss three high-level and general opportunities of LLMs.

 

2.1.1. Knowledge Assimilation: LLMs can assist in more inclusive knowledge synthesis from enormous corpora, and present it in easy-to-understand, naturalistic dialogue. LLMs have great potential for lowering the barriers to entry for learning: for example, via language translation or by explaining academic papers in layman’s terms. We believe that greater access to knowledge is especially powerful when combined with elements of personalisation via selective explanations or tailored student learning.

 

2.1.2. Enhanced Decision-Making: LLMs have potential for partnering with, rather than replacing, humans. A collaborative human-AI decision system can be stronger than its individual parts, particularly for complex tasks that require data assimilation or brute-force search as well as higher-level insights or ethical deliberation. Take the example of content moderation, where repeatedly viewing harmful content significantly damages moderators’ psychological well-being.[16] In such cases, extreme and clear-cut violations can be delegated to an LLM, with human input only needed for new, ambiguous or contested decisions. Note that for safe and effective collaboration on increasingly complex tasks, processes of scalable oversight are key.[17]
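A minimal sketch of the human-AI moderation workflow described above is given below; the classifier, labels, and confidence threshold are hypothetical placeholders, intended only to show how clear-cut cases could be automated while ambiguous cases are escalated to human moderators.

```python
# Route clear-cut moderation calls to automation; escalate ambiguous ones.
from dataclasses import dataclass

@dataclass
class ModerationResult:
    label: str          # e.g. "violation" or "acceptable"
    confidence: float   # model confidence in the label, between 0 and 1

def route(result: ModerationResult, auto_threshold: float = 0.95) -> str:
    if result.confidence >= auto_threshold:
        return f"auto-{result.label}"          # extreme, clear-cut case
    return "escalate-to-human-reviewer"        # new, ambiguous or contested case

print(route(ModerationResult("violation", 0.99)))  # handled automatically
print(route(ModerationResult("violation", 0.62)))  # sent to a human moderator
```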

 

2.1.3. Productivity Gains: LLMs can provide productivity gains by accelerating task completion in production processes: for example, a person may have compelling ideas for a creative work or digital app but lack the writing or coding skills to produce it. Medical practice can also benefit from applying LLMs to tasks such as writing reports or summarising different sources of information, freeing up time for human clinicians to review decisions, carefully design treatment plans, and communicate with patients.[18]

 

2.2.              Risks: We discuss four risks to widespread use of LLM technology.

 

2.2.1. Erosion of Trust: As LLMs become more capable, they increasingly create outputs that are indistinguishable from human-created content. This erodes trust, as we struggle to distinguish true from false and authentic from inauthentic. Ensuring “the right to a human decision” or maintaining a system of watermarking AI outputs may be vital to retaining a degree of attribution and responsibility for where information comes from and how trustworthy it is.
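As an illustration of how watermarking AI outputs can support attribution, the sketch below follows the published “green list” detection idea of Kirchenbauer et al. (2023) in simplified form; the word-level tokenisation and hash-based green list are simplifications for illustration, not the scheme used by any deployed system.

```python
# Simplified statistical watermark detection: if text was generated with a
# bias towards pseudo-randomly chosen "green" tokens, the observed fraction of
# green tokens will be anomalously high, yielding a large z-score.
import hashlib
import math

GAMMA = 0.5  # fraction of the vocabulary placed on the green list at each step

def is_green(prev_token: str, token: str) -> bool:
    digest = hashlib.sha256(f"{prev_token}|{token}".encode()).digest()
    return digest[0] / 256 < GAMMA

def watermark_z_score(text: str) -> float:
    tokens = text.split()
    n = len(tokens) - 1
    greens = sum(is_green(p, t) for p, t in zip(tokens, tokens[1:]))
    return (greens - GAMMA * n) / math.sqrt(n * GAMMA * (1 - GAMMA))

# A large positive z-score suggests the text carries the watermark.
print(watermark_z_score("a candidate passage of text to be tested for the watermark"))
```

Detection of this kind only works if generators embed the watermark in the first place, which is one reason the policy questions around mandating or standardising watermarks matter.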

 

2.2.2.              Erosion of Understanding: As LLMs become more adaptable, fragmented or personalised, we worry about an erosion of common understanding and shared discourse. Just as highly-personalised social media platforms have been blamed for deepening divisions and polarising society, we envisage that personal or community-specific LLMs could introduce content filter bubbles.[19]

 

2.2.3.              Cultural Hegemony: In current practices of LLM development and deployment, technologies used by millions of people are guided or steered by the feedback of a few unrepresentative crowdworkers, or embedded with the values of a few technology providers. This could be massively disenfranchising to people who feel that LLMs do not represent their values or cultures.

 

2.2.4.              Dual-Use Potential: Like any technology, LLMs are dual-use. We believe the deliberate misuse of LLM technologies by malicious or nefarious actors is a particularly concerning risk: a model does not have to be superintelligent or existentially threaten human life to cause significant and widespread harm. Examples include misguided advice for suicidal or mentally-vulnerable individuals;[20] increasing the ease with which fraudulent emails can be sent and optimised for clickability; or producing reams of toxic and abusive content for harassment.[21] Recent jailbreaks of industry LLMs show that even red-teamed and RLHF-tuned models are vulnerable to attacks.[22]

 

2a)              How should we think about risk in this context?

 

2a.1.              Existing Taxonomies: There are existing taxonomies for categorising risks from LLMs,[23] as well as more general risk taxonomies for AI systems[24] and narrower taxonomies for personalised LLMs.[25]

 

2a.2.              Defining risk in context: We follow existing academic research with carefully defined terminology.[26] Hazards describe a potential source of an adverse outcome. Harms describe the adverse outcome materialised from a hazard. Finally, risks describe the likelihood of a hazard causing harm, combined with the impact of that harm. Transferring this terminology to the LLM setting, we see the LLM itself as a hazard. There is a wide literature documenting many adverse outcomes from LLMs, such as informational harms (e.g. misinformation, hallucination, and untruthfulness) and representational harms (e.g. bias, toxicity, and stereotypical associations). Crucially, the risk of harm depends on the context, application or downstream use case for the LLM. Informational harms from hallucinating false or misleading information may pose a high risk when an LLM is used to answer political questions or to write newspaper headlines, but pose a very low risk, or may even be desirable behaviour, in creative writing applications. Similarly, the instantiation of risk depends heavily on who is using the LLM, and this interacts with what they are using the LLM for.
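The terminology above can be summarised schematically as follows (a simplified illustration, not a formal model taken from the cited literature):

```latex
\mathrm{Risk}(\text{hazard},\ \text{context}) \;=\;
  \underbrace{P\big(\text{harm} \mid \text{hazard},\ \text{context}\big)}_{\text{likelihood}}
  \;\times\;
  \underbrace{\mathrm{Impact}\big(\text{harm},\ \text{context}\big)}_{\text{severity}}
```

The same hazard (for example, an LLM prone to hallucination) can therefore carry a high risk in one context, such as answering political questions, and a negligible risk in another, such as creative writing.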

 

2a.3.              Components of LLM Risk: There are several components to thinking about an LLM risk:[27]

 

2a.3.1.              Actor who is at risk: We encourage thinking about both individual risks and societal-level risks (which can accumulate from individual-level risks).[28] In addition, we suggest that a number of groups can be at risk from LLMs:[29]

 

          Model providers bear responsibility for the models they provide access to, where a model’s capabilities may bring reputational risks.

 

          Developers can be at risk in some situations when interacting with model outputs or checkpoints.

 

          Users of the LLM who read the output text, either directly from the model as it is output, or indirectly, for example via a screenshot of a social media post.

 

          Data subjects, or people represented in generated text can be harmed by the text, for example when text contains false information or propagates stereotypes.

 

2a.3.2.              Actor who causes the risk: Focusing on the actor or mechanism causing harm may be most appropriate for regulatory purposes.

 

          Creating LLMs is harmful (creators): Creators of LLMs can be responsible for risks from environmental harms (energy, mining, water), workplace harms (exploitation of low-paid workers in repetitive data-labelling or even content moderation jobs) and informational harms (exploitation of intellectual property via data scraping for pre-training data).

 

          LLMs create unintentionally harmful content (creators, but unintended): Creators of LLMs can be responsible for, but unaware of, unintended consequences of their models, for example the reproduction of hateful or biased content, or of misleading and false information.

 

          LLMs are used for harm (users): Users of LLMs can purposefully apply an LLM to a nefarious or malicious use case, for example to generate a greater quantity of more persuasive and personalised mis/disinformation or targeted scams; to facilitate access to harmful information; or to produce an abusive bot that acts on behalf of the user to scale the production of hateful content.

 

          LLMs lead to existential-risk (LLM itself): A controversial public letter was published in March 2023 calling for a pause to AI experiments due to the risks, most notably existential harm to humanity.[30] We believe that the current state of LLMs and AI generally is meaningfully different from what would be necessary to pose a threat to humanity. Unless further technological advancements are seen which make such risks more plausible, we believe it is more important to focus on the concrete risks from LLMs that exist even at the current capability level.

 

2a.3.3.              Type of harms arising from risks: We assimilate types of harms from existing taxonomies and academic research:[31]

 

          Representational harms arise through (mis)representations of a group, such as over-generalised stereotypes or erasure of lived experience.

 

          Allocative harms arise when resources are allocated differently, or re-allocated, in an unjust manner due to a model output. This can include lost opportunities or discrimination. For example, an LLM used to filter CVs for software engineers may prefer applicants with stereotypically male names because such applicants have historically been hired more frequently.

 

          Quality-of-service harms arise when LLMs have asymmetric capabilities for different groups or perform more or less well on certain groups’ data. It includes impacts such as alienation, increased labour to make a technology work as intended, or service/benefit loss. This has been seen with vision models which are less able to identify non-white faces.

 

          Inter & intra-personal harms occur when the relationship between people or communities is mediated or affected negatively due to technology. This could cover privacy violations or using generated language to abuse, harass or intimidate.

          Social & societal harms describe societal-level effects that result from repeated interaction with LLM output; for example, misinformation, electoral manipulation, and automated harassment.

 

          Legal harms describe outputs which are illegal to generate or own in some jurisdictions. For example, written CSAM is illegal to create or own in many jurisdictions. Copyrighted material presents another kind of legal risk.

 

3.              How adequately does the AI White Paper (alongside other Government policy) deal with large language models? Is a tailored regulatory approach needed?

 

3.1.              The UK is taking a “vertical” approach to AI governance, meaning it relies on existing regulators addressing risks as they relate to their jurisdiction. This approach has explicitly been taken since 2018 and was confirmed most recently in the 2023 AI White Paper.[32] The rationale behind the UK’s vertical approach is that it limits new regulatory burdens which may hinder innovation, while also providing sufficient flexibility to deal with new technological advances.

 

3.2.              After receiving feedback from industry emphasising that this approach risks inconsistencies, overlaps, and gaps, the Government proposed a set of central functions to support regulatory coordination and monitor new, potentially cross-cutting risks. Alongside this, the Government established the Foundation Model Taskforce (FMT). This body has been tasked with enhancing the UK’s capabilities to develop and deploy foundation models, like LLMs.

 

3.3.              There is currently considerable uncertainty surrounding the details of the UK’s proposed approach, making it difficult to assess how effective it will prove in practice. Since the publication of the AI White Paper, no further details have been provided as to what the central functions will look like in practice, nor where they will sit in Government. Likewise, the exact mandate of the FMT, and how it will relate to the central functions and to sector-specific regulators’ work, remains unclear.

 

3.4.              It may be difficult to develop a tailored law which is both encompassing of the range of harms from LLMs and robust to new developments. LLMs can cause a range of upstream harms, like copyright issues from data extraction, and downstream harms, like disinformation from AI-generated content (see section 2a).[33] The pace at which these technologies are developing also makes predicting potential new harms challenging (see section 1).

 

3.5.              Continuing a flexible “vertical” governance approach is therefore intuitive, at least for now; however, significant work will need to be undertaken by the Government to ensure that the functions promised are adequately resourced, empowered, and joined up. On top of this, it would be beneficial for the Digital Regulation Cooperation Forum (DRCF) to lead a review into regulator capacities, deficiencies, and remedies for addressing LLMs. This regulator-led review could inform future primary legislation that would support an effective response to specific aspects of these technologies. For instance, specific measures may be required to address the potential risks posed by the proliferation of open source foundation models (see section 3a).

 

3a)              What are the implications of open-source models proliferating?

 

3a.1.              This section provides clarifying information about the definition of “open source models” and their proliferation, before discussing their implications for the UK’s approach to AI regulation.

 

3a.2.              Definition: “Open source” in “open source models” is widely misused.[34] Open source software (OSS) is software that anyone can inspect, use, or modify.[35] The release of a model and its weights does not qualify as open source, because AI systems encompass many other components, such as the training data and source code used for model development, which are not released.[36] With a few exceptions that have embraced the open collaboration model characteristic of OSS development, such as EleutherAI[37] and BigScience,[38] “open source” AI development is chiefly characterised by the sharing of privately pre-trained models. Several commentators have described the misuse of “open source” as an act of “open washing.”[39] Until a definition is settled on, “open access” may be a more suitable term.

 

3a.3.              Proliferation: Most “open source” AI development is characterised by researchers and developers fine-tuning pre-trained models that were released by a handful of players, including non-profit research initiatives (e.g. EleutherAI’s GPT-Neo[40] and BigScience’s BLOOM[41]), startups (e.g. Stability AI’s Stable Diffusion[42]), and Big Tech (e.g. Meta’s LLaMa 2[43]). Hugging Face’s Hub has emerged as the principal platform used by individuals and organisations to share, download, and collaborate on models, datasets, and demos (‘spaces’). The number of repositories on the Hub is increasing rapidly. As of 1st September 2023, the Hub hosts 313,676 models, 59,718 datasets, and 142,472 spaces. While 94 models have over 1 million downloads, most models (230,090, or 73%) have zero downloads. Only a minority of models (34%) are released with licences, which specify how they may be used. The most popular licences are Apache 2.0 (41,796), MIT (18,619), and OpenRAIL (14,460).[44] Apache 2.0 and MIT are common “permissive” licences, which allow commercial and non-commercial use of OSS. OpenRAIL has emerged as a new category of “Open & Responsible AI licences”, in response to the concern that OSS licences like Apache 2.0 and MIT do not take into account the technical nature and capabilities of AI models, which are artefacts different from source code, and are thus unfit for governing the responsible use of AI models.[45]
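By way of illustration, statistics like those quoted above can be gathered programmatically from the Hub using the public huggingface_hub Python library; the sketch below samples a limited number of model repositories and tallies their declared licence tags (the sample size and the tag-based licence field are assumptions made for illustration, and the snippet does not reproduce the exact figures cited).

```python
# Tally declared licences across a sample of Hugging Face Hub model repositories.
from collections import Counter
from huggingface_hub import HfApi

api = HfApi()
licence_counts = Counter()

for model in api.list_models(limit=1000):  # small sample, for illustration only
    licences = [t.split(":", 1)[1] for t in (model.tags or []) if t.startswith("license:")]
    licence_counts.update(licences or ["(no licence declared)"])

for licence, count in licence_counts.most_common(10):
    print(f"{licence}: {count}")
```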

 

3a.4.              Benefits and risks: The shift to “open source” AI has made a fast-growing number of models, including LLMs, available for public use, posing a range of benefits for research, innovation, and competition, as well as potential risks of upstream and downstream harm. Since AI models can be expensive to train, “open source” models drastically lower entry barriers and widen access to state-of-the-art AI models, which in turn benefits diverse stakeholders, from researchers to startups. For example, with access to Meta’s LLaMa model, Stanford researchers were able to develop Alpaca 7B for less than $600, and it performed qualitatively similarly to OpenAI’s GPT-3.5.[46] On the other hand, experts worry that “open source” models, especially ones with ‘dual use’ capabilities, increase the potential for harmful use by both well-intentioned and malicious actors. For example, the Alpaca model demo had to be removed after only one week because of model hallucinations and hosting costs. The researchers attributed the hallucinations to “limitations associated with both the underlying language model and the instruction tuning data.”[47] Others worry that access to state-of-the-art models will make it easier for malicious actors to cause harm, including the creation of deepfakes,[48] disinformation,[49] and malware.[50]

 

3a.5.              Safety of “open source” models: To counter concerns about risks, proponents borrow the argument from the OSS world that the open source development model tends to produce high-quality software and offers potential security advantages over proprietary development.[51] Others cite the open science principles of sound research, reproducibility, and transparency as “instrumental to the development of safe and accountable AI systems.”[52] For example, when Meta released LLaMa 2, Nick Clegg argued that model transparency enables crowdsourced improvements from the “wisdom of the crowds”, which will improve safety and trust.[53] However, this argument is constrained by the limited transparency of most “open source” models, compared to the full transparency of source code in OSS. As mentioned, only a few LLMs have been developed using the open collaboration model, such as EleutherAI’s models and BigScience’s BLOOM. The reality is that “open source” foundation models are currently being trained and released in a variety of ways, with unstandardised information about the models. For example, when Meta released LLaMa 2, it provided only an opaque description of the training data.[54] This should be viewed as an irresponsible practice because it limits the extent to which developers can be aware of, inspect, scrutinise, or address flaws in the training dataset. In turn, this lack of transparency undermines the argument that the “open source” approach improves the safety of models. The security benefits of OSS stem precisely from the fact that anyone can inspect, scrutinise, or modify the source code. This problem underscores the critical importance of documentation (e.g. model cards, data sheets) for responsible development and accountability.[55]

 

3a.6.              Implications for UK approach: The UK AI White Paper acknowledges that open source foundation models pose challenges to lifecycle accountability (paragraph 85), but it does not propose any concrete mechanisms for governing these challenges. It is currently unclear what the best approaches are to regulating this trend in “open source” AI development. A number of experts have called for the EU’s AI Act to take a proportionate approach to the regulation of “open source” foundation models, by setting only baseline requirements “that ensure meaningful transparency, data governance, technical documentation, and risk assessment” for foundation models that are made available for use, whilst setting more demanding requirements for foundation models that are “made available on the market,” i.e. used commercially.[56] In the interest of proportionality, experts have suggested that requirements could be based on capability thresholds, parameter size, or training time, whilst noting that “How exactly this threshold can be precisely defined remains a contentious question.”[57] While these arguments are made with regard to the EU’s AI Act, they are certainly relevant to the UK. The creation of proportionate baseline requirements for “open source” foundation models, especially documentation about training data, model development, and model evaluations, would be a useful mechanism for promoting safety and accountability, whilst avoiding unnecessary burdens for open source development, which benefits research and innovation. However, it is uncertain whether regulators could introduce and enforce such provisions with their existing powers, hence this may be an area where primary legislation should be considered.

 

4.              Do the UK’s regulators have sufficient expertise and resources to respond to large language models? If not, what should be done to address this?

 

4.1.              The AI White Paper provides UK regulators with no new powers or funding to support them in addressing the harms from AI. A report from the Alan Turing Institute indicates that regulators are currently ill-equipped for managing the risks of AI and face coordination challenges in aligning response or developing synergies.[58]

 

4.2.              As mentioned in Section 3, it is currently unclear how effective new central functions will prove at supporting regulators, as little detail has been provided on what they will look like in practice. It is also unclear how the coordinating element of these central functions will relate to ongoing work by the Digital Regulation Cooperation Forum to promote cooperation between the CMA, FCA, ICO, and Ofcom.

 

4.3.              One central challenge a coordinating function faces is differences in readiness between regulators. While some regulators, like the ICO, have been working on AI policy for a number of years, others, like the EHRC, are only just beginning to develop their policy positions. Because of this, it may be difficult to align regulators’ priorities and policy positions. This is something that the DRCF has already experienced.

 

4.4.              A potential remedy would be to provide new central functions with formal powers to resolve disputes between regulators. However, this would risk encroaching upon the independence of regulators. A more viable solution would be to place the DRCF on statutory footing and to give it formal dispute resolution mechanisms. This suggestion has already been put forward by the House of Lords Communications and Digital Committee and should be explored further.[59]

 

4.5.              Regarding individual regulator capacity and resourcing, one potential avenue to consider is introducing something similar to data protection fees. Under the Data Protection Act (2018), organisations processing personal data must pay a data protection fee, which is the primary source of funding for the ICO.[60] A similar small fee could be levied on companies developing and/or using LLMs to provide funds that support regulators in addressing the risks of these technologies. Caution would need to be taken in developing such a proposal given the potential negative impact it could have on innovation.


5.              What are the non-regulatory and regulatory options to address risks and capitalise on opportunities?

 

5.1.              (Non-)regulatory options: Because LLMs create a variety of upstream and downstream harms, appropriate regulatory interventions will depend on the type of harm being discussed. Here we discuss the non-regulatory and regulatory options that the UK Government may consider.

 

5.1.1.              UK regulators should first clarify how LLMs are governed by existing regulation, as has been done by the MHRA[61] and ICO[62] (albeit in a preliminary manner). An internal, intra-governmental review could be carried out to identify the potential blind-spots and the need to update existing regulation.

 

5.1.2.              In light of such a review, UK regulators can develop more specific interventions according to the identified needs. These interventions could include:

 

          IP/data protection regulations to ensure data is scraped responsibly;

 

          Sandboxing pre-deployment to ensure systems are used in line with existing UK regulations;

 

          Standards/certification/auditing for bias and robustness checks;

 

          Watermarking for countering mis/disinformation.

 

5a)              How would such options work in practice and what are the barriers to implementing them?

 

5a.1.              The above instruments would require different types of regulatory or non-regulatory intervention. That said, most of what has been suggested above could in theory be delivered by regulators following the “vertical” approach being taken in the UK. The choice between an instrument requiring regulatory or non-regulatory interventions ought to be carefully weighed according to sector-specific factors and considerations. These include how critical or essential the sector of deployment is (e.g. critical sectors such as healthcare may benefit from regulatory interventions), the extent to which an AI system has been or is likely to be used in that sector (e.g. a sector where the use of AI is especially fast-paced may be better suited to non-regulatory measures), and how likely particular harms are to occur in that sector, among other considerations.

 

5a.2.              As mentioned above, the key difficulty will be ensuring that regulators possess the right level of readiness across the board. Namely, that they possess adequate powers, are sufficiently resourced, and are appropriately joined up to introduce and enforce effective regulatory and non-regulatory measures.

 

5a.3.              The slow pace of progress in establishing central functions and understanding how they will fit within the existing landscape is a key barrier to assessing the capacity of the UK’s proposed approach for introducing this type of regulatory and non-regulatory intervention.

 

5a.4.              In terms of specific barriers, each (non-)regulatory intervention faces different challenges. For instance, there are still significant challenges to building an effective ecosystem for AI watermarking. Additionally, notwithstanding the long history of auditing and certification in the financial sector, equivalent initiatives for AI systems are still nascent. It is important that the use of each (non-)regulatory instrument is informed by considerations such as the challenges it brings and its level of maturity.

 

5b)              At what stage of the AI life cycle will interventions be most effective?

 

5b.1.              Intervention Points: The table below describes the stages of the lifecycle of an LLM alongside interventions which may be applied at each point. The relative efficacy of each depends upon the goals of the intervention and the ability to bear the costs (an illustrative sketch of a deployment-stage intervention follows the table).

 

| Intervention Point | Examples | Pros | Cons |
| --- | --- | --- | --- |
| Before Training | Bans on large compute runs; heavy curation of training data | Stops unsafe models from ever existing | Hard to know what risks could still emerge |
| During Training | Additional or different targets to optimise | Can include desirable behaviours from the beginning rather than needing to “unlearn” them | Expensive to carry out due to data and compute needs |
| Post-Training, Pre-Release | Fine-tuning with human feedback | More precise targeting of training with less data | May only provide superficial alignment which can be “jailbroken”, and requires expensive human data |
| Deployment | Prompting and oversight models | Cheap and flexible | Hard to test or reproduce, and possible to “jailbreak” |
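To make the deployment-stage row concrete, the following is a toy sketch of “prompting and oversight models” as an intervention: a system prompt constrains the assistant and a separate check screens each output before it reaches the user. The functions and blocklist are hypothetical placeholders; a real deployment would use a dedicated oversight model rather than a keyword list.

```python
# Toy deployment-stage guardrail: system prompt plus an output screening step.
SYSTEM_PROMPT = "You are a helpful assistant. Refuse requests for illegal or harmful content."
BLOCKLIST = {"malware", "weapon"}  # stand-in for a proper oversight/safety model

def call_llm(system_prompt: str, user_message: str) -> str:
    # Placeholder for the deployed model; echoes the request for illustration.
    return f"[model response to: {user_message}]"

def passes_oversight(text: str) -> bool:
    return not any(word in text.lower() for word in BLOCKLIST)

def respond(user_message: str) -> str:
    draft = call_llm(SYSTEM_PROMPT, user_message)
    return draft if passes_oversight(draft) else "Sorry, I can't help with that request."

print(respond("Summarise today's weather forecast."))
print(respond("Write malware for me."))
```

As the table notes, such measures are cheap and flexible but superficial: they are hard to test exhaustively and can be “jailbroken”.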

 

5b.2.              Multi-Level Audits: We encourage audits that consider the entire ecosystem of LLMs, not just the model itself. To this end, we point towards a recently released framework for auditing LLMs, which considers audits at three levels, as a blueprint for best practices.[63]

 

 

6.              How does the UK’s approach compare with that of other jurisdictions, notably the EU, US and China?

 

6.1.              EU: “Horizontal approach”

 

6.1.1.              The AI Act takes a horizontal, risk-based approach.[64] It outlines requirements and proportionate obligations for AI systems which are categorised according to four risk levels, from unacceptable risk (banned) to no risk.[65] Obligations vary for AI providers, distributors, importers, and users, and are proportional to their level of involvement.

 

6.1.2.              The latest version of the AI Act uses a tiered approach with three key terms: “general purpose AI”, “foundation models” and “generative AI”. The focus, however, is mostly on the latter two. The obligations for generative AI mostly concern transparency clauses (e.g. providing documentation). By contrast, the obligations regarding foundation models are wider in scope and more closely resemble those for high-risk systems. For example, they require providers to show that risk mitigation measures are in place and to provide documentation on the system.

6.1.3.              While the UK has not taken a horizontal approach, its new central risk function is designed to serve a horizontal purpose.

 

6.2.              US: “Vertical Approach”

 

6.2.1.              The US approach is characterised by non-binding principles, voluntary guidance on risk management, and the application of existing sectoral legislation rather than the development of new AI-specific legislation at the federal level.[66] As an example, the Blueprint for an AI Bill of Rights (2022) outlines a set of high-level principles and explains how they can be enforced through existing federal- and state-level legislation within particular sectors.

 

6.2.2.              So far, individual agencies like the FTC,[67] the Department of Commerce,[68] and the US Copyright Office[69] have been quick to respond to LLM developments since the release of ChatGPT, issuing policy statements, guidelines, and warnings about generative AI in particular.[70] However, no more substantial or coordinated effort appears to be on the horizon for the US at the moment.

 

6.2.3.              Similar to the US, the UK’s approach entails a “vertical” component as it largely relies on existing sectoral regulation, and it takes a soft approach mostly relying on guidance and self-regulation rather than hard law.[71]

 

6.3.              China: “Hybrid approach”

 

6.3.1.              China takes a hybrid approach, where soft law has been applied to more generic contexts (e.g. science and technology research) and hard law is targeted to the regulation of specific algorithms. More specifically, the Ministry of Science & Technology introduced voluntary principles and guidance on integrating ethics into the whole AI lifecycle, while the Cyberspace Administration of China has targeted specific types of AI, such as recommender systems (2021), “deep synthesis” technologies (2022), and generative algorithms (2023), with hard law regulations.

 

6.3.2.              In response to the release of LLMs like ChatGPT, China published its “Measures for the Management of Generative Artificial Intelligence Services”,[72] drafted in April 2023. These measures introduce new restrictions on companies providing such services to consumers, regarding both the training data used and the outputs produced. There are also early discussions taking place in China about developing an AI-specific law.[73]

 

6.3.3. While China has introduced several new AI laws, these are secondary legislation based on earlier primary data protection laws. In this sense, China’s approach is similar to the UK’s as it is foregrounding flexible governance.

 

6a)              To what extent does wider strategic international competition affect the way large language models should be regulated?

 

6a.1.              Competitive dimensions: We believe that there are several dimensions along which international competition can affect the way in which LLMs should be regulated in the UK:

 

          Geopolitical Pressure: As the UK strives to maintain its leadership in AI technology, it may be hesitant to enforce strict regulations on LLMs while other countries, particularly major competitors like the United States or China, adopt a more lenient approach. However, forgoing robust regulation may increase the likelihood of the EU AI Act’s anticipated “Brussels effect”[74] setting the global standard for AI regulation. In response, the UK might adopt an overall soft approach to AI regulation to encourage innovation, while gradually updating existing sector-specific regulations to align with the EU AI Act in order to minimise compliance costs should a “Brussels effect” materialise.

 

          Data Privacy and Security: In the context of international competition, the UK may be concerned about safeguarding its data privacy and security interests. Stricter regulations may be viewed as a means to protect sensitive data and intellectual property from foreign entities. Consequently, it is crucial for the UK to reduce reliance on third parties and promote in-house AI development.

 

          Research and Development: Excessive regulatory initiatives at home, combined with attractive research incentives from other countries, could result in a talent drain in the AI field, potentially undermining the UK’s ability to innovate. Nevertheless, the importance of regulating AI development in critical areas such as healthcare or the environment should not be underestimated. The UK could consider an approach similar to the AI Act’s regulatory exceptions for research purposes. It may also encourage international research collaborations to facilitate knowledge exchange and attract global talent.

 

          International Cooperation: The UK should explore opportunities for international cooperation and participation in standard-setting efforts aimed at establishing common rules and norms for the use of large language models. Bodies like the ISO and IEEE offer platforms where the UK's increased involvement can extend its influence beyond national boundaries. This collaborative approach can help create a level playing field and ensure the responsible development and use of AI technologies worldwide.

 

6b)              What is the likelihood of regulatory divergence? What would be its consequences?

 

6b.1.              As AI regulation matures, there will likely be a degree of regulatory divergence between jurisdictions on account of different requirements and restrictions being introduced. However, there are mitigating factors:

 

          International agreements: high-level agreements already exist within international bodies such as the OECD and UNESCO. A forthcoming Convention from the Council of Europe will introduce binding regulatory measures that will promote alignment among states. The EU-US Trade and Technology Council has promoted alignment between the governance approaches of these two jurisdictions; however, the UK has been explicitly excluded.

 

          International standards: regulation outlines requirements at a high level, yet these requirements often have to be explicated through technical standards. Accordingly, there is scope for technical standards developed by bodies like the ISO to become de jure binding through treaties or, more likely, de facto binding by becoming industry best practice.

 

          Unilateral influence: regulations in one jurisdiction can exert extraterritorial influence. Most notably, the EU’s AI Act could have a “Brussels Effect”, leading other jurisdictions to follow aspects of the regulation. However, this may not apply to all aspects of the EU’s AI regulation, particularly where technologies can be effectively localised.[75]

 

6b.2.              Despite these promising avenues for cooperation, it is worth noting that a divergence appears to be emerging between “the West” and China. For instance, China is not participating in many of the international forums where “the West” is discussing governance mechanisms for AI, such as the G7’s Hiroshima Process and the Global Partnership on AI. There is also a risk of some countries “forum shopping” – choosing which international bodies’ guidance or standards to follow – given that multiple international bodies are currently undertaking overlapping work.

 

6b.3.              The impacts of regulatory divergence would depend on the degree to which divergence takes place and the areas of misalignment. However, some potential consequences could include:

 

        Ethics dumping: companies undertaking unethical practices in jurisdictions with weaker regulatory protections. For instance, companies developing LLMs could scrape data from jurisdictions where data protection and copyright protections are weaker.

 

        Ineffective regulation: because of the cross-cutting impact of LLMs across jurisdictions, weak regulation in one jurisdiction could allow harms to materialise in others. Take the example of using generative AI to develop chemical weapons. If agreement on best practice to prevent this excludes China, and the country does not act unilaterally, then people could simply use Chinese AI systems for this type of malicious purpose.[76]

 

        Business/consumer costs: having to follow multiple different regulations would likely raise costs for businesses that choose to continue offering products in countries with divergent regulatory regimes. Alternatively, if companies stop offering their products in specific jurisdictions because of differing requirements, this will come at the cost of consumer choice.

 

 

September 2023

 



[1]              Auditing large language models: a three-layered approach | AI and Ethics

[2]              [1706.03762] Attention Is All You Need

[3]              Cloud Empires (mit.edu)

[4]              [2303.05453] Personalisation within bounds: A risk taxonomy and policy framework for the alignment of large language models with personalised feedback

[5]              DARPA's explainable AI (XAI) program: A retrospective - Gunning - 2021 - Applied AI Letters - Wiley Online Library

[6]              [2211.09110] Holistic Evaluation of Language Models

[7]              [2206.04615] Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

[8]              [2104.14337] Dynabench: Rethinking Benchmarking in NLP and DADC 2022 The First Workshop on Dynamic Adversarial Data Collection (DADC) Proceedings of the Workshop July 14, 2022

[9]              [2308.01263] XSTest: A Test Suite for Identifying Exaggerated Safety Behaviours in Large Language Models

[10]              [2211.03540] Measuring Progress on Scalable Oversight for Large Language Models

[11]              [2212.09251] Discovering Language Model Behaviors with Model-Written Evaluations

[12]              [2212.09251] Discovering Language Model Behaviors with Model-Written Evaluations

[13]              [2303.05453] Personalisation within bounds: A risk taxonomy and policy framework for the alignment of large language models with personalised feedback

[14]              [2204.05862] Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback

[15]              [2112.09332] WebGPT: Browser-assisted question-answering with human feedback 

[16]              [2209.10193] Is More Data Better? Re-thinking the Importance of Efficiency in Abusive Language Detection with Transformers-Based Active Learning and The Psychological Well-Being of Content Moderators

[17]              [2211.03540] Measuring Progress on Scalable Oversight for Large Language Models

[18]              Assessing the usefulness of a large language model to query and summarize unstructured medical notes in intensive care | SpringerLink

[19]              [2303.05453] Personalisation within bounds: A risk taxonomy and policy framework for the alignment of large language models with personalised feedback

[20]              Belgian man dies by suicide following exchanges with chatbot

[21]              YouTuber trains AI bot on 4chan’s pile o’ bile with entirely predictable results - The Verge

[22]              [2307.02483] Jailbroken: How Does LLM Safety Training Fail?

[23]              [2112.04359] Ethical and social risks of harm from Language Models

[24]              [2210.05791] Sociotechnical Harms of Algorithmic Systems: Scoping a Taxonomy for Harm Reduction

[25]              [2303.05453] Personalisation within bounds: A risk taxonomy and policy framework for the alignment of large language models with personalised feedback

[26]              [2303.18190] Assessing Language Model Deployment with Risk Cards

[27]              [2303.18190] Assessing Language Model Deployment with Risk Cards

[28]              See the distinction made in [2303.05453] Personalisation within bounds: A risk taxonomy and policy framework for the alignment of large language models with personalised feedback

[29]              Inspired by [2303.18190] Assessing Language Model Deployment with Risk Cards

[30]              Pause Giant AI Experiments: An Open Letter - Future of Life Institute

[31]              These are taken from [2303.18190] Assessing Language Model Deployment with Risk Cards and [2210.05791] Sociotechnical Harms of Algorithmic Systems: Scoping a Taxonomy for Harm Reduction

[32]              Artificial intelligence regulation in the United Kingdom: a path to good governance and global leadership? - Internet Policy Review

[33]              The Value Chain of General-Purpose AI - Ada Lovelace Institute

[34]              Meta’s LLaMa 2 license is not Open Source; https://spectrum.ieee.org/open-source-llm-not-open

[35]              The Open Source Definition

[36]              Defining Open Source AI - Open Source Initiative

[37]              Releases — EleutherAI

[38]              Introducing The World's Largest Open Multilingual Language Model: BLOOM

[39]              The Mirage of Open-Source AI: Analyzing Meta’s Llama 2 Release Strategy

[40]              EleutherAI

[41]              bigscience/bloom · Hugging Face

[42]              stabilityai (Stability AI)

[43]              Meta Llama 2

[44]              Apache 2.0 licence, MIT licence, and OpenRAIL licence.

[45]              OpenRAIL licence

[46]              Alpaca: A Strong, Replicable Instruction-Following Model - Stanford University

[47]              Alpaca: A Strong, Replicable Instruction-Following Model - Stanford University

[48]              Deep Learning for Deepfakes Creation and Detection: A Survey

[49]              Generative Language Models and Automated Influence Operations: Emerging Threats and Potential Mitigations

[50]              ChatGPT and large language models: what's the risk? - NCSC

[51]              Are Open AI Models Safe? - The Linux Foundation

[52]              https://openfuture.eu/wp-content/uploads/2023/07/230725supporting_OS_in_the_AIAct.pdf

[53]              Nick Clegg: Openness on AI is the way forward for tech

[54]              Meta released LLaMa 2 with a vague statement about its training data: “Our training corpus includes a new mix of data from publicly available sources, which does not include data from Meta’s products or services.” Furthermore, Meta reported that its partnership with Microsoft had increased the training dataset used for LLaMa 2, compared to LLaMa 1, by 40%, without describing the training dataset.

[55]              Closing the AI accountability gap: defining an end-to-end framework for internal algorithmic auditing 

[56]              Supporting Open Source and Open Science in the EU AI Act; How will the AI Act Deal with Open Source AI Systems | Open Future and The Mirage of Open-Source AI: Analyzing Meta’s Llama 2 Release Strategy and Openness & AI: Fostering Innovation & Accountability in the EU’s AI Act

[57]              Open-source provisions for large models in the AI Act

[58]              Common Regulatory Capacity for AI - The Alan Turing Institute

[59]              Digital Regulation: Joined up and Accountable – House of Lords Communications and Digital Committee

[60]              How we are Funded - ICO

[61]              Large Language Models and Software as a Medical Device - MHRA

[62]              Generative AI: eight questions that developers and users need to ask - ICO

[63]              Auditing large language models: a three-layered approach | AI and Ethics

[64]              European Commission. (2021). Proposal for regulation of the European parliament and of the council—Laying down harmonised rules on artificial intelligence (artificial intelligence act) and amending certain Union legislative acts.

[65]              For more, see Haataja & Bryson (2022). Reflections on the EU AI Act and How we could make it better and Mökander et al. (2021). Conformity Assessments and Post-market Monitoring: A Guide to the Role of Auditing in the Proposed European AI Regulation

[66]              The EU and U.S. diverge on AI regulation: A transatlantic comparison and steps to alignment - Brookings

[67]              Keep your AI claims in check | Federal Trade Commission

[68]              Commerce Department looks to craft AI safety rules

[69]              Copyright Office Launches New Artificial Intelligence Initiative

[70]              As an example, Alvaro M. Bedoya, Commissioner of the FTC, shared “Early Thoughts on Generative AI,” in which he dispelled the myth that AI is unregulated and stressed the power of existing regulations. He claimed that the FTC Act applies to protect consumers, and that civil rights are protected by the Civil Rights Act, the Equal Credit Opportunity Act, and the Fair Housing Act, amongst others.

[71]              Roberts, et al. (2023). “A Comparative Framework for AI Regulatory Policy”.

[72]              Translation: Measures for the Management of Generative Artificial Intelligence Services (Draft for Comment) – April 2023 

[73]              Forum: Analyzing an Expert Proposal for China’s Artificial Intelligence Law

[74]              The Brussels Effect and Artificial Intelligence - Centre for the Governance of AI

[75]              The EU AI Act will have global impact, but a limited Brussels Effect - Brookings

[76]              Letter: Why excluding China from the AI summit would be a mistake | Financial Times