House of Lords Communications and Digital Select Committee inquiry: Large language models
Summary
The potential of LLMs and other advanced AI
Covering Committee Questions
1. How will large language models develop over the next three years?
a. Given the inherent uncertainty of forecasts in this area, what can be done to improve understanding of and confidence in future trajectories?
2. What are the greatest opportunities and risks over the next three years?
a. How should we think about risk in this context?
The UK’s regulatory approach
Covering Committee Questions
3. How adequately does the AI White Paper (alongside other Government policy) deal with large language models? Is a tailored regulatory approach needed?
a. What are the implications of open-source models proliferating?
4. Do the UK’s regulators have sufficient expertise and resources to respond to large language models? If not, what should be done to address this?
Regulatory and non-regulatory approaches to safety and risk mitigation
Covering Committee Questions
5. What are the non-regulatory and regulatory options to address risks and capitalise on opportunities?
a. How would such options work in practice and what are the barriers to implementing them?
b. At what stage of the AI life cycle will interventions be most effective?
c. How can the risk of unintended consequences be addressed?
International cooperation
Covering Committee Questions
6. How does the UK’s approach compare with that of other jurisdictions, notably the EU, US and China?
a. To what extent does wider strategic international competition affect the way large language models should be regulated?
b. What is the likelihood of regulatory divergence? What would be its consequences?
Summary
The recent wave of focus on AI has centred on the rapid progress of AI models, including large language models (LLMs). Research advances from the Google and Google DeepMind teams paved the way for this current wave of cutting-edge models, including the development of the Transformer architecture, which unlocked today's LLMs. Recent popular applications, such as Bard and ChatGPT, are based on this technology and have spurred a step change in public awareness of AI, allowing significant numbers of people to use and interact with advanced systems in a way that had not been possible until now. LLMs and the current generation of AI models are characterised by more ‘general’ capabilities than narrower, non-generative systems such as image recognition models, which are adept at a more limited set of tasks oriented towards a single capability.
In addition to focusing on LLMs specifically, it is important to consider the risks and opportunities of all current and future powerful AI models, and we do so throughout this written evidence. This includes “frontier models”, whose capabilities we expect to far exceed those of the models we see today, along multiple dimensions. In general, while we expect specialised models to become more powerful, we should also prepare for future AI systems to be much more general purpose; to span many modalities; and to be able to engage in complex tasks that involve memory, planning, and the use of external tools (such as tools that access the internet).
There is broad consensus amongst developers of ‘frontier’ AI models that capabilities are improving and new ones are emerging. There is also consensus that the next generations of more powerful frontier AI models could present increasingly significant opportunities across the economy, and give rise to legitimate concerns that they may present novel and serious risks.
Given this development trajectory, we believe it is important for the UK to develop a strong governance framework that can adapt to change. We are supportive of the UK’s approach to AI regulation, and believe a context-based approach encourages and empowers existing regulators to provide quicker responses to fast-moving technology, and avoids unintentionally slowing down innovation or societal responses to risks. However, the governance of frontier AI may require a separate approach at the development stage, regardless of eventual uses. The work being led by the Frontier AI Taskforce and the Frontier Model Forum will be important in researching and defining what evaluations, benchmarks and standards could be implemented. Currently, the White House Commitments provide a good baseline for an international approach to safety, and further research will enable them to be developed further. Cooperation between countries on these commitments will be vital for developing a shared, global approach, and the AI Safety Summit can contribute to driving this.
We are already making progress on these commitments; we have recently launched a beta version of SynthID, a tool for watermarking and identifying AI-generated images. We have also established dedicated AI red teams to explore risks that threat actors may pose to our leading models, and to the products that build on them. We look forward to working closely with regulators, government central functions, civil society, academia, and others in the industry, to explore AI model trajectories further and research appropriate mitigation techniques where necessary.
Summary:
● Large language models (LLMs) and other advanced AI offer huge opportunities and are already bringing benefits to people around the world, such as the scientific breakthroughs made through Google DeepMind’s AlphaFold.
● Google and Google DeepMind are developing our next model, Gemini, and exploring the potential of breakthroughs, such as using Med-PaLM 2 for medical purposes.
● LLMs have known risks today; Google and Google DeepMind are already working on several protocols and technologies for managing them, such as watermarking and other detection techniques to identify AI-generated content (including using the power of AI itself).
● As capabilities increase, novel risks could emerge and effective testing and evaluation will be critical.
● The Government’s proposed horizon scanning function will be essential for policymakers to keep pace with innovation and capabilities and understand potential risks, ensuring that their response keeps people safe but does not stifle innovation. Close cooperation with leading AI labs, civil society and experts will help meet this challenge.
LLMs and other AI models have advanced rapidly over the past few years and are bringing many benefits. This has led to exciting innovations, and capabilities are accelerating. Advanced AI can boost scientific discovery, innovation and economic growth, and enhance public service delivery. Research conducted by Public First for our recent economic impact report found that AI-powered innovation could create over £400 billion in economic value for the UK economy by 2030. The impact on the public sector could also be transformative, freeing over £8 billion in productivity gains, for example by saving over 700,000 hours a year of administrative work for GPs and teachers and allowing them to focus on delivering front-line services.
Google and Google DeepMind have been at the cutting edge of AI innovation, and our models are already benefiting researchers, our customers and service users - from advancing state-of-the-art AI models to applying them to major scientific discoveries and social challenges:
● AlphaFold has contributed to scientific research on a range of topics, from drug discovery to studying long-extinct species. By making 200 million protein structures available at once - where each structure would previously have required around four years of doctorate-level research - AlphaFold has potentially saved researchers the equivalent of around 400 million years of research, worth trillions of dollars. Today over 1.34 million unique users have visited the AlphaFold database, and the paper introducing AlphaFold, Jumper et al. (2021), has received more than 10,000 citations.
● PaLM 2 - the LLM that powers Bard - has improved multilingual, reasoning and coding capabilities and comes in a variety of sizes, which makes it powerful across a range of uses. It powers over 25 new Google products and features. For example, it is allowing us to expand Bard to new languages, as well as helping developers with programming. Google Workspace features are also tapping into the capabilities of PaLM 2, helping users write in Gmail and Google Docs and organise data in Google Sheets.
● Precipitation nowcasting model - working with the UK Met Office, Google DeepMind developed a model to better understand changing weather. This can have a significant impact on how we optimise renewable energy systems that depend on natural resources, and can help offset some of the carbon emissions of training advanced models.
The coming years will likely see larger, more multimodal models with increased reasoning, planning, memory and agency. Advanced AI has the potential to continue having a positive impact on science and productivity in public services such as healthcare, among other areas.
● Our next-generation foundation model, Gemini, has been created from the ground up to be multimodal, highly efficient at tool and API integrations, and built to enable future innovations, like memory and planning. While still early, we’re already seeing impressive multimodal capabilities not seen in our prior models. Once fine-tuned and rigorously tested for safety, Gemini will be available at various sizes and capabilities.
● We are developing and testing Med-PaLM 2, which harnesses the power of Google’s LLMs in the medical domain to answer medical questions more accurately and safely. Industry-tailored LLMs like Med-PaLM 2 are among the generative AI technologies that have the potential to significantly enhance critical services, such as healthcare, in the future.
Alongside opportunities, there are likely to be risks with any new technology. Some of these are already known issues, such as bias, toxic language and misinformation. However, there is also uncertainty about potential future risks. Both known and uncertain risks demand the attention of industry and governments. It has long been the view of Google and Google DeepMind that industry, policymakers, academia and civil society must simultaneously focus on both immediate and future concerns. Indeed, efforts to address the former will inform efforts to understand and address the latter.
Forecasting future progress and anticipating impacts is challenging but vital. There are many approaches we can take - some more established than others - and important roles for industry, government and civil society.
Predicting both future progress and impacts can be challenging. The abilities of models in three years’ time may be very different from those we see today. It is therefore important that we have an innovation ecosystem and governance model that can anticipate change and adapt quickly. This should include a combination of foresight exercises, horizon scanning, and research, testing and evaluation. There are several approaches to take and, for all of them, there must be a close relationship between industry, government and civil society.
Foresight exercises: In 2021, Google DeepMind developed a taxonomy of risks presented by LLMs. And earlier this year, Google published a paper introducing a taxonomy for harm reduction. The purpose of these efforts was to look ahead and suggest frameworks that could help us collectively understand the risks and harms in the landscape. Risks include discrimination and representational harms; exclusion and toxicity; information hazards; misinformation harms; malicious uses; human-computer interaction harms; automation, access and environmental harms; allocative harms; quality of service harms; interpersonal harms; and social system harms.
This approach has allowed Google and Google DeepMind to develop several concrete mitigations. For example:
● Identifying and tracing provenance to tackle misinformation and disinformation: Being able to identify AI-generated content is critical to promoting trust in information. While not a silver bullet for addressing the problem, we have recently launched, in partnership with Google Cloud, a beta version of SynthID, a tool for watermarking and identifying AI-generated images. SynthID is an early and promising technical solution to this pressing AI safety issue. The technology embeds a digital watermark directly into the pixels of an image, making it imperceptible to the human eye but detectable for identification. SynthID is being released, at first, to a limited number of Google Cloud customers using Imagen, one of our latest text-to-image models.
We are ensuring that all of our AI-generated images have a markup in the original image file, which can help give users context if they come across the image outside of our platforms. Creators and publishers will be able to add a similar markup to their own images, so that a label in Image Search will indicate that the images are AI-generated. In addition, through our “About this Image” tool in Google Search, users will be able to see important context about an image; later this year, users will also be able to access the tool on websites in Chrome, or when they search with an image or screenshot using Google Lens.
● Detecting synthetic audio to prevent fraudulent and malicious use by bad actors: Over the last few years, there’s been an explosion of new research using neural networks to simulate a human voice. These models can generate increasingly realistic, human-like speech. While the progress is exciting, we’re keenly aware of the risks this technology can pose if used with the intent to cause harm. Malicious actors may synthesise speech to try to fool voice authentication systems, or they may create forged audio recordings to defame public figures. Perhaps equally concerning, public awareness of "deep fakes" (audio or video clips generated by deep learning models) can be exploited to manipulate trust in media: as it becomes harder to distinguish real from tampered content, bad actors can more credibly claim that authentic data is fake.
We have released a dataset of synthetic speech in support of an international challenge to develop high-performance synthetic audio detectors, which was downloaded by more than 150 research and industry organisations. We’re also making progress on tools to detect synthetic audio: in our AudioLM work, we trained a classifier that can detect audio generated by our own AudioLM model with nearly 99% accuracy.
● Identifying and filtering toxicity in AI models: AI models can create unfair discrimination and representational and material harm by perpetuating stereotypes and social biases, i.e. harmful associations of specific traits with social identities. Social norms and categories can exclude or marginalise those who exist outside them. Where a model perpetuates such norms, narrow category use can deny or burden those whose identities differ. Toxic language can incite hate or violence or cause offence. Finally, an LLM that performs more poorly for some social groups than others can create harm for disadvantaged groups, for example where such models underpin technologies that affect these groups.
Google’s Perspective API is being used to filter out toxicity in LLMs. Academic researchers have used the Perspective API to create an industry evaluation standard, which all significant LLM developers now use to evaluate the toxicity generated by their own models. This has given us an understanding of what is possible, and has allowed us to develop thoughtful mitigation strategies to help identify content created by generative AI.
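The sketch below is a minimal, illustrative example (in Python) of how this kind of toxicity filtering can be applied to a model’s candidate output before it is shown to a user. The endpoint, request fields and response path follow the publicly documented Perspective API; the threshold value, function names and surrounding logic are purely illustrative, and a production system would need fuller error handling and policy logic.

# Illustrative sketch: screening LLM output with the Perspective API before display.
# Assumes an API key in the PERSPECTIVE_API_KEY environment variable; the endpoint,
# request fields and response path follow the public Perspective API documentation,
# and the threshold below is purely illustrative.
import os
import requests

PERSPECTIVE_URL = "https://commentanalyzer.googleapis.com/v1alpha1/comments:analyze"

def toxicity_score(text: str) -> float:
    """Return the TOXICITY summary score (0.0 to 1.0) for a piece of text."""
    response = requests.post(
        PERSPECTIVE_URL,
        params={"key": os.environ["PERSPECTIVE_API_KEY"]},
        json={
            "comment": {"text": text},
            "requestedAttributes": {"TOXICITY": {}},
        },
        timeout=10,
    )
    response.raise_for_status()
    return response.json()["attributeScores"]["TOXICITY"]["summaryScore"]["value"]

def filter_generation(candidate: str, threshold: float = 0.8):
    """Return the candidate generation only if its toxicity score is below the threshold."""
    return candidate if toxicity_score(candidate) < threshold else None

In practice, the appropriate threshold and the action taken when content is flagged (suppression, regeneration or human review) depend on the application and its users.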
Horizon scanning: The UK Government’s AI Regulation White Paper proposed establishing a horizon scanning function, which we believe will be important for understanding model trajectories. The function should be underpinned by a deep understanding of trends in technology, industry and investment, as well as a specialist, multidisciplinary research team drawing on external input.
The new function will require close partnerships with experts in industry, academia and civil society, and its initial focus should be on forecasting future trends and developments in both the technology and its impact. It will be able to assist in steering frontier AI safety research and to monitor effectively the evolution of models (for instance, compute use and capability progress), as well as the rate of diffusion and the domains in which AI models are applied.
Research, testing and evaluation: It is also important to have measures in place to test and evaluate models and identify any new risks that may emerge. The Frontier AI Taskforce should work closely on supporting the research and development of advanced models and on applying their benefits to public services and the economy. As model capabilities progress, undertaking model evaluations, reviewing alignment between model and human values, and defining industry best practices and standards will all help flag issues early.
Google is a founding member of the Frontier Model Forum (FMF), currently bringing together OpenAI, Microsoft and Anthropic, though more companies are likely to join. This new body is focused on ensuring safe and responsible development of frontier AI models and has four core objectives:
1. Advancing AI safety research to promote responsible development of frontier models, minimise risks, and enable independent, standardised evaluations of capabilities and safety.
2. Identifying best practices for the responsible development and deployment of frontier models, helping the public understand the nature, capabilities, limitations, and impact of the technology.
3. Collaborating with policymakers, academics, civil society and companies to share knowledge about trust and safety risks.
4. Supporting efforts to develop applications that can help meet society’s greatest challenges, such as climate change mitigation and adaptation, early cancer detection and prevention, and combating cyber threats.
Summary:
● We believe that the UK’s regulatory approach to AI strikes the right balance between managing risk and enabling innovation; the context-based approach has several advantages, including responding rapidly to technological innovations and using the strengths of existing regulators.
● Several regulators are already beginning to undertake work. While this is positive, it is vital that all regulators are supported with the right resources and skills.
● Regulators will be supported by the central functions, but this support should include guidance in the form of a ‘Risk Management Framework’. This would provide a common framework for industry, regulators, academia and civil society for managing and mitigating risk.
● To further support the implementation of the framework, we would suggest accelerating a gap analysis of existing laws.
● The Government should convene multistakeholder conversations about what standards may be appropriate for open-sourcing powerful AI models, in a way that effectively mitigates risks but doesn’t unduly stifle innovation.
● We support the Government’s decision to establish the Frontier AI Taskforce to look at the capabilities of advanced models, and develop evaluation methods.
● This work, alongside industry efforts as part of the FMF, is essential before decisions are made on codifying safety measures for these models - more research is needed.
The AI White Paper
We believe the UK’s overarching approach to regulating AI, as set out in the Government’s AI regulation White Paper, provides an adaptive framework for managing risk while promoting innovation. By empowering existing regulators to uphold cross-cutting principles, the framework provides tailored, coherent and effective regulation.
We support the UK’s ‘hub-and-spoke’ approach to regulation - with existing regulators developing rules on the use of AI in their domains, and a central AI governance function responsible for guidance, coordination, and horizon scanning across the whole system. The UK’s White Paper is a good blueprint for this kind of framework.
This context-based approach, led by individual regulators, has several other advantages in regulating the AI we have today:
● It encourages and empowers existing regulators to provide quicker responses to fast-moving technology, and avoids unintentionally slowing down innovation or societal responses to risks;
● It will help to ensure regulation is tailored to the many different contexts where AI is used - and informed by deep domain expertise of the regulator responsible;
● It provides a means for regulators to engage with the communities of practice in their sectors - whether this is healthcare professionals or teachers - as well as wider civil society groups representing those impacted by the sector-specific use of AI;
● It will ensure regulation is tailored to the dynamics and complexity within AI supply chains (for example, the difference in AI systems used in specific applications, or the number, distribution and diversity of actors within AI supply chains);
● It will create a system proportionate to the risk of harm and sensitive to the context of deployment, recognising there will be an increasing number of use cases in the future where risks will be associated with not using AI; and
● It will be built on the strengths of existing regulatory approaches and progress made to date, where they have already begun work on sector-specific AI regulation.
We are already seeing regulators putting this into practice, setting out plans and expectations for how AI will be regulated in their domains: the Digital Regulation Cooperation Forum (DRCF) recently noted that DRCF regulators are already empowered to address areas such as generative AI, and that the same is true for many other regulators outside the DRCF. The Information Commissioner’s Office (ICO), Ofcom, the Competition and Markets Authority (CMA), the Financial Conduct Authority (FCA), the Equality and Human Rights Commission (EHRC) and, notably, the Medicines and Healthcare products Regulatory Agency (MHRA) are setting out their plans and expectations for how AI will be treated. For example, the CMA is conducting a review into AI foundation models. We welcome the CMA’s open approach and willingness to engage with and learn from the industry as part of its review. Additionally, the MHRA has outlined how existing laws that govern medical devices, through the Medical Devices Regulations 2002, can regulate LLMs developed for medical purposes. The MHRA’s approach could serve as a basis for best practice which can support the development of similarly clear rules in other domains - this could be disseminated both through the DRCF and with other medical device regulators globally.
We would also like to see the UK develop a detailed Risk Management Framework (RMF) to provide regulators - as well as those developing and deploying AI systems - with a common framework for how AI governance processes should identify, manage, and mitigate risks appropriately. The White Paper identified the importance of coherent application of the principles, and the central functions will go some way to support this, particularly as it promises the development of a risk assessment framework. However, we remain of the view that a centrally published RMF - which can be updated and iterated over time - would add value to both industry and regulators, as well as the research community and civil society.
The approach taken by the US National Institute of Standards and Technology (NIST) to developing a voluntary AI Risk Management Framework is a good place to start. A UK RMF should replicate the inclusive and iterative approach NIST has taken, while ensuring it reflects the specific values of the UK. It could also be developed to include guidance on foundation models (including LLMs), which is currently missing from NIST’s RMF.
To further support regulators to implement the framework, we welcome the Government’s plan to identify gaps in existing laws where unique risks from AI are not sufficiently captured. A gap analysis should be informed by dialogue with developers, deployers, end users, and those affected by the use of AI - especially less prominent voices in current AI discussions. As emphasised by the DRCF, regulators will play a pivotal role in helping the Government identify gaps within their areas. Where significant gaps exist - i.e. risks introduced specifically by AI and not covered by existing substantive laws and regulations - Government should consult on proposals to close them.
The UK approach to Frontier AI
The sector-specific and application-focused governance and regulatory approaches are most appropriate right now; for the next generation of frontier models, some carefully scoped development-stage controls may be reasonable regardless of the use. We have committed to such controls, along with other leading companies, as an important first step. Consideration of these sorts of controls should be seen alongside the context-based framework, which would still apply to powerful models when they are deployed in domain-specific contexts. In the future it may be appropriate to codify these measures into law; however, much more work is needed to develop the standards, evaluations and other measures which would be the basis of any statute.
We think first-order priorities for industry, policymakers and academia to address frontier AI governance should include:
● Accelerating industry and academic research on frontier AI safety risks and promising technical mitigation approaches such as mechanistic interpretability, adversarial robustness and anomaly detection.
● Developing robust evaluations and benchmarks that, for example, test for dangerous capabilities and extreme risks (in addition to harms that have already been identified in existing systems).
● Establishing clear fora for aligning on priority risks, agreeing best practices and industry standards for safe frontier AI development and deployment - with opportunities for broad stakeholder input and challenge.
● Facilitating structured and secure frontier model access protocols and mechanisms to enable external safety researchers and other trusted third parties to red-team models, and conduct evaluations to assess how capable and controllable systems are.
● Policymakers convening industry experts, academia and civil society to explore recommendations for managing frontier AI risks through national and international approaches.
The Frontier AI Taskforce is making progress on developing frameworks for evaluations and guidance for responsible practices. This Taskforce work, alongside industry efforts such as those of the Frontier Model Forum, is an essential precursor to any moves from the Government to put controls in place; more research is needed.
As we consider additional safeguards for powerful AI models, many open questions remain. As part of a comprehensive approach to ensuring safety and security, we believe we need a multistakeholder discussion on modalities of access to model capabilities, including via open source models.
We are committed to open source technologies and have a long track record of supporting the open source community. An open ecosystem enables creativity and innovation in the AI space, while encouraging transparency and responsible practices. For example, Google originally developed TensorFlow in-house and later released it as an open source software library for building AI. Google has also contributed to openly available models (e.g. BERT, T5) which have led to innovations by the AI community (e.g. RoBERTa).
However, open access to the model weights of advanced AI systems raises safety challenges. The power of such systems could, for example, be misused by bad actors to spread large-scale misinformation and discrimination, manipulate media, and carry out increasingly sophisticated cyber-attacks. Once a model is openly available, it is possible to circumvent any safeguards, and the proliferation of capabilities is irreversible. To mitigate this risk, organisations should assess the potential benefits and risks of each model. A multistakeholder discussion is needed to explore standards around when models should be openly available, and when they should not.
Summary:
● Governing AI is a technical as well as a regulatory challenge, so it requires close partnership between researchers, policymakers and regulators.
● Google and Google DeepMind have long been committed to developing AI, including large language models, responsibly.
● Industry, government and civil society should be working together on safety research, standards, assurance and developing technical tools.
Governing AI is a technical and regulatory challenge, and much progress in making AI fair, beneficial and safe will come from research breakthroughs and new techniques. Non-regulatory measures can enable a collaborative approach to governing the risks from AI. They also provide a faster avenue than legislation to address risks of powerful AI, which is particularly important when technological advances are moving at pace. As non-regulatory approaches and mitigations mature, it may of course be appropriate in some instances that they are backed by legislation.
Google and Google DeepMind are committed to developing AI responsibly. Google was one of the first companies to publish a set of AI principles, and we use an AI risk assessment framework to identify and mitigate risks. These principles set out our commitment to prioritising widespread benefit from our research, as well as the areas of research and applications we choose not to pursue.
There are several categories of non-regulatory measures we believe should be prioritised to support and bolster the effective governance of LLMs and other advanced AI systems:
1) Safety & ethics research. Companies and governments must continue investing in research advancing social and technical AI safety and ethics solutions, as well as monitoring and understanding their impact. There are positive initiatives already under way, such as the Frontier Model Forum, funding dedicated to secure and trustworthy AI research, £100m for the Frontier AI Taskforce, and the Responsible AI UK impact accelerator.
2) Standards. Industry-wide agreement on technical and ethical standards will help codify consensus on commonly agreed principles like transparency, accountability, and fairness when developing and releasing advanced AI models, including LLMs. These could be developed through international bodies like the IEEE and ISO, national ones such as the BSI, and other multilateral and public-private forums such as the OECD, MLCommons or the Global Partnership on AI. Standards could focus on terminology, process (e.g. for risk management and impact assessments), and performance (e.g. for models’ capabilities). These will not only prove useful for developers, but also for the auditing and procurement sector, by creating best practice and improving interoperability. Initiatives such as the Alan Turing Institute-led AI Standards Hub will help establish a common language.
3) Assurance. Independent third-party auditing and certification mechanisms can provide confidence that AI systems meet standards and principles. We support third-party testing, evaluation and validation of real-world performance. We suggest considering the evaluation of different aspects of models (for example, looking specifically at the training datasets, measuring them against key metrics around bias and fairness) in a holistic manner across the LLM’s full life cycle. This should include testing methodologies, transparency, human oversight, post-deployment monitoring, and risk assessment processes.
The Centre for Data Ethics and Innovation’s (CDEI) Portfolio of AI Assurance Techniques showcases how industry and academia are already using assurance methods across a variety of sectors and applications. These contribute to the UK’s AI Assurance Roadmap, which envisages a business and legal services ecosystem to provide audits and assurance on AI models. This is also important in public services, particularly in areas such as policing, defence and welfare services provided by local authorities.
4) Responsible AI practices. The growing repository of responsible AI toolkits can help mitigate both known and unknown harms of advanced AI. These include technical tools such as watermarking, accountability and transparency disclosures (such as model cards), testing and evaluation methods, structured model access, and other methods embedded within software and systems:
● Technical tools. Watermarking (such as SynthID) and bias and/or toxicity checkers (as with Google’s Perspective API) can enhance transparency and accountability across AI systems.
● Accountability and transparency disclosures. Model cards and system cards are essentially ‘instructions’ for AI models and systems (and might include information on, for example, the purpose of the model, the data the model is trained on, its performance, and potential risks and limitations); an illustrative sketch of such a card follows this list. These information disclosures can support transparency and accountability throughout the AI life cycle, and provide both industry and regulators with clarity on compliance measures. The scope of any disclosure and accompanying documentation should be calibrated to the context and to what different audiences require, which will vary between developers, end-users and integrators.
● Testing and evaluation methods. Red teaming, which consists of simulating threats targeting AI deployments, can also help to operationalise safety in AI models. At Google, red teaming activities are part of our Secure AI Framework (SAIF) and consist of: assessing the impact of simulated attacks on users and products, and identifying ways to increase resilience against these attacks; analysing the resilience of new AI detection and prevention capabilities built into core systems, and probing how an attacker might bypass them; leveraging red team results to improve detection capabilities so that attacks are noticed early and incident response teams can respond appropriately; and raising awareness among relevant stakeholders to help developers who use AI in their products understand key risks and advocate for risk-driven investments in security controls as needed.
● Structured model access, with the right protocols and mechanisms, can enable external safety researchers and other trusted third-parties to red-team models, and conduct evaluations to assess how capable and controllable systems are.
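To make the disclosure concept above more concrete, the sketch below (in Python) shows one possible minimal structure for a machine-readable model card. The field names and example values are hypothetical illustrations rather than a prescribed schema; published model and system cards are typically richer and tailored to their audiences.

# Illustrative sketch of a minimal, machine-readable model card.
# Field names and values are hypothetical; real model or system cards are richer
# and calibrated to the needs of developers, integrators, end-users and regulators.
from dataclasses import dataclass, field, asdict
import json

@dataclass
class ModelCard:
    model_name: str
    version: str
    intended_uses: list
    out_of_scope_uses: list
    training_data_summary: str
    evaluation_results: dict          # e.g. benchmark name -> score
    known_limitations: list = field(default_factory=list)
    safety_mitigations: list = field(default_factory=list)

card = ModelCard(
    model_name="example-llm",                       # hypothetical model
    version="1.0",
    intended_uses=["drafting text", "summarisation"],
    out_of_scope_uses=["medical or legal advice without expert review"],
    training_data_summary="Publicly available web text and licensed corpora (illustrative).",
    evaluation_results={"toxicity_rate": 0.02, "factuality_benchmark": 0.71},
    known_limitations=["may produce plausible but incorrect statements"],
    safety_mitigations=["output filtering", "red-team testing before release"],
)

print(json.dumps(asdict(card), indent=2))  # disclosure that could be published alongside the model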
In developing these practices, it is important that the voices of stakeholders from across civil society are heard. For example, Google DeepMind worked with the Aspen Institute to convene experts from across academia, civil society and industry in a series of roundtables which explored the concept of Equitable AI. The report explores how considerations of fairness and equitable outcomes from AI can be addressed at different stages in the AI life cycle, and how they can inform governance methods.
Non-regulatory approaches can also be deployed effectively to address unknown and extreme risks. For example, researchers from Google DeepMind - along with collaborators from OpenAI, Anthropic, the Centre for the Governance of AI and others - have proposed a framework[1] for how model evaluations for extreme risks should feed into important decisions around training and deploying a highly capable, general-purpose model. The developer conducts evaluations throughout, and grants structured model access to external safety researchers and model auditors[2] so they can conduct additional evaluations.[3] The evaluation results can then inform risk assessments before model training and deployment.
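As a simplified illustration of how such evaluation results might gate decisions across the life cycle, the sketch below (in Python) maps a battery of evaluation scores to a coarse training and deployment decision. The evaluation names, thresholds and decision rule are hypothetical and are not the framework proposed in the cited paper; they simply show the shape of an evaluation-gated process.

# Simplified sketch of evaluation-gated training and deployment decisions for a
# powerful general-purpose model. Evaluation names, thresholds and the decision
# rule are hypothetical illustrations, not the cited framework itself.
from typing import Callable, Dict

# Each evaluation takes a model and returns a risk score in [0, 1]; higher is more concerning.
Evaluation = Callable[[object], float]

def run_extreme_risk_evals(model, evals: Dict[str, Evaluation]) -> Dict[str, float]:
    """Run a battery of dangerous-capability and alignment evaluations on a model checkpoint."""
    return {name: evaluate(model) for name, evaluate in evals.items()}

def lifecycle_decision(scores: Dict[str, float],
                       block_threshold: float = 0.5,
                       review_threshold: float = 0.2) -> str:
    """Map evaluation results to a coarse decision for the next stage of the life cycle."""
    worst = max(scores.values())
    if worst >= block_threshold:
        return "halt: escalate to internal governance and external auditors"
    if worst >= review_threshold:
        return "proceed with mitigations: restricted access, extra monitoring, re-evaluation"
    return "proceed: staged deployment with post-deployment monitoring"

# Example usage with stub evaluations; in practice these would be supplied by internal
# teams and by external researchers given structured model access.
stub_evals = {
    "cyber_offence_capability": lambda m: 0.10,
    "autonomous_replication": lambda m: 0.05,
    "persuasion_and_manipulation": lambda m: 0.30,
}
scores = run_extreme_risk_evals(model=None, evals=stub_evals)
print(scores, "->", lifecycle_decision(scores))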
Important early work on model evaluations for extreme risks is already underway at Google and Google DeepMind and elsewhere. But more progress – both technical and institutional – is needed to build an evaluation process that catches all critical risks and helps safeguard against future, emerging challenges. We look forward to continuing discussions with the UK Government and the Frontier AI Taskforce on how this can be taken forward.
Summary:
● International cooperation on AI safety and governance is crucial.
● The White House Commitments provide an excellent basis for an international framework for AI governance.
● We welcome the UK Government’s leadership and efforts to create an opportunity to achieve this via the AI Safety Summit.
The AI White Paper highlighted the ‘complex and cross-border nature of AI supply chains’. The overarching approach to AI governance and safety needs to be international in order to be effective. Achieving international consensus on potential risks is an important precursor to establishing shared standards, regulation and best practices, minimising regulatory divergence and enabling trust in the development and deployment of models across borders.
Leading companies have signed up to the White House’s voluntary industry commitments on AI, which are a milestone in bringing companies together to make sure we develop AI responsibly and in ways that benefit everyone. In just a few months, the commitments have been a catalyst for agreeing a common framework for responsible AI development that is safe, secure, and trustworthy.
The leading companies are committing to: ensuring AI products are safe before introducing them to the public (with internal and external security testing of the AI systems prior to their release, and sharing information with governments, civil society and academia); building secure-first systems, whose model weights are protected (by investing in cybersecurity, threat safeguards and third-party reporting of vulnerabilities); and earning public trust by continuing to develop technical tools such as watermarking, issuing transparency reports (disclosing models’ capabilities, limitations and clearly outlining models’ appropriate use), prioritising research on known risks such as bias and discrimination, and by deploying AI to address society’s greatest challenges. We believe that these commitments are a strong foundation for international collaboration.
As home to many of the frontier AI labs, the UK and US should work closely together and harmonise approaches where possible. For example, the UK could adopt a Risk Management Framework that aligns with the US NIST approach. There is a clear opportunity for the UK to drive international cooperation on these issues, and the Atlantic Declaration’s commitment to accelerating cooperation on AI and to establishing a US-UK Strategic Technologies Council is important.
A cohesive approach is crucial, and while the UK seeks to drive consensus at the forthcoming Summit, it will be important to keep working closely with other governments and international institutions such as the G7, OECD, UN, and GPAI. The UK should ensure that the work leading up to the Summit is closely coordinated with the G7’s Hiroshima Process and emerging discussions in the EU-US TTC.
Building international coordination and collaboration will also require access to a broad set of skills, diversity of background, and new forms of collaboration – including scientific expertise, socio-technical knowledge, and multinational public-private partnerships. This will be particularly important when researching and assessing future trajectories, AI safety, and evaluation.
September 2023
[1] Toby Shevlane et al. Model evaluation for extreme risks (24 May 2023)
[2] Jakob Mokander et al. Auditing large language models: a three-layered approach (16 February 2023, revised 27 June 2023)
[3] Inioluwa Deborah Raji et al. Outsider Oversight: Designing a Third Party Audit Ecosystem for AI Governance (9 June 2022)