
 

Public Administration and Constitutional Affairs Committee 

Oral evidence: Transforming the UK’s Evidence Base, HC 197

Tuesday 5 December 2023

Ordered by the House of Commons to be published on 5 December 2023.


Members present: Mr David Jones (Chair); Ronnie Cowan; Jo Gideon; John McDonnell; Damien Moore; Tom Randall; Lloyd Russell-Moyle; John Stevenson

Questions 101 - 150

Witnesses

I: John Edwards, Information Commissioner.

II: Reema Patel, Head of Deliberative Engagement, Ipsos UK; and Gavin Freeguard, Policy associate, Connected by Data.

Written evidence from witnesses:

Information Commissioner

 

Examination of witness

Witness: John Edwards.

Q101       Chair: Good morning and welcome to this meeting of the Public Administration and Constitutional Affairs Committee. Today, the Committee will be taking evidence in the third oral session of our inquiry on transforming the UK’s evidence base. The inquiry is looking at how officials produce statistics and analysis, how demands for data are changing in the modern data-driven world and whether the privacy of citizens is being adequately protected as new and innovative sources of data become available to decision-makers.

Our witnesses this morning are spread across two panels. Our first panel today is Mr John Edwards, who is the Information Commissioner. Good morning, Mr Edwards, could you introduce yourself?

John Edwards: Good morning. I am hearing a bit of feedback.

Chair: There is a bit of feedback. We will fix that.

John Edwards: Is that distracting? That is remedied? I will start again. Kia ora is a traditional New Zealand welcome. Good morning, my name is John Edwards. I am the Information Commissioner for the United Kingdom, which means I have jurisdiction over eight or nine—I am not quite sure—statutes and regulations involving data, including freedom of information, in the environmental context as well, and a variety of data protection rules.

I am very grateful for the opportunity to inform your inquiry on the very important matter of the use of statistics as an evidence base.

Q102       Chair: Thank you, Mr Edwards. I will start by asking if you could describe the current framework for protecting the data and privacy of citizens?

John Edwards: Yes. The principal legislation that my office administers is the UK General Data Protection Regulation, which is an adaptation of the General Data Protection Regulation in the EU. It is adopted into UK law by means of the Data Protection Act 2018. The legislation is technology neutral, and principles based. That is very important for you to bear in mind as you continue your inquiry.

Another key principle is that it applies to personally identifiable information. If a dataset, for example, does not contain personal information or is not identifiable as such, in that all identifiers have been stripped out and the dataset is rendered anonymous, the legislation does not apply to it. If a dataset is rendered pseudonymous, that is, the data is masked but can be reconnected with the underlying identifiable data, it is personally identifiable information and remains subject to the UK GDPR.
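
To make that distinction concrete, here is a minimal Python sketch; the record layout, field names and values are invented for illustration:

    # Pseudonymisation versus anonymisation, as described above.
    records = [
        {"name": "Alice Example", "postcode": "SW1A 1AA", "income": 41000},
        {"name": "Bob Example",   "postcode": "EC1Y 8SY", "income": 38500},
    ]

    # Pseudonymisation: direct identifiers are replaced by tokens, but a
    # key table is retained, so the data can be re-linked and the UK GDPR
    # still applies.
    key_table = {}
    pseudonymised = []
    for i, rec in enumerate(records):
        token = f"subject-{i:04d}"
        key_table[token] = rec["name"]  # the retained re-identification key
        pseudonymised.append({"id": token, "income": rec["income"]})

    # Anonymisation: identifiers are stripped and no key is retained; if
    # re-identification is not reasonably possible, the legislation does
    # not apply to the resulting dataset.
    anonymised = [{"income": rec["income"]} for rec in records]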

The principles of the law require adherence to concepts such as transparency, so that data subjects, that is, the individuals who make up the datasets, are informed about the purposes for which information is to be used and the like, and so that in certain circumstances consent can be required for secondary uses of data. There are obligations on the regulated parties, which are described in the law as data controllers, for fairness and accuracy of the data. As you can see when I use concepts like fairness and accuracy, there is a general expectation that personally identifiable data is kept in a secure way. This is a law that is set at a level of generality that enables it to cover an infinite variety of data transactions that occur throughout the economy—public sector, private sector and voluntary sector—in relation to mundane and highly sensitive information. With that general scene setting, I think it might perhaps be more productive for me to open myself to your questions and inquiry.

Q103       Chair: That is my next question. Thank you for that; it was very comprehensive. Could you explain the ICO’s role in regulating compliance under that framework?

John Edwards: Yes. The ICO is the principal regulator. We are an independent data protection authority. We are obliged to respond to complaints. We are able to undertake audits. We can begin investigations or inquiries of our own motion. We receive something like 32,000 complaints a year through our enquiries line. Those are dealt with at a fairly basic level.

There are some hundreds of matters that we formally investigate. There are dozens of audits of different scales. That is the enforcement end, the demand-driven activity when people draw matters of concern to our attention and ask us to look into them. We call that our downstream regulatory activity, but equally important is our upstream activity. To summarise my regulatory philosophy: it is about meeting the need for ease of compliance.

The single most important determinant of compliance in any regulatory regime is ease of compliance. It is incumbent on the ICO to ensure that people know what the rules are, which can involve a degree of complexity when we are dealing with principles-based, very general legislation that has to be applied in quite specific technical circumstances. We produce guidance. In this area we have produced guidance in relation to privacy enhancing technologies that might enable the interrogation or use of large datasets for research purposes while preserving privacy. For example, we have produced guidance on anonymisation and pseudonymisation. Those are two upstream elements.

Another is that we have an obligation, under Article 36(4) of the UK GDPR, to provide input when we are consulted by Government agencies or other organisations on significant and innovative uses of data. In that capacity, we comment on legislative proposals and on proposals to make a novel use of technology to undertake a processing activity, and we work with organisations such as the Office for National Statistics on its integrated dataset, providing advice and input into how those organisations can achieve their objectives of providing trusted data in ways that ensure privacy and data protection values are respected and observed.

Q104       Chair: You told us in your written evidence that the ICO is increasingly working with organisations to improve their compliance before undertaking formal intervention. Could you tell us more about that approach in the context of statistics and research specifically?

John Edwards: Of course. Thank you for the invitation to do so because I am quite proud of that work. In my view, if we facilitate the responsible use of data and statistics and assist statistics holders to meet their objectives, we can get ahead of potential harms and we can avoid things that I have seen in other jurisdictions. I would be happy to elaborate on that if members would find it interesting.

We can provide our technical input and provide advice to ensure that the very important public policy objectives of the use of statistics are met without putting in peril personal data and also the trust relationships that the people of the UK are entitled to maintain with the Government agencies with whom they interact.

There are a couple of other ways in which we do that. One is that we run what is called a sandbox, which means that if organisations wish to undertake a novel exercise with data or combine datasets and work with multiple agencies, and are not sure how data protection law applies, they can apply to join our sandbox. We workshop through what their objectives are, how they are proposing to go about meeting those, what risks they might encounter in doing so and how they might mitigate those risks. They come out the end of the sandbox exercise well prepared to responsibly use the data with which they have been entrusted.

At the next tier down, we offer a service called the innovation advice service, in which we guarantee that if you bring us a query about how the law applies to a particular scenario that you are wanting to implement, we will provide a response to you, and an answer on how we see it, within, I think, 10 working days.

Again, my view is that it is a principal role of a regulator—particularly with principles-based legislation of such general application as mine—to provide as much certainty as to the law as can be done without compromising our ability to come in after the fact and objectively assess the conduct of a data controller against the rules that we expect them to have applied.

Q105       Chair: How long have you been adopting this approach?

John Edwards: For some years prior to taking up the position with the ICO. I was very pleased when I came to the ICO with this philosophy from New Zealand to find that I was joining a stream that was already underway. I was pushing at an open door, in effect. It is not the case with all regulators, I have to say, but I was very pleased that the current was well suited to my approach and the organisation has come in behind that facilitative approach.

It does need to be met with an equal focus on identifying non-compliant conduct and applying the sanctions regime provided in the law to that conduct, particularly where that is undertaken recklessly without regard for individual rights or dishonestly and criminally in order to gain a competitive advantage in the commercial sphere.

Q106       Chair: You also told us in your written evidence that protecting information from harm and unlocking its potential should not be seen as conflicting aims. Could you expand on that?

John Edwards: I rail against false dichotomies, which we sometimes see presented to us as part of a debate on an emerging issue. We might see a proposal to keep the community safe by increasing surveillance cameras or some deployment of a technology and we are presented with a false choice: do you want privacy, or do you want security? My response is: we want both. We do not have to trade off. We want security with privacy.

In this domain, I would say we are entitled, we are able, to have innovation. We are able to release the value of data while still protecting the individual privacy of the people who comprise that data. We do not need to trade one off against the other. We can—and the people of the United Kingdom are entitled to—experience both.

Q107       Ronnie Cowan: Just on that last point you made there about privacy and security. Do you work with the Home Office?

John Edwards: Yes.

Q108       Ronnie Cowan: Do you advise it on the difference between privacy and security?

John Edwards: Yes.

Q109       Ronnie Cowan: If it is going to be investigating people, as it does do, to make sure they are not a threat to our society, at what point do you say you can no longer delve into that person’s information?

John Edwards: Let me first background my response by saying that the Home Office and my office have different roles in society and it is very important that we mutually respect those different roles. There will be points of difference and there will be times at which the Home Office is entitled to say, “Well, we have heard you, Commissioner, and we believe that this response remains proportional and is something that is necessary.” There are accountability mechanisms through to Parliament to enable that.

Second, at a policy level, we will provide input about what is a proportional response. We will test assumptions; we will test evidence that organisations like the Home Office provide. One of the key aspects of the General Data Protection Regulation is an accountability principle. Therefore, we will say, “Okay, we understand the assertion, but show us the workings.” The people of the UK are entitled to see, for example, the Data Protection Impact Assessment that backs up an assertion that the initiative proposed is proportionate and that the risks have been identified and mitigated. We engage in this—

Q110       Ronnie Cowan: At the end of the day, if the Home Office say, “We are doing our own thing”, you cannot stop it?

John Edwards: I can, in fact. One more caveat before I directly answer that question. There are exemptions to the data protection principles to enable law enforcement activities. You would not expect the same principles of transparency and the like to apply in covert intelligence operations, for example, as apply in the direct marketing industry. However, I have powers, for example, to issue a stop processing notice to warn an organisation if it appears that it intends to undertake an activity that is contrary to the law.

I am prepared to deploy those against whichever organisation I have been unable to influence by other means. When I do so, it will only come after I have sat with the most senior administrative officers in that organisation and made certain that we understand our respective positions. If it became necessary to issue a notice to the Home Office to say, “We have been through all of this. I believe I have no option but to issue an enforcement notice requiring that you cease this processing”, that would be an enforceable instrument.

Q111       Ronnie Cowan: We are living in a changing environment and over the course of our inquiry we have heard that analysts across Government are accessing new sources of data and increasingly linking datasets. Is this making your job of assessing compliance against data protection legislation more challenging?

John Edwards: No, I don’t believe so. We have to be agile and cognisant of the new technologies that are available. I think we all want Government to be able to act more efficiently and to direct their resources more effectively. Very often, new technologies present opportunities to do that. We stand ready to assess how those new technologies are being deployed, but we need to move quite quickly and be prepared to recognise their potential.

Artificial intelligence represents a very significant leap forward in public authorities’ ability to process large amounts of data. It is an area where there are also data protection risks. We have invested quite heavily in providing guidance and our opinions, and in downstream supervision, by going into organisations and assessing and auditing their deployments of this new technology. It is just business as usual for us, Mr Cowan. It is a new technology, but we have to pivot and be there ready for it.

Q112       Ronnie Cowan: My concern is about the individual’s privacy. Given the amount of data that is posted by individuals online, how can public sector bodies ever be truly confident that an anonymised analysis they publish will not in the future be pieced together to identify an individual?

John Edwards: You have really hit on a frontier debate, which has occupied the minds of people in positions like mine around the world, and which the United Nations Special Rapporteur on the Right to Privacy has opined on. There is one stream of thought in this area that it is impossible to effectively anonymise a dataset for all time, for the very reasons that you describe.

There are technologies now that, by the accretion of more and more datasets, increase the likelihood that you will be able to pinpoint and extract one individual. That is not even taking into account the potential that quantum computing brings to that, so these are very real and very significant challenges.
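
A toy Python sketch of that accretion risk, with entirely invented names and values: a release that has had direct identifiers removed but still carries quasi-identifiers can sometimes be joined against an auxiliary public list.

    # Linkage by quasi-identifiers (postcode district, birth year, sex).
    released = [
        {"postcode": "SW1A", "birth_year": 1970, "sex": "F", "diagnosis": "asthma"},
        {"postcode": "EC1Y", "birth_year": 1985, "sex": "M", "diagnosis": "diabetes"},
    ]
    public_register = [
        {"name": "A. Example", "postcode": "SW1A", "birth_year": 1970, "sex": "F"},
    ]

    for row in released:
        matches = [p for p in public_register
                   if (p["postcode"], p["birth_year"], p["sex"]) ==
                      (row["postcode"], row["birth_year"], row["sex"])]
        if len(matches) == 1:  # a unique match re-identifies the record
            print(matches[0]["name"], "->", row["diagnosis"])

Every additional auxiliary dataset adds columns that can be matched in this way, which is why a release that is safely anonymous today may not stay that way.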

I operate in a world that obliges me to look at what is reasonable in the circumstances, to recognise the kinds of risks that you describe and to see if we can mitigate those, so that we can allow organisations to get the value of their datasets now and guard against that risk in the future. One of the mechanisms is looking at the retention schedules that they might have. They may bring together datasets for a research project today and then disassemble them, so the data is not sitting there able to be attacked by a quantum computer in 10, 15 or 50 years’ time in a way that enables it to be re-identified.

Q113       Ronnie Cowan: Surely if we are going to stop quantum computing and AI doing what we have just discussed, we have to stop gathering the data? If you continue to gather data, not only old stuff but putting fresh stuff out there, quantum computers and artificial intelligence are going to be able to do what we don’t want them to be able to do.

John Edwards: I think I mentioned in my opening comments privacy enhancing technologies. These are called PETs for short. There are a range of them, and we have issued guidance, and we work internationally with colleagues to propagate these. They include things like homomorphic encryption, which enables research to be undertaken on an encrypted database without decrypting it. The research happens behind that encrypted shield.
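
As a rough illustration of that property (computing on data without ever decrypting it), textbook RSA happens to be multiplicatively homomorphic. This is only a toy with tiny parameters, not a production privacy enhancing technology; real deployments use schemes such as Paillier or fully homomorphic encryption.

    # Multiply two ciphertexts; decryption yields the product of the plaintexts.
    p, q = 61, 53
    n, phi = p * q, (p - 1) * (q - 1)
    e = 17
    d = pow(e, -1, phi)                      # private exponent (Python 3.8+)

    def enc(m):
        return pow(m, e, n)

    def dec(c):
        return pow(c, d, n)

    a, b = 7, 6
    product_cipher = (enc(a) * enc(b)) % n   # computed without decrypting
    assert dec(product_cipher) == a * b      # the product is recovered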

Another is called differential privacy. I cannot begin to understand or explain to you the mathematics behind these, but I am told that they are mathematically sound techniques to add statistical noise to a dataset to reduce the likelihood of re-identification. These are mitigation techniques. None of these techniques on its own is going to eliminate the challenge you described.
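
A minimal sketch of the noise-addition idea behind differential privacy, assuming a simple count query with sensitivity 1; the epsilon value and data are illustrative, and the Laplace noise is drawn by inverse-CDF sampling.

    import math
    import random

    def dp_count(values, predicate, epsilon=0.5, sensitivity=1.0):
        """Return a count with Laplace noise scaled to sensitivity/epsilon."""
        true_count = sum(1 for v in values if predicate(v))
        b = sensitivity / epsilon
        u = random.random() - 0.5
        noise = -b * (1 if u >= 0 else -1) * math.log(1 - 2 * abs(u))
        return true_count + noise

    ages = [23, 37, 45, 61, 29, 52]
    print(dp_count(ages, lambda a: a >= 40))  # noisy count of people aged 40+

Smaller epsilon means more noise and stronger privacy: the noise makes any single individual's presence or absence hard to infer from the published figure.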

Q114       Ronnie Cowan: Alan Turing was told during the Second World War, “This code is unbreakable. You will never get in there.” He had one pretty basic computer, and he did it. Now we are looking at AI and quantum computing.

John Edwards: Quite so. That is why we cannot look for a silver bullet. The ultimate—

Q115       Ronnie Cowan: All I am saying is we cannot hide behind encryption. If we are not gathering the data, it is not there to analyse in the first place. It is the only guarantee you have that people cannot be individually identified.

John Edwards: That is a fairly absolutist principle. What I would say is that if you rely on only one mechanism to protect data, I think you are taking a risk, as confident as you may be in that technology. We should never rely on just one technology. The concept behind the integrated dataset—I think it is called—or the research environments that are being established is the idea of the five safes, which do not rely on just one technology. You have safe people: you only let approved people have access to data. Safe technology: the use of encryption. Safe environments; safe projects: you only allow projects that have been vetted and approved through a robust ethical framework to have access. Through these mechanisms, I think you mitigate those risks. Do you eliminate them? I don’t think so. To eliminate them—I think you are probably right—the only approach is to not bring the data together. I am not sure that that is viable in today’s economy.

Q116       Jo Gideon: How well do those Departments that collect, store and link data for statistical and analytical purposes comply with data protection legislation?

John Edwards: There is a definitional question in there. If I take it at its narrowest, the organisations that collect data solely for statistical purposes—such as the Office for National Statistics—have, I think, established a very high level of trust with the community, and part of that is about establishing a very high level of compliance with data protection. They understand better than many others that their legitimacy, their licence to operate, depends on the maintenance of a high level of trust, regardless of whether they are compliant with a legal standard. Those imperatives are business critical to them.

At a broader level, every Department of Government collects data for statistical purposes. Even if it is collecting data for its principal operational purposes, there is a secondary use of that administrative data for statistical purposes. There the report card is much more mixed, I have to say. There are organisations with very high data volumes—such as DWP and HMRC—and from time to time we do see challenges. However, what I am interested in is working with those organisations at a central level to improve the levels of maturity. At the centre, there is a high regard for data protection and a recognition of its importance.

Where we see non-compliance is often right out at the front end. It is the person who has picked up five letters from the photocopier and put them into one envelope and sent them to one person, rather than into five different envelopes. It is the person who puts the USB into their device to download some witness statements to give to another party and then forgets and throws it in their gym bag. It is that level of human frailty, which is a constant battle, where I think we must increase our efforts to learn from the mistakes of others and to create a more virtuous cycle of sharing information and implementing improvement. I have been engaged in that with public sector authorities for the last 18 months.

Q117       Jo Gideon: Over the course of our inquiry several witnesses have suggested that standards are strong in the area of statistics, which are regulated, but rather weaker across other Government analysis. Have you found this to be the case in the context of data protection compliance?

John Edwards: Yes, I think I have. We have struggled with this. To the Chair’s second question: how do we enforce compliance? We have an enforcement function, but we have found that it can be very much like whack-a-mole. I have teams out there that will pick away at something for three years to investigate and issue a penalty. I am not sure that is a very efficient way of working, so we have tried to work in a more proactive way to raise the standards across the sector, but we do have a long way to go.

Q118       John Stevenson: You mentioned a high level of trust and you also mentioned a high regard for data protection. Do you think public sector bodies are sufficiently transparent about how they use personal data for their production of analysis?

John Edwards: Again, I would say there is a mixed scorecard on that. I think there are improvements to be made. I wonder if you would just indulge me for a moment while I refer to some notes that I have with me on a report that we have produced on publication schemes. There is an obligation to publish datasets and we found a very mixed record across Government.

We issued a report “Publication schemes: a snapshot of compliance”. That is available on our website ico.org.uk. I will ask my team to forward that to the inquiry. It found that while all central Government Departments that we sampled have adopted our model publication scheme, only a minority was publishing in accordance with it. We found that only six of the 20 Government Departments sampled were publishing their datasets.

The answer to your question is: I think there are significant improvements to be made about being clear with the public on how their data is used and what data is available. I do think that is a very important part of the social licence for secondary use of public datasets. There is perhaps—

Q119       John Stevenson: Are you saying that Government Departments have trust when it comes to protecting data but there is a lack of transparency about the use of it?

John Edwards: I think there are improvements. If you look at initiatives like the GPDPR—excuse me for forgetting the abbreviation—there was an initiative to publish datasets from general practitioners and there was a significant backlash against that. It would have been an enormously valuable dataset but a failure to prepare the ground with the public, to demonstrate the value that would return into the system, meant that there was a lack of trust and there had to be a backing down from that. That was unfortunate.

It wasn’t the only one. We saw also care.data, a similar phenomenon. I do think that it is important to take the time to trust that people have the ability to recognise that there is a public good in some of these initiatives, and to show that that public good can be provided without putting individual privacy at risk.

Q120       John Stevenson: The follow-on from that, then, is that we have heard repeatedly that data is not being shared effectively across Government Departments for statistical and research purposes. Is it your view that data protection is not a cause of this, and that it is other things that are causing the problem?

John Edwards: Yes, unequivocally. Data protection should not be an inhibition.

Q121       John Stevenson: You don’t think it is an inhibition?

John Edwards: I haven’t seen any evidence that it is.

John Stevenson: It is other things that are causing that issue?

John Edwards: There are many things. I have seen examples of a required collaboration between organisation A and organisation B. B is a data holder that A wants data from, but the value will not accrue to organisation B, so it will not apply the resources to extract, process, clean, scrub and transmit the data to organisation A. There are cultural, data quality and technological issues—different sized pipes, to be crude. I have not seen evidence that the data protection regime is getting in the way.

Q122       John Stevenson: That is very helpful. A final question: there was a review published about data sharing under the Digital Economy Act. What did you find through the course of that review?

John Edwards: If I can just refer to my notes again. We did make a number of recommendations. We found that the framework for data sharing provides a supportive background to help organisations share data in ways that benefit the public. The framework itself includes robust safeguards that ensure organisations share data responsibly and in alignment with the data protection principles, so also protecting rights. We did make some recommendations. We found that there was good support for data protection officers, which is important. We made recommendations about updates and improvements to the DEA codes—including references to updated resources, such as the data sharing code that we provide—and about providing further clarity on a specific definition in one of the DEA codes.

We have recommended supplementary guidance about the role of board members and the factors they should consider when reaching their decisions. We did not find significant matters for concern. We found that it was a useful framework. The review is there for Members at ICO-review-DEA 2023 on our website.

Q123       John Stevenson: In quick summary: data protection seems pretty good, but it is the use of that information by Government Departments that is coming up a little short.

John Edwards: Yes, that is a fair summary. If there are responsible proposals that deliver a public good in relation to data sharing, the law will support that.

Q124       John McDonnell: John, you have mentioned before the Integrated Data Service that the ONS is currently constructing, obviously on behalf of the Government. What they tell us is that this is meant to be a central hub of high-quality, accessible data, to be used for Government analysis and by the devolved Administrations and external accredited researchers. Obviously it comes under the data protection regulations, so they have to consider data protection.

In the evidence we have received there have been some concerns expressed. I will give a couple of examples: medConfidential suggests that the scope of the service, with its split focus on data acquisition and on making data securely available, introduces serious data protection risks. We also had UK Research and Innovation express its concern that the platform is being engineered in a way that favours Government users over researchers. What advice have you provided to ONS in the consultations that you have had around the construction of this new service?

John Edwards: To the second point first, I do not believe that is a data protection matter. To the first point, I am not sure that we—

Q125       John McDonnell: It is still an access issue, isn’t it?

John Edwards: It may well be. It is not necessarily a data protection issue. To the first point, I am not sure that we see eye to eye with medConfidential on this. We have seen nothing to impugn the process ONS has followed in establishing this. We found that ONS has been very receptive to our advice and guidance.

May I come back to the Committee with a written report on that? I think it would be better than my reaching for generalities. We can provide some useful reassurance that we have been there at quite a technical level.

Q126       John Stevenson: I think most people, and we, have welcomed the service being constructed. It is just a matter of making sure it is right. If you can write to us on that, it would be very helpful.

John Edwards: Certainly. I can tell you that I have met with Sir Ian Diamond on a number of occasions, and we have discussed this. It is a matter of great interest to me because there is an internationally renowned equivalent in New Zealand called the Integrated Data Infrastructure, which is also maintained by Statistics New Zealand. It brings together the administrative data of almost the entire Government administration. The silos that separate health, corrections, law enforcement and education disappear in this. It is an enormously rich and closely guarded resource.

I described the five safes approach. That is very important, too, to providing reassurance that this data can be repurposed for these purposes. It is important to bear in mind this concept of repurposing. Very often these datasets are simply the byproduct of the delivery of Government services. Very often they are proxies. Sometimes they are very effective proxies, sometimes they are quite poor, depending on the research that is proposed. If you were to design a research project and ask what data you needed to test the hypothesis, you would probably come up with quite a different set from what is available through the administrative datasets.

Q127       John McDonnell: So we would not start from here?

John Edwards: I think that is true, but we do not have that luxury.

Q128       John McDonnell: Ian Diamond sees this as a major project, which I think we all do.

John Edwards: I would agree.

Q129       John McDonnell: That is why it is important that you share the advice that you have given, so we would welcome that written report.

John Edwards: Certainly.

Q130       Lloyd Russell-Moyle: Apologies for being slightly late during your introduction. I want to go back to some of the questions that you were answering about how we protect personal data from being identifiable. Is there a school of thought, or is there any consideration of the argument, that we will have to get used to some data being more public? We have been too ready to say that some data is private. For example, Norway publishes the tax summaries of every citizen, and you can get access to that.

Is there a case that we are just trying to swim against the tide and, rather than trying to keep some of the data private, we need to improve laws on the misuse of that data if it is in the public domain? People keep their addresses quiet for privacy reasons, but that is because they are worried about harassment. Do we strengthen harassment laws and just be comfortable that people’s addresses are out there? With AI, is it just an impossible road that we are going down, and do we have to try to put the protections elsewhere?

John Edwards: I don’t believe it is impossible, nor do I believe we should not strive to deliver on the cultural norms that have put the UK GDPR and the Data Protection Act on the statute books. The UK and Westminster-based common law jurisdictions do not have cultural expectations similar to Norway, Finland and India in relation to the publication of tax records. They have expectations of privacy. Those expectations are reflected in the legislation. I am a statutory officer, so I am obliged to deliver on those as set out in the laws issued to me from this place.

The broader policy questions you are asking are very legitimate, but I will not opine on those because those are matters for you and your colleagues to come to conclusions on about where the appropriate level of legislative protection is. I do not believe—as your question might imply on one reading—that the legislation requires me to be a King Canute in the face of advancing technology, which puts more data at our disposal.

Chair: Mr Edwards, thank you very much for appearing before the Committee today. You have said that you will be writing to us upon certain matters. Could we have that additional information as soon as possible? Thank you once again for coming here.

John Edwards: Thank you so much for your interest, your courtesy and very probing and intelligent questions.

 

Examination of witnesses

Witnesses: Reema Patel and Gavin Freeguard.

Q131       Chair: We will move on to our second panel today. We are joined by Reema Patel, who is Head of Deliberative Engagement at Ipsos UK, and Gavin Freeguard, policy associate at Connected by Data. Good morning. I will ask you both, please, to identify yourselves for the record. Would you like to start, Ms Patel?

Reema Patel: Happy to. I am a research director at Ipsos. Prior to that, I was at the Ada Lovelace Institute. I co-founded the Institute, which is an independent data ethics body. I am also a policy engagement lead for the ESRC Digital Good Network.

Gavin Freeguard: Thank you for the invitation. I am Gavin Freeguard. I am a policy associate at Connected by Data. We are a campaign to put community voice at the heart of data and AI governance. You will also find me as an associate at the Institute for Government think tank, where I previously led its work on data in Government and digital government, as a special adviser at the Open Data Institute and as a member of the public digital network, and I have also worked for several other organisations, including the Ada Lovelace Institute.

Q132       Chair: Thank you very much for that. I will ask the first question: what is meant by the expression “data ethics”?

Reema Patel: I am happy to start with responding to that. The first concept I want to bring to the table is the data life cycle. Practitioners think about the life cycle starting from choosing to generate data all the way through to communicating the findings of the data. They think about the generation of that data, the collection of that data, the processing of the data, the storage of the data, the management of the data, the use and analysis of the data, and its communication. My simple answer is that data ethics is about every stage that I have just described and is embedded across those stages.

As a concept, ethics is thinking about the practice of these things and its impacts on people and society. When it comes to thinking about how data is generated, there are important questions there about why it is generated, what purpose it is intended for and who it is seeking to serve. If we look at the other end of the spectrum, which is the communication of data, again, what is the communication approach trying to do? How is it going to improve or create a more flourishing society? This is a useful framework to think about data ethics because it recognises that ethics is inherent to and embedded in the practice of data collection, creation, use and management.

Gavin Freeguard: Building on that excellent definition, the first thing to say is that there are slightly different definitions and that is something to bear in mind when we are approaching questions of data ethics. It can be very much a conversation about how those ethical standards apply in different contexts and equipping people with the tools to do that.

One of the definitions I think is quite useful is from the UK Statistics Authority, which says that data ethics is being able to show that researchers, statisticians and analysts have not only considered how they can use data, but also how they should use data, which I think some of us in civil society sometimes informally refer to as the Jurassic Park test—just because you can does not mean you should. Public acceptability, public benefit and social licence are concepts that include a democratic element and are important to what is acceptable in that space as well.

It is also worth saying that data ethics is an important way of looking at how we approach that appropriate use of data, but there are also other concepts that are quite important. Data governance is an important one that we talk about quite a lot. Data ethics might be individuals or organisations operationalising that sense of what is appropriate, whereas data governance is taking that broader view of how you shape the whole system to deliver good outcomes.

Q133       Chair: We have seen an increase in the prominence of data ethics in the fields of statistics and analysis over recent years. Why is that?

Gavin Freeguard: Probably several reasons. We are starting to understand more how to use data. There has obviously been an explosion in the data that is available for us to use. We should not think that it is entirely new. We have been dealing with information for centuries, but the volume, the speed, and the quality of data that we now have available means that all questions about data are much more prominent in political and public discussion. We also have a much better understanding of some of the harms that can come from the misuse of data, which is another reason why ethics and similar discussions are now at the forefront.

I know the Committee has asked a few questions about privacy, which is an important one of those, but ethics also looks at things such as bias. For instance, if you are using datasets that are partial, whether that is because they exclude people altogether or because particular communities are overrepresented, that can lead to poor outcomes. It is that general explosion of data and the sense that we do now know the risks to a greater extent.
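
A small Python illustration of that point, with entirely invented numbers: an estimate computed from a sample that over-represents one group can be badly wrong for the population as a whole.

    # A population in which group_a (value 30) outnumbers group_b (value 70).
    population = [("group_a", 30)] * 800 + [("group_b", 70)] * 200
    true_mean = sum(v for _, v in population) / len(population)   # 38.0

    # A sample that heavily over-represents group_b.
    sample = [v for g, v in population if g == "group_b"][:150] \
           + [v for g, v in population if g == "group_a"][:50]
    biased_mean = sum(sample) / len(sample)                       # 60.0

    print(true_mean, biased_mean)  # the biased estimate is far off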

Reema Patel: I agree with a lot of what Mr Freeguard has said, and I would add that the context we operate in has become more datafied. Datafication as a practice is notable, largely because of the accelerating development of technology that is pervasive. We can now collect more data in real time, and that phenomenon, alongside the way in which data is gathered and collected, means that there is a challenge now for older, more established systems to consider their relationship with much more real-time sources of data. It is a challenge for the UK system, but it is definitely one that the UK system can address.

There is a case for quality as well as quantity, and this is definitely an area where official statistics play a crucial role: the quality assurance processes that we have, and the checks and balances that we have in place through the creation of official statistics. The UK is unparalleled in many ways in the systems and structures that it has created for official statistics.

Some of these are interesting questions and challenges for us to grapple with, but we are quite well served in the UK with a thriving ecosystem and a network of independent organisations thinking about these questions.

One of the things I also wanted to bring into the discussion is this question of the use of data and access to data. For instance, you have research data that has been historically held by Government organisations for many years, starting from things like the census, but we have increasingly begun to see a recognition that openness of some datasets can realise collective and social benefit if we are very creative about how we do that.

The example I give is Transport for London data and the impact that has had on the ecosystem. We now travel very freely and effectively across the transport system using real-time data, using those data sets, and that has required a step change in the way we think about that.

Related to that, the classic, traditional approach has been a closed one, but we are increasingly moving towards sharing and then openness. It is a judgment challenge, and it is a challenge of ethics: how do we decide what the right approach is to the use of that data, so that we can strike the balance between privacy and ensuring that we can unlock the benefits and the value of the data for us as a society?

Gavin Freeguard: If I can add one thing building on that, and again I completely agree: all of that leads to us recognising that these decisions are political, not just technical. Having to think about the politics and the policy around all this also means that we should have ethics at the forefront of our thinking.

Q134       Tom Randall: Mr Freeguard, in your earlier answer you pointed out that there is no agreed definition of data ethics and that different bodies define data ethics in different terms. Do you think that is a problem?

Gavin Freeguard: I think it sometimes can be. If you think about someone, a civil servant for instance, who is approaching a data-related project and they are looking for guidance there is a risk that it could get quite confusing, not quite knowing where to look and finding there are a multiplicity of things out there. There is obviously a risk that you could find the definition that suits what you are trying to do rather than the thing that you should be thinking about.

On the other hand, there are benefits to the fact that we have quite a lot of guidance out there. There is that risk that it gets confusing and overwhelming, but there are things like the Data Ethics Framework, various things from the UK Statistics Authority, and things within Government functions, so digital and data, and analysis, both have ethics in their key capabilities. There are lots of things in the health system as well, with the Caldicott Principles. Even the ODI and others outside Government have things such as the Data Ethics Canvas, which helps you work through where ethics might be applied to a data project. There is a need to get that balance right between being overwhelmed by conflicting definitions and being able to translate all those different resources into something practically useful as you are facing problems.

Reema Patel: We have a lot to learn from other ethics disciplines, and I like to think of things such as medical ethics or the safety frameworks that we see in engineering, for instance. In those contexts, recognising the importance of the types of decisions made, it is very clear what comprises ethics. There are clear codes of conduct. There are clear, almost statutory, guidelines and there is legislation. We do have the legislation, so we have the GDPR and data protection rules in place. It is important to acknowledge that and the connection between ethics and legislation.

The problem around these different definitions, however, is that there is a bit of a risk of an instrumentalisation of ethics, of organisations saying, “We have created our own ethics framework, so we are doing the ethical thing” rather than demonstrating good practice and standards and processes that have been collectively agreed by society. In answer to your question: it is a challenge for us. There are different definitions, and I think there is more work to do in this space to articulate clearly what data ethics looks like.

Having said that, data is a broad concept and different organisations are dealing with different types of data. I can see why we are in a position where different organisations have come up with different definitions of data ethics, but I still think there is something to be said for the importance of consistency, certainty, and clarity when it comes to data ethics.

Q135       Tom Randall: In your answers you have alluded I think to the National Statistician’s Data Ethics Advisory Committee’s work. Do you think that committee provides a robust framework for the consideration of ethical issues?

Gavin Freeguard: As part of a wider system, absolutely, because that committee meets quarterly and is quite a high-level committee, which is critical, but we do have to see it as part of that wider ethics ecosystem. There is the Centre for Applied Data Ethics at the UKSA as well. As I think we have both touched on, different sectors and different Government Departments will have slightly different needs because they are operating in different contexts. Seen as part of a wider system it is extremely useful, but we need to ensure that all those other things are there as well.

Going back to points that we have both made on how you operationalise those ethics—having those high-level committees is important, and having those high-level frameworks and principles is important, but what does that actually look like for somebody faced with a decision to make about the ethical use of data? We need to support people in that position more as well.

Reema Patel: In addition to these points, there is something interesting about the self-assessment toolkit that was developed as part of that framework. There are toolkits and practical approaches encouraged as part of that framework for statisticians to use. The one caveat I would add is that it is a self-assessment toolkit so there is a question to be asked about whether there are processes for increased accountability in that context.

Having said that, overall, as Mr Freeguard suggested, the framework makes in general an important contribution, and we should not understate the value of the checks and balances in the processes that we have.

Looking beyond the framework specifically, we also have the Office for Statistics Regulation and its work in developing accountability structures for the development of statistics. These are important and influential bodies operating in this space. As a good example of some of the work that these bodies have done recently, the OSR undertook an independent review of the Ofqual algorithm, and I heard colleagues ask questions in the earlier session about AI. You can see that these bodies and organisations are thinking about data in the broadest possible sense, and I think that is an encouraging indication of a healthy and thriving data ethics ecosystem in the UK.

Q136       Tom Randall: You have pre-empted my next question because I was going to ask how effectively Government bodies consider ethical issues in the context of statistics research and analysis. You have given some good practice examples there. Are there any examples of good and bad practice that you can name?

Reema Patel: To build on my answer: it is an effective system in many ways. There is also scope for improvement, because the ecosystem is changing rapidly. Just in the last two or three years, we have seen that the way data is produced, created and collected has dramatically transformed. I am not just talking about Government data; I am talking about all types of data. We can take confidence in the fact that we have appropriate checks and balances. We have a regulatory body, we have the ONS and we have organisations that are producing data effectively; we have measures of economic statistics, and we can see the justice system and the way it is working and operating. It is good, but it could be better. That is my answer.

How could it be better? Well, the shift towards increased accountability and less reliance on self-assessment is important, because of the benefits that come from collective, shared thinking and reflection. I mentioned the data life cycle at the beginning; some systematisation acknowledging that life cycle, bringing it into the way statistics and other types of data are used and created, and formalising that as a step-by-step process would be an interesting thing to think about.

The other area where we certainly need to do more is thinking not just about data but about the use of data in the development of AI and algorithmic systems. The Ofqual algorithm definitely pointed to the need for us collectively, as a research ecosystem, to do more thinking about the uses of data for algorithmic purposes.

Gavin Freeguard: Reema has set me up nicely for some of the less positive examples there. Again, as a general endorsement, I think there is a lot of good work going on, and a particular shout out to the Office for Statistics Regulation, which has done some important work around this. It has started talking a lot recently about the concept of social licence as a way of building public acceptability and trust. If you do want to innovate with data and make the most of it, you need to bring the public with you, understand what is acceptable and talk to them about what you are doing.

Some of the things that have not gone so well—and I think we have heard all of these mentioned already—include, again, the Ofqual algorithm. That did not go well. We have also heard a brief mention of GPDPR, which is General Practice Data for Planning and Research, so an attempt to use patient data for those purposes, based on the greater use of health data during the pandemic. The problem was that it was not well communicated, there was not the public engagement to help people understand what was being proposed and, as a result, I think 1.5 million people opted out of their data being used for those purposes, which has an impact on the dataset.

On the AI Safety Summit as well, there were some diplomatic successes that came from it, and a lot of hard work to pull together a summit at short notice. However, it had a very tight focus on existential risks around artificial intelligence rather than thinking about the benefits and harms that we are already seeing. AI systems are already being deployed across the public and private sectors across the world. It was a very particular flavour of risk that the summit was focusing on, which I think excluded lots of the groups that would have had something valuable to add.

This is also where we must think about the politics and impact of poor examples across the public sector and the effect that they may have on specific projects elsewhere across the public sector. As part of some work that we did at the Institute for Government on data sharing during the pandemic, we were speaking to some people who had been involved in some critical data shares and they felt they had done things really well. They had gone through their ethical frameworks, they had gone through their impact assessments, they had engaged the public and then all of a sudden they see a news story—I think in the Financial Times—about Test and Trace perhaps being about to start sharing data with the police, despite having promised that they would not.

This team, which had nothing to do with that, were suddenly extremely concerned because they knew there would be a knock-on effect on the trust in their service. Therefore, we must think about general political approaches when we think about some of these issues of trust and ethics.

Q137       Lloyd Russell-Moyle: You mentioned the opting out of some of the NHS data and some of the concerns with other data sharing. From an ethics point of view, do you think there are different levels of concern depending on whether the data is being used for commercial purposes or whether it is used for individual gain?

With the NHS data there were a lot of accusations, fear, maybe truth—I genuinely do not know—that this was going to be used by American pharmaceutical companies, who would plough through the data and use it for their profits. That seemed to be driving the opt-out, compared to people’s genuine acceptance usually that the NHS, internally for research purposes, will use your data to improve your personal health outcomes. Is there something around who the beneficiary is, and when money is involved do people become much more sceptical and less willing to co-operate?

Gavin Freeguard: Yes, in short. I think people as well often expect that there is more of their data being shared within Government and they are quite surprised when they discover it is not. All the polling and other participatory work bears out what you just said, which is that people tend to trust the NHS already with their data more than they trust other parts of Government and other sectors. They trust the use of their data more when it is for their personal care or for research and they trust it less when there might be commercial arrangements in there as well.

All of that leads back to the importance of telling people what you are planning to do, asking them what they think about that, and showing what the benefit could be to them and to society from the use of data. We can get very fixated on the risks, but we also need to talk to people about the benefits and help them make informed choices about what they think we should be doing.

Q138       Lloyd Russell-Moyle: There is a very live discussion in data sharing around HIV, for example. Where HIV status traditionally has not been shared in your general medical dataset, there is a wider discussion now that if you want to destigmatise it you need to treat it like a normal piece of health data. However, lots of people have the fear of what has happened in the past in terms of the stigma. Which one do you get rid of first? How do you persuade people who are sceptical because of what has happened to them in the past that, for their greater benefit, their data maybe needs to be shared? Is there a way, or must you just do old-fashioned pressing of palms and persuading people one to one?

Reema Patel: With the specific example that you are talking about, it is important to recognise—and we have learned this from existing studies, for instance the Living With Data research funded by the Nuffield Foundation and also Ada Lovelace Institute research on biometrics—that different populations have experienced data collection practices very differently. As a result, they struggle to trust the way particular types of data or datasets are used.

I think there is something about the demonstration of the use of that data in trustworthy ways over a period of time. There is only so much that the ethics piece can do, but there is a wider ecosystem challenge, which is to demonstrate that data about a person’s sexual orientation, a person’s clinical condition, a person’s ethnicity or a disability, these important datasets, will be used in ways that create and ensure that there is a return back to the individual or the community. That is a wider set of responsibilities.

It is also a data ethics question. It is an ethics of care towards the communities that are impacted, and there is something important about connecting that language of the ethics of care with the language around data ethics, certainly.

To come back to the point about data sharing that you mentioned, and the attitude towards different types of actors and organisations using and accessing NHS data, we undertook a study at the Ada Lovelace Institute in partnership with an organisation called Understanding Patient Data. The report, on health data sharing, is called “Foundations of Fairness”. It found that, overall, members of the public felt very confident and comfortable with NHS data sharing. That also extended to third-party users, including research institutions and charities. However, they really wanted to see a return back to society, to the public and to the NHS, and clear safeguards.

Some of the challenges around GPDPR could have been avoided by very clearly articulating what the parameters and the safeguards were and the clear limitations to use, access and control of that data.

I think there is a lot to learn from what we have already undertaken in the UK system. However, from looking at all the public deliberation work that we have conducted over the years, it is absolutely not the case that people don’t want their data to be shared. It is just that they want good governance and clear conditions under which that data is used and shared, and they want to see a return to people and society.

Gavin Freeguard: A final line on that. As part of the IFG project we did on data sharing during the pandemic, we spoke to people who had been involved in the open policy making around the Digital Economy Act, which is obviously quite an important piece of legislation in allowing data sharing across Government. They said that when they talked to the public they found that people are much more likely to say, “Yes, but” or perhaps, “Maybe” rather than saying no outright to what was being planned with their data.

I think people, in Government particularly, are nervous that when they start talking to the public about this they will be shot down instantly and will not be able to share the things that they want to do with it, but it ends up being a much more nuanced and positive conversation, as Reema says, especially if you are able to show the safeguards and value that accrues to people.

Q139       John Stevenson: You have talked a lot about the ethics ecosystem; you have discussed issues of definition, the institutions, and legislative successes and failures, but that is all about Britain. I am interested in how the UK’s ethics framework compares with those of other countries.

Gavin Freeguard: The UK was pretty early internationally in things like what was then called the Data Science Ethics Framework in 2016. There was a brief moment when making the UK the home of ethical use of data and AI seemed to be one of our major international selling points. I am not quite sure where we are with that now, to be honest.

We do have some great institutions that are internationally renowned, in academia and civil society, and we have some interesting initiatives across Government, such as the Centre for Data Ethics and Innovation with its public tracker survey, which helps us understand how people feel about these things. There are some strong things that we can point to.

That said, we currently have the Data Protection and Digital Information Bill working its way through Parliament, and there is a lot of concern in civil society that the Bill proposes doing away with some of the useful things that came from GDPR and can support the ethical use of data, such as impact assessments and data rights for data subjects and others, or making it more difficult to make subject access requests, that is, to understand how an organisation is using your data. There is a risk that we could be going backwards on some of that.

Q140       John Stevenson: How do we compare at present? Favourably, do you think?

Gavin Freeguard: Not unfavourably.

Q141       John Stevenson: Is there any other country we should look to as an exemplar?

Gavin Freeguard: A very good question. It probably depends on the particular subject. If we look at the discussions around AI at the moment, for example, the executive order in the US is pretty comprehensive in thinking about the different types of benefits and harms that might accrue from AI and how you address them across the whole system, rather than focusing on just one part of it.

When it comes to Government use of data, Estonia and Ukraine are often referred to, and Taiwan to a certain extent. Again, there are different cultural approaches and levels of acceptability of different types of data use, so it is quite context-specific, but those are some of the nations that we might look to for certain lessons.

Reema Patel: Context is key here. We discussed health data, and one of the reasons the UK has a particular advantage in being able to access that data fairly systematically is that we have a National Health Service. That has positioned us well to lead when it comes to the governance and use of health data in particular. That is a good example.

If we look at transport data, for instance, we have some key leading organisations and institutions, particularly in London. The question for us is: how do we ensure that that is systematic across the UK, so that those benefits work in combined authority regions too? That is relevant to the levelling-up agenda, where data is a key and valuable resource.

Gavin mentioned several countries, Estonia and Taiwan among them. The US is notably driving forward the agenda around tackling data inequalities in particular and thinking about data ethics in a very interesting way: thinking about who might be underrepresented or overlooked and where the gaps and biases are. The AI Safety Summit did indicate something of the role that the US aspires to play in leading the ecosystem. The UK has a tough challenge ahead to keep up with colleagues across the Atlantic, but I am sure it is a challenge that we can meet.

Q142       John Stevenson: You would say that at present we compare quite favourably internationally?

Reema Patel: I would say so, yes. I am keeping to the tone I set at the very beginning: we have good structures in place, but we need to adapt and evolve very quickly. We need to move quickly to ensure that we continue to lead. As Gavin said, we definitely got off the mark quickly with the creation of initiatives such as the Centre for Data Ethics and Innovation and the Ada Lovelace Institute. The latter was created before Cambridge Analytica happened, so the UK had an independent data ethics institute before these issues entered mainstream public consciousness. The Alan Turing Institute and the Open Data Institute are other good examples. These are all UK-based organisations, so we have a solid ethics ecosystem. The question is: can we meet the challenges of the near term and the rapid pace of change in AI and data-driven systems?

Q143       Ronnie Cowan: To take it to the next stage, are the UK Government doing enough to ensure that data is not being used and abused to affect the outcomes of elections?

Gavin Freeguard: A topical and timely question. I mentioned that there are real concerns around the Data Protection and Digital Information Bill, which is going through at the moment. It is not helpful when 150 pages of amendments are added with no chance for the Commons to scrutinise them, including, as you suggest, around the democratic engagement sections which, even by the Government’s own admission, would appear to allow a future Government—because this one says they would not use these powers—to change the direct marketing rules in the run-up to an election.

This particularly focuses on something called the soft opt-in, which basically assumes that if somebody has been in touch with you at any point, you can treat them as having agreed to receive direct marketing, even if they have not explicitly consented to receiving that particular information. I think there are some concerns around that part of the Bill.

Reema Patel: One of the big challenges with the use of data in the context of elections is the speed at which information and data now travel. It is very difficult to fact-check things that travel that quickly through WhatsApp groups, other networks and social media. I understand that you heard from colleagues at Full Fact in a prior hearing, and I think that their views would be helpful in this regard.

Fact-checking can go some way, but the challenge here is very much the challenge of the changing environment we are now operating in. What this means for data ethics is an interesting question. My suggestion to the Committee is that data ethics is increasingly a social responsibility that we all share; it is not now confined just to ethics research or Government institutions.

Those institutions are important, but the media has an important role to think about in terms of its own ethics in the use of statistics, the way they are conveyed and communicated and the uncertainties around them; and individuals, when they are reading things, passing things on, or sharing and communicating anything, now have an increasingly important role to play.

That leads us to a consideration of the skills that we have in our society and whether we have properly adapted our education and training systems for those skills. How I, as an ordinary member of the public, can understand when I am transmitting or sharing information that has not been fact-checked is an important question, as is how we can ensure that we have robust research and statistics in Government. Both questions need resourcing, care and consideration.

Q144       Ronnie Cowan: This Committee has often scratched its head when trying to ensure that elected Members, including Cabinet Members, behave in an ethical manner, and in trying to define what ethics are and where the boundaries lie. If you could help us with that, we would be grateful.

Very briefly, do you believe there are bodies out there that the Government should reach out to, to help ensure that elections going forward are fair?

Reema Patel: The Electoral Commission is the obvious organisation where the responsibility could sit. I see Members might have different views on that, but there are regulatory organisations that exist.

Gavin Freeguard: The Electoral Commission example is an interesting one, because is it properly resourced to be able to tackle these challenges? That is a live question and one that we see across regulators in this space generally. As we have both said, this is quite a fast-moving world. How do we ensure that regulators are properly resourced and have the right skills?

More generally (Full Fact has been mentioned), there are academics in the UK and elsewhere thinking quite a lot about mis- and disinformation. There are lots of campaign groups and other civil society organisations. We are all here and willing to talk and listen and willing to suggest ways forward.

Q145       John McDonnell: You have covered some of this about community engagement to a certain extent. We have had several submissions recommending greater community engagement between Government officials and citizens, and there have been some concerns that the existing engagement could be discriminatory in some respects. For the record, could you explain why you think community engagement is important and what it would look like if it were to be effective? Ms Patel, you have expressed your views on this as well.

Reema Patel: Yes. Why is engagement important? When we think about data, we think about something that is complicated. It can be contested, as we have just explored, and its use and management can have different impacts on different communities. It also requires community engagement and buy-in to get it right. The conversation that we just had about HIV in particular shows that the effective use and management of data rely upon communities. Data comes from people, and that can be overlooked.

In terms of good approaches to engagement, one of the things that I led on when I was at the Ada Lovelace Institute was a report called “Participatory Data Stewardship”. The report set out a spectrum of approaches to engaging communities. To start with, it is important that we get our communication right and inform people, so that they understand what they are giving their data over for and for what purpose, feel confident, and want to give their data. That is important.

There are other approaches. There is consultation, which is the traditional space that we have often operated in. Then there are some innovative approaches that data systems and processes are using: citizens’ juries and deliberative public engagement processes that engage people on quite complicated topics.

I was involved in developing a process at Ipsos, working with colleagues in an NHS trust, to engage members of the public on the use of a waiting list prioritisation tool. The discussion there was about the different factors that might help prioritise an NHS waiting list. It is a good practical example of how, when you can involve people over a period of time, you get some interesting, rich and nuanced insights into which kinds of datasets people feel more and less comfortable being used. It was a complicated and controversial area to work in, given the complexity of the way waiting lists are managed. That is a good example.

Interestingly, there are initiatives, such as the Liverpool Civic Data Co-operative, that are piloting ways of involving people more directly in the governance of data systems. When you look at the way those data systems are developed and designed, there are clear touchpoints at which people can sense-check the data lifecycle and the use, purpose and analysis of different data systems.

There is a wide range of approaches that can be used for community engagement. The challenge is that this is very often under-resourced, and there is clearly scope to invest in the sociotechnical side of the development of data systems. If we want to do this well, we need to recognise that it takes time, effort and engagement with communities, to recognise the contribution that communities make, and to invest in it.

Q146       John McDonnell: I do not want you to betray any confidences, but you are involved in the Bank of England’s development of the CBDC initiative, which has been heavily criticised in a Big Brother Watch report this week as a potential threat to privacy. Has there been much public engagement in that process so far?

Reema Patel: Which project?

Q147       John McDonnell: The Treasury and the Bank of England’s Central Bank Digital Currency initiative.

Reema Patel: It is extremely well resourced. The Bank of England has some interesting regional citizen engagement panels, and its practice of engaging through those panels is a good example of the type of engagement I was talking about. There was a report on building a public culture of economics; the Chief Economist at the time accepted its recommendations and implemented them by establishing regional citizen engagement panels on different facets of economic policy. Those panels still run, and they are a useful way for the Bank of England to understand how people experience the economy in different ways.

The Bank of England has an infrastructure that is quite noteworthy and worth learning from. Could the Bank of England do better? Certainly. I am not saying it is perfect, but that type of institutional infrastructure is worth considering when it comes to these issues and the data itself.

Q148       John McDonnell: Was it mobilised for this consultation with regard to the CBDC?

Reema Patel: I understand the consultation and engagement is currently taking place, but that infrastructure has not yet been mobilised.

Gavin Freeguard: Reema has covered the community aspect extremely comprehensively. I have a few things to add. One of the reasons why we need to move beyond just consultations and think about engaging communities and groups as well as individuals is that so much of the power of data comes from it being in the aggregate and in the collective.

Those old tropes around individual consent do not quite hold in the same way, because so much of our data is relational. The value comes from being able to bring it together, identify different groups and understand things at a much higher, societal level. Therefore, we need mechanisms that allow us to think at that level, rather than just at the level of the individual.

Broadly speaking, it is good for building trust. It is also good for better outcomes. If you can empower communities, who know more about their data than anyone else, and bring them into the process, we could end up getting much more value from it. There are different methods that you can use, from user research all the way to big citizens’ assemblies and deliberative processes; each has its place in different parts of different processes.

Q149       Damien Moore: Good morning. Do you think public sector bodies are sufficiently transparent about how they use personal data in the production of analysis? Where is it done well, and where less so?

Reema Patel: It is a tricky question, because public sector bodies cover such a wide range of organisations and contexts. By the nature of the type of data we are talking about, which is personal data, we rarely see an articulation of how that data is used. This was one of the big critiques during the pandemic in particular that Government organisations needed to respond and adapt to. At first, people could not necessarily see the rationale, the reasons or the trail that led to decision-making; over time, committees and organisations shifted the way they communicated about that. A lot of confusion could have been avoided early on had those processes been in place.

Transparency is an interesting question, because there are many kinds of transparency. There is transparency about the data itself, but there is also transparency about how the data is being used to inform a particular policy or decision. Often public sector organisations are clear about what data they have, but there is not always a very clear route from that data through to the rationale or the reasons for decision-making, particularly in very pressured moments such as the pandemic. That can contribute to high levels of distrust and lower levels of public confidence in data-driven decision-making.

Gavin Freeguard: We could go further. There is still a lot that we do not know, and that, therefore, I suspect Government does not know, about how personal data particularly, but data more generally, is used in various parts of Government, and about where the big data-sharing agreements sit within Government and the public sector, but also with the private sector. That is not just important for trust and transparency; it is important for us to understand what works, what is not working so well and where the real challenges are. It is an effectiveness point as well as an ethical one.

That is particularly acute when it comes to some of the evidence for policy making. I know Sense about Science and the Alliance for Useful Evidence, with the IFG and others, have in the past tried to explore evidence transparency around policies, which is quite challenging to define but a worthwhile thing to do.

We have also seen a general retrenchment from earlier open data and open Government commitments over the last decade or so. We have seen some interesting initiatives such as the Algorithmic Transparency Recording Standard, which tried to pilot what transparency around the use of automated decision-making might look like. That seems to have stalled a little bit, although I think it may be about to get revived. I think we could go further.

There are some other good examples. The National Data Guardian, for instance, the watchdog within the health system, plays a very valuable role in trying to surface some of these issues and to force greater transparency around them.

Q150       Damien Moore: You mentioned the pandemic, and we have the Covid inquiry going on today. I think today’s witness was at the helm throughout that pandemic. Do you think there is a case for treating data differently in emergencies or pandemics than in normal times, and for setting out why and when we would do that right at the start?

Gavin Freeguard: Yes, in some respects, but not in others. When I was speaking to people as part of various projects over the last few years around pandemic data sharing, their sense was that quite a lot of the infrastructure was already there; GDPR largely did what they needed it to do. What changed was having a clear purpose. Clear incentives around why data needed to be shared and used suddenly unblocked problems and overcame barriers that we had seen for years, if not decades.

That perhaps has not quite continued through since, and I know there is some concern in Government around that. Of the barriers to data sharing now, yes, there are still some legal barriers, although we have overcome quite a few of those, and there are definitely still technical barriers around data quality, different systems and legacy systems that it is difficult to get things out of. However, many of those barriers are cultural and organisational. The Information Commissioner earlier used the example where organisation A has the data but organisation B will get all the benefit from using it. Those incentives are a problem inside Government.

Chair: I am glad to say that concludes our questioning today. Thank you both very much for coming here today and for answering our questions so fully and so comprehensively.