De-identified data for research | 13 June 2014 | Meeting note

Friday 13th June, 1pm-2:45pm


Daniele Bega HMRC
Marcus Besley BIS (Go-Science)
Frances Pottier BIS
Ben Humberstone ONS
Johanna Hutchinson BIS
James Denman CLG
Jackie Riley HMRC
Maria Sigala ESRC
Amanda Smith Open Data Institute
Nicky Tarry DWP
Olivia Varley-Winter Royal Statistical Society
Tammy-Jo Johnson ONS
Iain Bourne ICO
Pete Lawrence Cabinet Office
Simon Whitworth UK Statistics Authority
Melanie Wright Administrative Data Service, University of Essex
Nagina Akram DWP
Sue Holloway Pro Bono Economics
Dan Nesbitt Big Brother Watch
Cindy Bell HMRC
Rufus Rottenberg Cabinet Office

Meeting note

  1. Sam Smith, who was not present, tabled a series of questions. Each is given below, followed by the response.

Question: For approved projects matching identifiable data from multiple data-holding departments, where could that research be conducted?
Response: If the matching share was relying on the proposed legislation for its vires, the research would have to be conducted in an Accredited ADRC Safe Haven. (The title for this role is still being debated, so the name may change in due course.)

Question: That is, for a project using, for example, HMRC and DWP data, part of which each department holds in its own safe setting (so it’s not new data, and there are already processes in place for approval), would that project be possible?
Response: We would challenge the assumption that this is not new data. In a Trusted Third Party (TTP) share, data is always matched between two or more datasets, so the matched data is essentially a new dataset: although its constituent parts are already known, their combination in a de-identified form is new. Yes, the project would be possible, and it could adopt the existing standards as an additional part of the proposed TTP share under the new legislation. However, because TTP shares are set up so that no one body ever has access to all the information involved, if, say, HMRC and DWP were taking part, then neither of them could be the Accredited ADRC Safe Haven for that share (though they could potentially be so for some other share for which they were not supplying the data). So the HMRC safe setting could be accredited as an Accredited Safe Haven if HMRC was handling the linking of the DWP data to, e.g., MOJ.

Question: Could there be cross-accreditation of settings, so projects approved by both boards could have data in the highest security setting, whichever department that is? I appreciate this isn’t a simple question, and that it starts to cut across departmental silos.
Response: We are not setting rigid security settings or share agreement terms as part of the legislation, so each share can have whatever level of security is deemed necessary for the particular data and circumstances of each case. Such shares can adopt a higher standard as suggested, where this would be necessary and proportionate. The accreditation would provide a certain minimum, but there is flexibility in each share to go beyond this as part of the standard data share agreement negotiating process between the data controllers.

Question: If you’re going to try and change the law, it would seem worthwhile to discuss things that might be useful, that you can’t currently do, and for which there is a clear need.
Response: This new power fulfils that condition. Although sharing of de-identified data by two public authorities through a safe haven is permissible in some cases and by some bodies, not all bodies can use this process for all data or cases. Additionally, how it is done varies from body to body and from case to case.

Question: Legislation that lets you do stuff you already do means you’re either acting illegally or it’s substantively pointless.
Response: There is a need because, as set out above, public authorities cannot always do it. There is also substantive benefit in changing the law to provide a more transparent, more consistent method available to all public authorities.
  1. Cabinet Office has arranged an all-day meeting at Admiralty House on 23rd July for the ONS strand and the de-identified strand together. It is hoped that, following this, policy strand documents can be agreed that establish areas of agreement, identify unresolved issues and set out possible solutions.
  1. HMRC explained their rationale for a new proposal for legislation that would provide specific powers for HMRC. The ambition is for these powers to be contained in separate clauses in a data sharing bill. The power would permit HMRC to make anonymised individual-level information available for non-tax purposes. Currently, research projects that require access to information held by HMRC must demonstrate a benefit to HMRC’s functions. Using this specific power, HMRC would be able to share this information, for example with other government departments and researchers through the HMRC Datalab. There will be safeguards, and it would be sensible to try to keep these in common with those being developed across the data sharing proposals.
  1. Discussion of the draft discussion document on how a general power for two or more public authorities to share de-identified data for research purposes would work. ONLY if they wished to use this particular power, which would be additional to any existing powers, would public authorities have to use an Indexer approach. Under this approach, source public authorities would provide de-identified data to an “Accredited Trusted Third Party Safe Haven” [CO comment: despite what was said in the meeting, this seems a clearer way of describing an organisation when it is the beneficiary of a particular data share under the power] and provide identity data stripped of all payload information to the Indexer. This allows data to be linked while ensuring that, when this power is used, no one party ever has access to all of the identifiable data and all of the payload data. The Firewall Centre model works similarly, except that the Indexer and the Accredited Trusted Third Party Safe Haven are within the same legal entity, separated by firewalled sections. If public authorities already possess the vires, they will not need to use this power.
  2. Accreditation of TTP safe havens, accredited Indexers, accredited research projects and accredited researchers would be the responsibility of an overarching authority, specifically provided for in legislation. The document proposed certain minimum criteria for these. The glossary required improvement, and group members were asked to propose definitions for consideration. There was a discussion around attempting to ensure consistency with the ICO code of practice and concepts in the DPA. [CO comment: the purpose of this glossary is to illustrate effectively the circumstances covered by the proposed legislative provisions, and this will take priority in defining the terms over and above any requirement for the definitions to match any other existing definitions.]
  3. Attendees were asked to suggest which public authorities the power should not apply to. It was intended that NHS bodies should be excluded from this power.
  4. “Identifiers” would not be definitively listed.
  5. The words “Administrative Data Research Centre” should not appear in the legislation. Most data provided to the ADRC will be provided under existing powers, not under the new powers.
  6. The security accreditation model design should look at the entire system in the first instance. Separate accreditation for centres and researchers is appropriate, but if the two are not consistent and joined up, mitigation against risks cannot be managed.
  7. There needs to be ongoing communication between the parts of the system to ensure they are working properly. This requirement will be for guidance. Clarity would be needed around disclosure control and reasonable likelihood of identification.
  8. ONS were concerned that any legislation should not “contradict” the Statistics and Registration Act. There would be discussion between ONS and CO lawyers.
  9. Clarification of terms such as aggregated, individual level, record level, de-identified, unidentified, anonymised was necessary.
  10. Holding of metadata, re-use and release processes should probably not be mandated in law, as they are already covered by the UK Statistics Authority code of practice.
  11. The requirement that research be “published” would be satisfied by the legal definition of publication; it would not require publication in a peer-reviewed journal.
  12. Additional criteria for accreditation would be set by the relevant safe haven.
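
The Indexer approach described in the discussion above can be illustrated with a minimal Python sketch. It is not taken from the draft discussion document: the function names, the field names (`name`, `dob`), the sample records and the use of exact matching on normalised identifiers are all assumptions made purely for illustration, and a real Indexer would likely use more sophisticated matching. The sketch only shows the separation of roles: the Indexer sees identifiers but no payload, and the safe haven sees payload and link IDs but no identifiers.

```python
import secrets
from collections import defaultdict


def split_records(source_name, records, identifier_fields):
    """Source authority: separate identity data from payload data (hypothetical helper)."""
    identity_rows, payload_rows = [], []
    for record in records:
        ref = secrets.token_hex(8)  # random per-record reference with no meaning of its own
        identity = {f: record[f] for f in identifier_fields}
        payload = {k: v for k, v in record.items() if k not in identifier_fields}
        identity_rows.append({"source": source_name, "ref": ref, **identity})
        payload_rows.append({"source": source_name, "ref": ref, **payload})
    return identity_rows, payload_rows


def build_index(identity_rows, identifier_fields):
    """Indexer: receives identifiers only and returns (source, ref) -> link ID."""
    link_ids = {}
    index = {}
    for row in identity_rows:
        # Assumption: exact match on trimmed, lower-cased identifier values.
        key = tuple(str(row[f]).strip().lower() for f in identifier_fields)
        if key not in link_ids:
            link_ids[key] = secrets.token_hex(8)
        index[(row["source"], row["ref"])] = link_ids[key]
    return index


def link_payloads(payload_rows, index):
    """Safe haven: receives payload rows and the index, never the identifiers."""
    linked = defaultdict(dict)
    for row in payload_rows:
        link_id = index[(row["source"], row["ref"])]
        for field, value in row.items():
            if field not in ("source", "ref"):
                linked[link_id][f"{row['source']}.{field}"] = value
    return dict(linked)


# Invented example records; the values are illustrative only.
ID_FIELDS = ("name", "dob")
hmrc = [{"name": "Ann Example", "dob": "1980-01-01", "income": 30000}]
dwp = [
    {"name": "ann example", "dob": "1980-01-01", "benefit": "none"},
    {"name": "Bob Sample", "dob": "1975-05-05", "benefit": "JSA"},
]

hmrc_ids, hmrc_payload = split_records("HMRC", hmrc, ID_FIELDS)
dwp_ids, dwp_payload = split_records("DWP", dwp, ID_FIELDS)
index = build_index(hmrc_ids + dwp_ids, ID_FIELDS)  # Indexer step
linked = link_payloads(hmrc_payload + dwp_payload, index)  # safe haven step
```

In this sketch the linked dataset held by the safe haven is keyed by random link IDs and contains only payload fields, which mirrors the point made in the discussion that no one party holds both the identifiers and the payload.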
This entry was posted in Meeting notes by Tim Hughes.

