Data access in the NGRL
Certain staff within Genomics England will be able to see both identifying and de-identified patient data, enabling them to ensure that the process by which researchers can access de-identified data is securely controlled, and that information found by researchers can be fed back to healthcare professionals if relevant for their patient.
Researchers who can access the NGRL may be based in the UK or in other countries. They may work within the NHS, hospitals, universities or charities. They may also work for private healthcare groups or commercial companies, such as pharmaceutical companies, that use data to understand how current medicines could be improved or how new drugs or tests could be developed to help patients.
All researchers are vetted by an oversight committee before being granted permission to access the data. The data is held in a form that cannot be copied or removed without the permission of Genomics England, and may only be used for purposes that are in line with Genomics England’s acceptable use policy.
There is a robust governance process in place within Genomics England for accessing data in the NGRL (see graphic, ‘How data access is managed’).
Researchers may be from universities and the NHS, or from industry (companies developing diagnostic tests and treatments). Every researcher must be approved before collaborating with Genomics England and other partners. This collaboration includes giving feedback on how data collection and analysis could be improved to support research.
Researchers’ identities are checked and confirmed, and they must submit a research proposal detailing how and why they want to use the data for health purposes. The proposal is reviewed by research oversight committees, which assess the science, ethics and clinical benefits of each application. These committees include representation from NHSE/I, Genomics England, the NHS and patients. Once approval is granted, read-only access to the data is provided via a secure server, and activity within this environment is monitored. The environment acts like a reference library, meaning researchers cannot copy or remove information in the NGRL.
Instead, researchers work with the data using various tools available within the research environment. Once they have completed their analysis, the results can be taken out of the environment through an approved ‘air lock’ process. This process manages requests to transfer analysed data out of the NGRL (such as for publication), as well as requests to add findings back into the NGRL to contribute to ongoing research. In general, only summary-level (rather than individual-level) de-identified data can be taken out of the research environment.
Data cannot be accessed by groups such as insurance companies, police and border agencies, or non-health-related government agencies (including those linked to employment), and it cannot be accessed for marketing purposes.
De-identifying data makes it very difficult for any researcher to identify individual patients, and researchers are made aware that doing so is illegal. However, identification is not impossible, because each individual’s genomic code is unique. For example, if a patient is already part of a patient group for a rare disorder and has granted access to their genomic data via other research, researchers on that study who also have approved access to the NGRL could identify that individual.