The Cohort Discovery Service helps researchers explore whether relevant patient cohorts exist across multiple datasets before submitting a formal data access request.

This page answers common questions about who can use the service, what datasets can be searched, how access to the service works, privacy and security protections, and what to expect when using Cohort Discovery.

Getting started

  • Cohort Discovery is a secure service that helps researchers quickly see whether relevant patient groups exist across multiple datasets held in different secure data environments across the UK, while helping data custodians receive more targeted data access requests. It removes uncertainty early and speeds up the path from research idea to responsible data access.

  • Approved researchers can use Cohort Discovery to:

    • Explore whether suitable patient groups exist for a research study or clinical trial
    • Estimate the size of cohorts matching specific criteria across multiple datasets
    • Identify relevant datasets and Data Custodians to contact
    • Support early-stage feasibility planning, leading to more relevant data access requests

    The service is designed for discovery and feasibility assessment only. It does not provide access to individual-level data or enable analysis of cohort results within the platform.

    Once a suitable cohort has been identified, researchers can start the data access request process through the Gateway.

  • Access is available to approved researchers, NHS analysts, and other authorised data users working on projects intended to deliver public benefit. Applications are reviewed using a governance approach aligned to the Five Safes Framework.

  • You can request access directly through the Gateway by creating a registered user profile and then completing a short application. We aim to review all applications within 1–2 working days.

  • Applications are assessed using the information provided in your Gateway profile, particularly your institutional email address and supporting credentials (e.g. an ORCID record). We may contact applicants if further clarification is needed.

  • No. Cohort Discovery supports feasibility assessment only. Any future access to individual-level data still requires formal approval from the relevant Data Custodian(s).

  • Cohort Discovery is available through the Health Data Research Gateway. The service is developed and maintained by 51 in collaboration with participating data custodians and partners across the UK research ecosystem.

Using the service

  • The datasets available to you depend on:

    • Your organisation type (e.g. academic or industry)
    • Your approved access permissions
    • Which Data Custodians participate in Cohort Discovery
    • Whether additional approvals are required (for example, the NHS Research SDE Network datasets)

    Visibility of and access to datasets vary depending on your approved permissions and eligibility. This is determined primarily by geographic location (for example, whether you are accessing from outside the UK) and your affiliation with industry or academia.

  • Researchers use the Cohort Discovery Service to safely identify relevant patient cohorts without accessing patient-level data.

    Queries are built using an intuitive query builder and run securely in real time across multiple pseudonymised datasets. Each query runs locally within the data custodian’s secure environment on a de-identified subset of the data, never on identifiable patient records.

    Data is harmonised using the OMOP Common Data Model, allowing queries to run comparably across multiple datasets, while outbound-only connections help maintain network security.

  • The Cohort Discovery Service only returns aggregated, privacy-protected results.

    Researchers receive a rounded count of how many records match their search criteria. No individual-level data is ever shared or leaves the secure environment.

    To protect confidentiality, datasets are pseudonymised, low-number results are suppressed, and all data remains within the data custodian’s infrastructure at all times.

  • Certain datasets have additional governance or access requirements set by the Data Custodian or the NHS Research SDE Network. This ensures data is accessed responsibly and in line with national policies and public expectations.

  • The NHS Research Secure Data Environment (SDE) Network provides secure, consistent, and efficient access to NHS data for research that improves health and care. SDE Network datasets that can be searched through the Cohort Discovery Service require additional registration and approval by the NHS Research SDE Network.

  • Once you identify a suitable cohort, you can contact the Data Custodian for further discussion and begin a data access request through the Gateway.
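
The federated query flow described in this section can be illustrated with a minimal Python sketch. It is not the service's actual implementation: the `CustodianNode` class, the `run_federated_count` function, and the toy record fields are all hypothetical names invented for illustration (real datasets follow the OMOP Common Data Model, which this sketch does not attempt to reproduce). The point it shows is that the same criteria are evaluated inside each custodian's environment and only a count comes back.

```python
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]
Criteria = Callable[[Record], bool]


class CustodianNode:
    """Hypothetical stand-in for one data custodian's secure environment."""

    def __init__(self, name: str, records: List[Record]) -> None:
        self.name = name
        self._records = records  # pseudonymised records stay inside the node

    def count_matching(self, criteria: Criteria) -> int:
        # The query runs locally; only an aggregate count is returned.
        return sum(1 for record in self._records if criteria(record))


def run_federated_count(nodes: List[CustodianNode], criteria: Criteria) -> Dict[str, int]:
    """Send the same criteria to every node and collect one count per node."""
    return {node.name: node.count_matching(criteria) for node in nodes}


# Toy, made-up records purely for illustration.
nodes = [
    CustodianNode("custodian_a", [
        {"age": 67, "condition": "asthma"},
        {"age": 45, "condition": "asthma"},
        {"age": 72, "condition": "diabetes"},
    ]),
    CustodianNode("custodian_b", [
        {"age": 70, "condition": "asthma"},
        {"age": 81, "condition": "asthma"},
    ]),
]


def older_adults_with_asthma(record: Record) -> bool:
    return record["condition"] == "asthma" and record["age"] >= 65


counts = run_federated_count(nodes, older_adults_with_asthma)
print(counts)
```

The sketch returns raw counts for simplicity. As described above, the real service rounds counts and suppresses low numbers before any result reaches the researcher.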

Privacy, governance and security

  • No. Researchers cannot access patient data through Cohort Discovery.

    The service uses a federated model, meaning queries are run securely within each data custodian’s secure environment using pseudonymised data. The data never moves or leaves the host organisation.

    Cohort Discovery only returns aggregate counts showing how many patient records match the search criteria. To help protect privacy, these counts are rounded and low numbers are suppressed. No identifiable or individual-level data is ever visible or transferred.

  • Cohort Discovery has been designed to protect patient confidentiality and privacy at every stage.

    • Researchers do not see patient-level data
    • The service uses a federated model, meaning queries are run securely within each data custodian’s secure environment using pseudonymised data. The data never moves or leaves the host organisation
    • Results are returned as rounded, non-identifiable counts
    • Low counts are suppressed to reduce any risk of reidentification
    • Access is permission-based and governed

  • Pseudonymised data has identifying information, such as names, addresses, and exact dates of birth, removed or replaced with coded identifiers so that individuals cannot be directly identified.

  • The Cohort Discovery Service follows the Five Safes Framework to ensure safe and secure access to information about the available data across a range of secure environments. The framework is a set of principles that enable data services to provide safe research access to data. It originated at the ONS, which developed it together with other data providers in the 2010s. It has since become best practice in data protection while meeting the demands of open science and transparency.

  • No. Cohort Discovery includes safeguards to reduce the risk of re-identification, including pseudonymisation of data, aggregated and rounded counts, and minimum cohort thresholds where appropriate.
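
As a rough illustration of the rounding and low-count suppression described in this section, the sketch below applies both steps to a raw count. The function name `protect_count`, the suppression threshold of 10, and the rounding base of 10 are assumptions chosen for the example; the actual values and method used by the service are not stated on this page.

```python
from typing import Optional


def protect_count(raw_count: int, threshold: int = 10, rounding_base: int = 10) -> Optional[int]:
    """Return a privacy-protected count, or None if the count is suppressed.

    threshold and rounding_base are illustrative assumptions, not the
    service's real settings.
    """
    if raw_count < threshold:
        # Low counts are suppressed to reduce re-identification risk.
        return None
    # Round to the nearest multiple of rounding_base (halves round up).
    return (raw_count + rounding_base // 2) // rounding_base * rounding_base


for raw in (7, 15, 1234):
    protected = protect_count(raw)
    print(raw, "->", "suppressed" if protected is None else protected)
```

Under these assumed settings, a researcher would never see the exact figure: small cohorts are reported as suppressed, and larger ones only to the nearest ten.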

What Cohort Discovery is NOT

The Cohort Discovery Service does not provide access to patient records or identifiable information. Users cannot download data or conduct patient-level analysis through the service.

Instead, Cohort Discovery helps researchers assess whether suitable patient populations may exist for approved research purposes. Any subsequent access to data remains subject to separate governance, approvals, and controls managed by the host organisation.

Working with us

The Cohort Discovery Service is open source and can be adapted for use by other organisations, research consortia, or national programmes. It can be used as an internal feasibility tool within a Secure Data Environment or implemented more broadly as a shared service.

The code is publicly available. If you would like to discuss the service or potential co-development opportunities, please contact us.

Find the right patient cohorts with Cohort Discovery

Providing a faster, more efficient starting point for planning research before submitting a data access request.