Overview

Cancer and cardiovascular diseases are major killers worldwide. Despite progress in prevention and treatment, their impact continues to grow as populations age.

To tackle this, the Big Data for Complex Disease Driver Programme aims to use large health datasets to improve early detection, diagnosis and management of both conditions. By analysing data from various sources, the programme seeks to enhance patient care and reduce the burden of these diseases on individuals and communities.

This initiative leverages comprehensive health information to inform policies and best practices, ultimately aiming to lower disease impact globally.

“This is an unprecedented opportunity to bring together the best minds in the UK to address the two Big ‘C’s’ of human health, cancer and cardiovascular disease, which collectively kill over 320,000 people in the UK each year. We will deploy a new approach, underpinned by the smart use of data, to provide a better insight into the key drivers of these diseases and use these insights to transform the lives of our patients and citizens.”

 

Mark Lawler, Associate Director of Health Data Research Wales-Northern Ireland and Scientific Director of DATA-CAN

View the Programme’s Data Summary Dashboard

  • We are working to understand:

    • What signals are present in population-level health data that indicate a higher risk of cancer, CVD, and other complex diseases?
    • What are the inter-relationships amongst cancers, CVD, and other complex diseases?
    • What are the drivers for the impacts of inequalities on the development, diagnosis, and treatment of cancers, CVD, and other complex diseases?
    • How can this new knowledge be translated into real benefit for the public and patients and to influence national and international policy and best practice?
    • Create streamlined data access processes across the 4 national Secure Data Environments (SDE) and ensure clear information is available regarding dataset contents and suitability for projects.
    • Improve dataset linkages, access to code and code lists, phenotype libraries and algorithms, including creating pipelines and common ‘code lists’ to help define disease phenotypes and their diagnoses.
    • Define the components of our risk prediction scaffold, including the data (disease phenotypes, variables), models and algorithms used, and the evaluation metrics and validations.
    • Establish networks with existing risk prediction research programmes, e.g., Cancer Data Driven Detection (CD3).
    • Identify and begin to measure the biological effects and potential interactions of drugs used to treat cardiovascular diseases and cancers on other conditions.
    • Carry out health economic research on the impact of co-morbidities in relation to the treatment and survivorship of cardiovascular diseases and cancers.
    • Improve UK-wide standards in data definitions of characteristics of inequalities and develop projects to bring together health and related datasets to understand their impact on complex diseases.
    • Provide training and support to PhD students and fellows.
    • Development of data curation tools to support health data research in cardiovascular, cancer, and other diseases within the NHS England Secure Data Environment.
    • Publication of data-led consensus statements on the financial impact of cancer treatments, including policy recommendations for health systems.
    • Publication of a position paper to influence UK government to adopt a national cancer plan.

How we incorporate Patient and Public Involvement and Engagement (PPIE)

Our aim is for research within this programme to be co-created with an involved, engaged, and empowered patient voice. By collaborating closely with patients and the public we will work to ensure that the principles of trust and responsible data use underpin all aspects of the programme.

Find out more about our PPIE work

  • In collaboration with our public contributors, researchers and key stakeholders, we have outlined the following objectives for our PPIE Strategy:

    1. Embedding PPIE across all areas of the programme to strengthen the research, improve its quality, relevance and impact, and ensure data is used in patients’ best interests.

    2. Engaging and involving diverse audiences through varied and innovative methods, ensuring an inclusive and accessible approach. Using a programme-wide, trauma-informed approach that ensures emotional safety for all those involved.

    3. Taking a collaborative approach to PPIE, sharing knowledge, insights and learnings within and outside the programme. Working in partnership to streamline and strengthen our work.

    4. Creating long-term impact through dissemination methods that consider the public, healthcare professionals, and policy makers, to ensure that the communication of research outcomes, especially those focused on risk information, has a positive real-world impact.

  • Inequalities Cross-Driver workshop

    In March 2025, Health Data Research UK (HDR UK) hosted an Inequalities Cross-Driver Workshop to discuss the positive impact data research can have on tackling inequalities. The two-day workshop brought together researchers from HDR UK’s Driver Programmes and Regional Networks alongside funders, policy makers, data experts and, importantly, public members, to map policy priorities and identify collaboration opportunities. It was crucial that we did this work in collaboration with public members who are passionate about tackling inequalities experienced within their community.

    , including thoughts and reflections from three of our public contributors in attendance.

    Public perspectives on transparency in clinical risk prediction tools

    This case study highlights a public engagement activity where researcher Stelios Boulitsakis Logothetis explored views on the transparency of risk prediction tools and their potential use in clinical decision-making. Presented at Use My Data’s inaugural National Patient Data Day, the session invited discussion on the use of statistical models vs machine learning approaches – raising questions about interpretability, trust and understanding. This case study illustrates how thoughtful public dialogue can inform the development and use of predictive tools in health data research.

    Read the case study

    Longitudinal Data Modelling Symposium

    The Big Data for Complex Disease consortium recently hosted an online symposium to bring together researchers from across HDR UK’s Driver Programmes and wider Institute who are interested in using health data modelling to understand risk and disease trajectories, better informing prediction, prevention, and treatment across the population. The symposium was an exciting opportunity to hear from researchers working with novel methodologies and tackling challenges in working with health data across a variety of diseases, co-morbidities, and types of electronic health records (EHRs).

    The symposium featured two keynote talks, one from an Associate Professor in AI for Digital Health in the Department of Engineering Science at Oxford and one from a Professor of Population Health and Statistics at the UCL Social Research Institute. It also featured a panel discussion on the challenges of creating prediction models that deliver benefit for patients, a virtual poster session and short talks selected from submitted abstracts.


Read about our Privacy Statement

  • We are committed to being transparent about how data are accessed and used within the Programme. The privacy statement below explains the roles of Health Data Research UK and partner organisations in how data are accessed and used within secure data environments. For more detailed information, please read the full privacy statement by clicking the link below.

    BDCD Privacy Statement