CASE STUDY

Identifying the wider determinants of health

Informing insights by linking people to places.

Researchers at the Clinical Effectiveness Group, at Queen Mary University of London, are linking de-identified (or pseudonymised) data on 'people' and 'places', to inform powerful health research and better location-based care.


Challenge

Patient health records held by GPs contain important information about a population's health for health providers, business intelligence, and research. Using de-identified patient addresses from these records as part of health research can inform ground-breaking insights into the wider determinants of health, by linking people to places.

However, addresses are entered into GP records as 'free text', meaning the same address can be written in different ways. At present, 'place' can only be analysed by postcode or small area, both of which include multiple households that may not share the same characteristics. Harnessing de-identified patient addresses would enable health records to be analysed at household level.

Solution

A team of researchers, led by the Clinical Effectiveness Group at Queen Mary University of London, and Endeavour Health Charitable Trust, has developed an algorithm that leverages the power of Unique Property Reference Numbers (UPRNs) to analyse health data at household level.

UPRNs are numeric identifiers that are allocated to every property and managed in an Ordnance Survey database. The algorithm, known as ASSIGN (AddreSS MatchInG to Unique Property Reference Numbers), compares addresses in patient health records with Ordnance Survey's UPRN database, one element at a time, and decides whether there is a match. It mirrors human pattern recognition to allow for character swaps, spelling mistakes, and abbreviations.
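The element-by-element comparison described above can be illustrated with a minimal sketch. This is not the ASSIGN implementation: the abbreviation table, the similarity threshold, the function names, and the sample addresses and UPRNs are all assumptions made for illustration, and Python's standard `difflib` stands in for whatever matching rules ASSIGN actually applies.

```python
from __future__ import annotations

from difflib import SequenceMatcher

# Hypothetical abbreviation table; a real matcher would need a far larger one.
ABBREVIATIONS = {"rd": "road", "st": "street", "ave": "avenue", "flt": "flat"}

# Invented reference data standing in for an address-to-UPRN lookup.
REFERENCE = {
    100000000001: "12 Mile End Road, London E1 4NS",
    100000000002: "14 Mile End Road, London E1 4NS",
}


def normalise(address: str) -> list[str]:
    """Lower-case, replace punctuation with spaces, expand known abbreviations."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in address.lower())
    return [ABBREVIATIONS.get(token, token) for token in cleaned.split()]


def tokens_match(a: str, b: str, threshold: float = 0.8) -> bool:
    """Elements containing digits must match exactly; words may differ slightly."""
    if any(ch.isdigit() for ch in a + b):
        return a == b
    # The ratio tolerates small spelling mistakes and swapped characters.
    return SequenceMatcher(None, a, b).ratio() >= threshold


def is_match(free_text: str, reference: str) -> bool:
    """Compare two addresses one element at a time."""
    left, right = normalise(free_text), normalise(reference)
    return len(left) == len(right) and all(
        tokens_match(a, b) for a, b in zip(left, right)
    )


def assign_uprn(free_text: str, reference: dict[int, str]) -> int | None:
    """Return the first UPRN whose reference address matches, else None."""
    for uprn, address in reference.items():
        if is_match(free_text, address):
            return uprn
    return None


# A free-text entry with an abbreviation ("Rd") and a character swap ("Lodnon").
matched = assign_uprn("12 Mile End Rd, Lodnon E1 4NS", REFERENCE)
```

In this sketch, house numbers and postcode parts are compared exactly so that neighbouring properties are not confused, while street and town names are allowed small differences.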

The algorithm is open source and has been proven to be very accurate – correctly matching 98.6% of patient addresses at 38,000 records per minute. Importantly, the patient records and the UPRNs are de-identified, which keeps addresses and patient identities hidden from researchers.

Result

ASSIGN has unlocked the potential of UPRNs for place-based health analysis and research. It is an open source, quality assured, and transparent algorithm available for use under a Creative Commons licence.

Assigning UPRNs to the addresses in health records enables two key things: linking people who share a household at a point in time to understand variations in household health, and linking to other data sources, such as property information and local authority records, to study other wider determinants of health. The algorithm makes bulk address-matching with UPRNs scalable and fast, using a rigorously tested and standardised method.
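Both kinds of linkage can be sketched in a few lines once each record carries a de-identified UPRN. The example below is illustrative only: the record layout, the pseudonymised keys, the room counts, and the function names are hypothetical and do not reflect the Clinical Effectiveness Group's data model.

```python
from __future__ import annotations

from collections import defaultdict

# Hypothetical de-identified records: (patient pseudonym, pseudonymised UPRN).
RECORDS = [("p1", "u-a"), ("p2", "u-a"), ("p3", "u-b")]

# Hypothetical property dataset keyed by the same pseudonymised UPRN.
ROOMS_BY_UPRN = {"u-a": 4}


def group_households(records: list[tuple[str, str]]) -> dict[str, list[str]]:
    """Group patient pseudonyms that share a pseudonymised UPRN."""
    households: defaultdict[str, list[str]] = defaultdict(list)
    for patient_id, uprn_key in records:
        households[uprn_key].append(patient_id)
    return dict(households)


def persons_per_room(
    households: dict[str, list[str]], rooms_by_uprn: dict[str, int]
) -> dict[str, float]:
    """Link households to property data; skip UPRNs with no property record."""
    return {
        uprn_key: len(members) / rooms_by_uprn[uprn_key]
        for uprn_key, members in households.items()
        if uprn_key in rooms_by_uprn
    }


households = group_households(RECORDS)
density = persons_per_room(households, ROOMS_BY_UPRN)
```

Because the UPRN key is the only join field, neither step needs the address text or the patient's identity.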

The Clinical Effectiveness Group has worked with the NHS in northeast London to assign UPRNs in real time to every patient address in GP health records. They are using the de-identified data, sometimes linked with other datasets, to investigate the health impacts of household overcrowding and household clustering of people affected by multiple long-term conditions.

The team is also working with the NHS in Wales and Scotland and with local authorities in London to leverage the benefits of ASSIGN to improve population health and inform policy.

"We can now link and analyse de-identified health data at household and small area levels. This opens up exciting new opportunities to understand health inequalities and the effectiveness of policies to reduce them. Using our ASSIGN algorithm and de-identified UPRNs, patient identities and addresses remain hidden, while our researchers and analysts can build a rich picture of the social and environmental factors that affect health at a population scale."

Carol Dezateux, Professor of Clinical Epidemiology and Health Data Science at Queen Mary University of London

The power of addressing

Get further information on the power of addressing data and the UPRN

Products and solutions featured in this study

  • OS Open UPRN

    An open dataset enabling linking, sharing and visualisation of data related to UPRNs.