Are these medical records anonymised? (birthday problem variant)

  • Context: Undergrad 
  • Thread starter: B0b-A
  • Tags: Medical

Discussion Overview

The discussion revolves around the anonymization of medical records in the UK, particularly focusing on the potential for identifying individuals based on their date of birth, postcode, and gender. It explores the implications of the "birthday problem" in this context, examining whether such data can truly remain anonymous.

Discussion Character

  • Exploratory
  • Debate/contested
  • Technical explanation

Main Points Raised

  • Some participants suggest that knowing a person's date of birth, postcode, and gender could lead to identifying specific individuals, referencing the "birthday problem" as a relevant analogy.
  • One participant notes that the birthday problem concerns the odds of at least one shared birthday in a group, whereas anonymity here would require every individual to share their details with someone else, which is a much stronger condition.
  • A participant cites security researcher Ross Anderson, claiming that with access to a birth date and a postcode, it is possible to identify about 98 percent of individuals, barring exceptions like twins or certain demographics.
  • Another participant argues that the discussion remains hypothetical without access to databases that provide current addresses and birth dates for a significant portion of the population, pointing out limitations of the electoral register and birth records.
  • Further, it is mentioned that while the register of births indicates where individuals were born, it does not provide current residence information, which is crucial for identification.
  • One participant elaborates on the potential for cross-referencing the register of births with the electoral register to narrow down identities based on shared names and birth dates within a postcode.

Areas of Agreement / Disagreement

Participants express differing views on the effectiveness of anonymization in medical records. While some argue that identification is highly probable with the given data, others emphasize the lack of accessible databases that would enable such identification, indicating an unresolved debate.

Contextual Notes

Limitations include the reliance on specific databases that may not be publicly available, as well as the assumptions regarding the availability of data for a large portion of the population. The discussion does not resolve the implications of these limitations.

B0b-A

Anonymised medical records are to be sold in the UK ... https://www.google.co.uk/search?q=sell+NHS+medical+records+harvest

Are these medical records truly anonymised?

If a person's date of birth (year-month-day), postcode (zip code), and gender are known,
what are the odds that this data could identify a specific person?

At first glance it looks like there may be sufficient information to identify a specific individual,
by using databases like register-of-births and the electoral-register.

But the calculation involves something similar to the counter-intuitive "birthday problem",
where anonymity increases rapidly with the group size if you only know someone's birthday (not their date of birth).
 
In the birthday problem, the odds that at least one pair of people in a group share a birthday are much higher than you might expect, but for the records to be anonymous you would need a duplicate for everyone.
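
To put rough numbers on that difference, here is a minimal Python sketch. It assumes 365 equally likely birthdays and independent people, which is a simplification, and the function names are just illustrative.

```python
def p_any_shared(n: int, days: int = 365) -> float:
    """Classic birthday problem: P(at least two of n people share a birthday)."""
    if n > days:
        return 1.0  # pigeonhole: a shared birthday is certain
    p_all_distinct = 1.0
    for i in range(n):
        p_all_distinct *= (days - i) / days
    return 1.0 - p_all_distinct


def p_person_has_match(n: int, days: int = 365) -> float:
    """P(one particular person shares a birthday with any of the other n - 1)."""
    return 1.0 - ((days - 1) / days) ** (n - 1)


# Compare "some pair matches" with "this particular person has a match".
for n in (23, 100, 1000):
    print(n, round(p_any_shared(n), 3), round(p_person_has_match(n), 3))
```

With 23 people the first probability is already just over one half, while the chance that a given individual has a match is only a few percent. Anonymity depends on the second kind of quantity, applied to every person in the group.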
 
wired.co.uk said:
As security researcher Ross Anderson points out, there are typically only a few dozen addresses in a post-code, so with access to a birth date (that may come from sources outside of HSCIC) it is fairly easy to make a correct personal identification for about 98 percent of people (the exceptions are twins, students, soldiers and prisoners).
http://www.wired.co.uk/news/archive/2014-02/04/care-data-nhs-healthcare
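
As a rough sanity check on the quoted figure, the sketch below estimates how often a full date of birth is unique within one postcode. Everything numeric in it is an assumption made for illustration (40 addresses for "a few dozen", 2.4 people per address, birth dates spread uniformly and independently over 80 years); none of it comes from the article.

```python
def unique_fraction(people: int, possible_dates: int) -> float:
    """P(nobody else in the group has the same date of birth as a given person),
    assuming dates are uniform and independent."""
    return (1.0 - 1.0 / possible_dates) ** (people - 1)


# Illustrative assumptions, not figures from the article.
ADDRESSES_PER_POSTCODE = 40   # one reading of "a few dozen"
PEOPLE_PER_ADDRESS = 2.4      # assumed average household size
BIRTH_YEARS_SPAN = 80         # assumed spread of birth years

people = round(ADDRESSES_PER_POSTCODE * PEOPLE_PER_ADDRESS)
possible_dates = BIRTH_YEARS_SPAN * 365

fraction = unique_fraction(people, possible_dates)
print(f"{people} people, {possible_dates} dates: {fraction:.4f} unique")
```

Under these assumptions well over 99 percent of people have a date of birth that nobody else in their postcode shares. The independence assumption fails for exactly the groups listed as exceptions (twins share a date; students, soldiers and prisoners cluster in age and live at one address), which may account for the lower 98 percent figure.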
 
All this is hypothetical unless you can identify a database which does actually contain the current address and date-of-birth of a large proportion of UK residents.

Neither of the OP's suggestions does that. The register of births only shows where people were born, not where they currently live. The electoral register does not show the date of birth; the only age-related data is whether people are eligible to vote and eligible for jury service, and even that limited information may not be on the public copy of the database.

The national insurance number (similar to the US social security number) database, owned by the UK taxation authorities, would cover a large proportion of the population, but not children or people who have never been (legally) employed. Again, that information is not publicly available.

Not that facts are of much interest to conspiracy theorists, of course!
 
AlephZero said:
... The register of births only shows where people were born, not where they currently live. The electoral register does not show the date of birth ...

If you know the date of birth and the current postcode, then cross-referencing the register of births with the electoral register could reveal which people born on that date have the same name as someone living in that postcode.

[Apparently there are about 2000 births per day in the UK, and a comparable number of people living in a particular postcode.]
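
The cross-referencing idea can be sketched as a simple set intersection on names. The records, field names and postcodes below are made up for illustration; the real registers have different fields and access rules.

```python
from datetime import date

# Hypothetical toy extracts standing in for the two registers.
births = [
    {"name": "Alice Example", "dob": date(1970, 1, 1)},
    {"name": "Bob Sample", "dob": date(1970, 1, 1)},
    {"name": "Carol Placeholder", "dob": date(1970, 1, 2)},
]
electoral_roll = [
    {"name": "Alice Example", "postcode": "ZZ1 1ZZ"},
    {"name": "Dave Nobody", "postcode": "ZZ1 1ZZ"},
    {"name": "Bob Sample", "postcode": "ZZ9 9ZZ"},
]


def candidates(dob: date, postcode: str) -> list[str]:
    """Names that appear both among births on `dob` and among residents of `postcode`."""
    born_that_day = {b["name"] for b in births if b["dob"] == dob}
    living_there = {e["name"] for e in electoral_roll if e["postcode"] == postcode}
    return sorted(born_that_day & living_there)


print(candidates(date(1970, 1, 1), "ZZ1 1ZZ"))  # ['Alice Example']
```

A name-only match like this would miss anyone who has changed their name since birth and could produce false hits for common names, so it gives a list of candidates and not a certain identification.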