When it comes to privacy, many of us rely on the belief that it is a big world out there and that the likelihood of someone dredging up data, be it a password or a genome, and tying it to us personally is small.
Recent research has shown, however, that we cannot hide amongst the crowd. In fact, more than 60 percent of Americans with European ancestry are identifiable through their DNA, whether or not they have ever submitted a sample to be sequenced. This potential inability to remain genomically private stems from the fact that by early 2019 an estimated 26 million consumers had been sequenced by the four leading commercial consumer DNA companies. Large amounts of this information have ended up in open-data genomics databases such as GEDmatch, which allow for the construction and analysis of family trees containing millions of individuals. Combined with other information available online, these genomic databases increasingly mean that re-identifying individuals from notionally aggregated and de-identified data sets is both possible and not technically challenging.
The choices we make in answering these questions will go a long way toward determining our success in creating a health system that effectively serves individuals while protecting our most personal information.