
Protecting genomic privacy

RESEARCH IMPACT | September 15, 2020
STORY BY: EDITORIAL STAFF

Case Western Reserve University computer and data sciences researcher improving privacy for global genomic data-sharing network, supported by $1.2 million NIH grant

A Case Western Reserve University computer and data sciences researcher is working to shore up privacy protections for people whose genomic information is stored in a vast global collection of vital, personal data.


Erman Ayday, assistant professor of computer and data sciences at the Case School of Engineering, was recently awarded a four-year, $1.2 million grant from the National Institutes of Health (NIH) to pursue novel methods for identifying and analyzing privacy vulnerabilities in the global genomic data-sharing network.

Personal genomic data refers to each person's unique genome: his or her genetic makeup, information that can be gleaned from DNA analysis of a blood or saliva sample.

The terms genetics and genomics refer to related but different fields of study:

  • Genetics refers to the study of genes and the way that certain traits or conditions are passed down from one generation to another. It involves scientific studies of genes and their effects. Genes (units of heredity) carry the instructions for making proteins, which direct the activities of cells and functions of the body. 
  • Genomics is a more recent term that describes the study of all of a person's genes (the genome), including interactions of those genes with each other and with the person's environment. Genomics includes the scientific study of complex diseases such as heart disease, asthma, diabetes, and cancer because these diseases are typically caused more by a combination of genetic and environmental factors than by individual genes. 

Ayday plans to identify weaknesses in the beacons' infrastructure, developing more complex algorithms to protect against people or organizations who share his ability to figure out one person's identity or sensitive genomic information using publicly available information.

Doing that will also protect the public: people who voluntarily shared their genomic information with the hospitals where they were treated, with the understanding that their identity or sensitive information would not be revealed.

"While the shared use of genomics data is valuable to research, it is also potentially dangerous to the individual if their identity is revealed," Ayday said. "Someone else knowing your genome is power, power over you. And, generally, people aren't really aware of this, but we're starting to see how genomic data can be shared, abused."

One cited concern is that "if someone had access to your genome sequence, either directly from your saliva or other tissues, or from a popular genomic information service, they could check to see if you appear in a database of people with certain medical conditions, such as heart disease, lung cancer, or autism."

Human genomic research

There has been an ever-growing cache of genomic information since the Human Genome Project, the 13-year-long endeavor to "discover all the estimated 20,000-25,000 human genes and make them accessible for further biological study" as well as complete the DNA sequencing of 3 billion DNA subunits for research.

Popular genealogy sites such as Ancestry.com and 23andMe rely on this information, compared against their own accumulation of genetic information and analyzed by proprietary algorithms, to discern a person's ancestry, for example.

Ayday said companies, government organizations and others are also tapping into genomic data. "The military can check the genome of recruits, insurance companies can check whether someone has a predisposition to a certain disease," he said. "There are plenty of real life examples already."

Scientists researching genomics are accessing shared DNA data considered critical to advance biomedical research. To access the shared data, researchers send digital requests ("queries") to certain beacons, each specializing in different genetic mutations.

What are 'the Beacons'?

The Beacon Network is an array of about 100 data repositories of human genome coding, coordinated by the (in collaboration with a). 

And while "queries do not return information about single individuals," according to the site, a scientific study in 2015 revealed that someone could infer whether a given individual's genome is in a particular beacon by sending that beacon a very large number of queries.

"And then we used a more sophisticated algorithm and showed that you don't need thousands of queries," Ayday said. "We did it by sending less than 10."

Then, in a follow-up study, Ayday and his team showed that someone could also reconstruct an individual's entire genome using the information from only a handful of queries.

"That's a big problem," Ayday said. "Information about one, single individual should not be that easily found out."
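
To make the idea of inferring membership from yes/no answers more concrete, here is a minimal Python sketch. It is a toy illustration only: it is not the Beacon Network's actual interface, not the method of the 2015 study, and not Ayday's algorithm. The variant labels, the frequency table, and the names `make_beacon` and `membership_score` are all hypothetical. It assumes the attacker already holds the target's variants and asks the beacon only about the rarest ones, since a "yes" for a rare variant is more telling than a "yes" for a common one.

```python
from typing import Callable, Dict, Iterable, Set

Variant = str  # hypothetical "chromosome:position:change" label


def make_beacon(genomes: Iterable[Set[Variant]]) -> Callable[[Variant], bool]:
    """Build a toy beacon that only answers: does anyone here carry this variant?"""
    pooled: Set[Variant] = set()
    for genome in genomes:
        pooled |= genome

    def query(variant: Variant) -> bool:
        # The answer never names an individual, only presence in the pool.
        return variant in pooled

    return query


def membership_score(
    query: Callable[[Variant], bool],
    target: Set[Variant],
    population_freq: Dict[Variant, float],
    max_queries: int = 10,
) -> float:
    """Naive heuristic: fraction of the target's rarest variants the beacon confirms."""
    # Sort by assumed population frequency (rarest first); the label breaks ties.
    rarest = sorted(target, key=lambda v: (population_freq.get(v, 0.0), v))
    rarest = rarest[:max_queries]
    if not rarest:
        return 0.0
    return sum(1 for v in rarest if query(v)) / len(rarest)


# Made-up population frequencies, for illustration only.
POPULATION_FREQ: Dict[Variant, float] = {
    "1:1000:A>G": 0.40,
    "2:2000:C>T": 0.01,
    "3:3000:G>A": 0.02,
    "4:4000:T>C": 0.005,
    "5:5000:A>T": 0.01,
    "6:6000:G>C": 0.30,
}

alice = {"1:1000:A>G", "2:2000:C>T", "4:4000:T>C"}  # in the beacon
bob = {"1:1000:A>G", "3:3000:G>A", "6:6000:G>C"}    # in the beacon
carol = {"1:1000:A>G", "5:5000:A>T", "6:6000:G>C"}  # not in the beacon

beacon = make_beacon([alice, bob])

member_score = membership_score(beacon, alice, POPULATION_FREQ, max_queries=2)
outsider_score = membership_score(beacon, carol, POPULATION_FREQ, max_queries=2)

print(member_score)    # 1.0: both of alice's rarest variants are confirmed
print(outsider_score)  # 0.5: carol's rarest variant is absent from the pool
```

In this toy setup, two queries are enough to separate the participant from the outsider, which echoes the article's point that a small number of well-chosen queries can leak information about one person. Real attacks rely on statistical tests over realistic allele frequencies rather than this simple fraction.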


For more information, contact Mike Scott at mike.scott@case.edu