Published: 18-08-2016 08:00 | Updated: 19-08-2016 10:54

Deep genetic catalog powers studies of disease and DNA variation

New research by members of the Exome Aggregation Consortium (ExAC) reports scientific findings from the exome sequences (the protein-coding portions of the genome) of 60,706 people from diverse ethnic backgrounds. The study is published in the journal Nature.

The research has been led by scientists at the Broad Institute of MIT and Harvard, with contributions from many researchers from different countries and institutions, among them Karolinska Institutet.

The ExAC dataset contains over ten million DNA variants and is a freely available, high-resolution catalog of human genetic variation that has already made a major impact on clinical research and diagnosis of rare genetic diseases.

“ExAC is particularly important for Sweden,” said Patrick Sullivan, a Professor at Karolinska Institutet. “Around one-fifth of the samples in ExAC are from Swedish people. This means that ExAC will be very important for determining the importance of genetic variants discovered in Swedish patients. This could be extremely important for people with cancer or for some seriously ill newborn infants.”

The catalog of human genetic diversity contains an average of one variant every eight bases of the exome, and provides direct evidence for the presence of widespread mutational recurrence.

Objective metrics

The researchers have used the catalog to calculate objective metrics of pathogenicity for sequence variants and to identify genes subject to strong selection against various classes of mutation. They identified 3,230 genes with near-complete depletion of predicted protein-truncating variants; 72 per cent of these genes have no currently established human disease phenotype.

They demonstrate that these data can be used for the efficient filtering of candidate disease-causing variants, and for the discovery of human ‘knockout’ variants in protein-coding genes.


Analysis of protein-coding genetic variation in 60,706 humans
Monkol Lek, Konrad J. Karczewski, Eric V. Minikel et al.
Nature, published online 17 August 2016, doi: 10.1038/nature