Published: 24-03-2023 14:02 | Updated: 27-03-2023 10:41

Method development for analyzing omics data to study complex diseases

A new thesis from Karolinska Institutet focuses on using multiomics data to discover disease signatures.

DNA code, illustration downloaded from Pixabay, CC0. Photo: Gerd Altmann, Pixabay.

Lu Pan, a PhD student at the Department of Medical Epidemiology and Biostatistics, works on the analysis of multiple omics data types, including genomics, transcriptomics and proteomics. Omics technologies can help to answer intricate biological questions, and analyzing multiomics data provides an opportunity to reveal novel information that would otherwise remain hidden.

Lu’s work focused on developing single-cell transcriptomics tools to quantify isoform-level gene expression, and on discovering omics signatures specific to normal brain regions and to amyotrophic lateral sclerosis (ALS).

Many diseases are not caused by just one gene, but by a complex interplay between biological and environmental factors. For these complex diseases, pathology may involve several of the molecular layers described by the central dogma of molecular biology (DNA, RNA and protein). This means that looking at a single layer is not sufficient to understand the disease comprehensively. To gain a better understanding and make novel discoveries, we need to combine evidence from multiple layers. Advanced omics technologies have made it possible to investigate these layers at once, but analyzing and integrating multiomics data poses new challenges.

For example, the Chromium Single Cell 3ʹ technology from 10x Genomics, the most commonly used single-cell method, allows RNA expression to be examined in up to 10,000 individual cells, but the sequencing data it produces are strongly biased toward the 3ʹ end of transcripts. One study in Lu Pan’s thesis addressed the data analysis challenges caused by that bias. In her other studies, she used multiple omics data types to discover signatures specific to normal brain regions and to ALS.

What are the most important conclusions in your thesis?

“My thesis came to three major conclusions. First, we developed a single-cell isoform quantification tool to address the 3ʹ-bias problem introduced by the 10x Genomics technology. Next, we discovered isoform-ratio quantitative trait loci (irQTL) specific to various normal human brain regions. Last but not least, we identified protein biomarkers specific to ALS for disease diagnosis, prognosis, etc. These findings should help to address current analytical limitations in single-cell transcriptomics, and provide valuable resources and directions for further research and validation.”

Why did you choose to study this area?

“Various omics biotechnologies, especially single-cell omics, are enabling us to conceptualize explicit biological events at unprecedented dimensions and resolutions. New challenges and opportunities have also surfaced, which provided us with ideas on how to tackle existing problems introduced by these new technologies, and at the same time, to better utilize them to enhance research power.”

What do you think should be done moving forward?

“Our quantification tool can be extended to other droplet-based single-cell technologies, not only 10x Genomics. The signatures discovered for normal brain regions and for ALS could be further validated for their specificity, as well as for their potential applications in clinical settings.”

Doctoral thesis

Integrative omics data analysis to discover novel signatures in complex diseases.

Lu Pan. Karolinska Institutet (2023), ISBN: 978-91-8016-833-5