Implementation of YAML-based models for PGx and PRS genomic analysis

Monika Opalek, PhD

Application of long-read sequencing to improve genotyping of complex pharmacogenetic regions

Pharmacogenomics (PGx) is a critical field in personalized medicine, focusing on how an individual's genetic makeup influences their response to drugs. By analyzing genetic variants that affect drug metabolism, efficacy, and toxicity, PGx can guide tailored treatments, ensuring optimal drug selection and dosing. This personalized approach aims to maximize therapeutic efficacy while minimizing adverse effects. 

Many pharmacogenes are highly polymorphic, carrying single-nucleotide variants (SNV), small insertions/deletions (INDEL), and larger structural variants (SV) such as duplications and multiplications (DUP), deletions (DEL), tandem rearrangements, and hybrid genes (for example, between CYP2D6 and the CYP2D7 pseudogene). Because of the sequence similarity between such regions, traditional short-read sequencing technologies often struggle to map reads unambiguously, and the resulting low mapping quality can lead to incorrect genotyping. To address these challenges, we explored two long-read sequencing technologies, Pacific Biosciences (PacBio) and Oxford Nanopore Technologies (Nanopore), which enable more precise read mapping in complex genomic regions.

Polygenic tool

Intelliseq's proprietary Polygenic tool is a key component of the PGx workflow. It identifies the most likely diplotypes using star-allele prediction models for individual pharmacogenes. As input, it takes phased and unphased SNV and SV calls in VCF format, combined with YAML-based models for ~30 pharmacogenes. As output, it produces structured JSON listing the possible and most probable diplotypes, together with which genotypes were available and which were missing. This information is then aggregated and processed to estimate an individual's rate of drug metabolism.
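To illustrate the general idea of matching genotypes against star-allele definitions, here is a minimal Python sketch. It is not the Polygenic tool's implementation: the `MODEL` dictionary, the variant IDs (`rs1`, `rs2`), and the function names are hypothetical, and a plain dictionary stands in for what a YAML model might contain. The sketch enumerates every pair of star alleles whose expected alternate-allele counts agree with the observed unphased genotypes and reports which defining variants had no genotype.

```python
from itertools import combinations_with_replacement

# Hypothetical star-allele model: allele name -> set of defining variant IDs.
# A real model would be loaded from a YAML file; this dict is an assumption.
MODEL = {
    "*1": frozenset(),               # reference allele, no defining variants
    "*2": frozenset({"rs1"}),
    "*3": frozenset({"rs2"}),
    "*4": frozenset({"rs1", "rs2"}),
}


def expected_dosage(allele_a, allele_b, variant):
    """Number of alternate-allele copies a diplotype implies at one variant."""
    return (variant in MODEL[allele_a]) + (variant in MODEL[allele_b])


def candidate_diplotypes(observed):
    """Return diplotypes consistent with unphased genotypes.

    observed maps a variant ID to its alternate-allele count (0, 1 or 2),
    or to None when the genotype is missing.
    """
    variants = sorted(set().union(*MODEL.values()))
    missing = [v for v in variants if observed.get(v) is None]
    called = [v for v in variants if observed.get(v) is not None]
    results = []
    for a, b in combinations_with_replacement(sorted(MODEL), 2):
        # Keep the pair only if it explains every called genotype.
        if all(expected_dosage(a, b, v) == observed[v] for v in called):
            results.append({"diplotype": f"{a}/{b}", "missing": missing})
    return results


# Heterozygous at both variants, phase unknown.
calls = candidate_diplotypes({"rs1": 1, "rs2": 1})
print([c["diplotype"] for c in calls])
```

With both variants heterozygous and no phase information, the sketch returns two candidates, `*1/*4` and `*2/*3`, which shows why phased input helps resolve diplotypes. Ranking the candidates by likelihood is outside the scope of this sketch.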

You can learn more details by viewing the full version of the PGx poster.

This work was supported by PGx Plus project No. POIR.01.02.00-00-0089/18-00.

Development of WGS pipelines for computation and reporting of polygenic risk scores (PRS)

Polygenic risk scores (PRS) are a powerful tool in modern genetics, measuring an individual's genetic predisposition to certain traits, including diseases. By analyzing the collective effect of many genetic variants, PRS can estimate the likelihood of developing complex conditions such as heart disease, diabetes, or cancer. Current evidence suggests that, for some common diseases, PRS can identify individuals whose risk is comparable to that conferred by pathogenic variants (Khera et al. 2018).

Polygenic tool 

The Polygenic tool is also a key component of the PRS analysis, again using phased or unphased SNV information obtained from a gVCF file. The phenotype is estimated from the genotype and the polygenic risk score models, with a separate YAML model applied for each trait. Other important steps in the workflow include phasing and imputation.
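A polygenic score is commonly computed as a weighted sum of effect-allele counts. The Python sketch below shows only that general calculation, not the Polygenic tool's own code: the `PRS_MODEL` list, its variant IDs, effect alleles, and weights are made-up values, and a plain list stands in for what a per-trait YAML model might hold.

```python
# Hypothetical per-trait model: each entry names a variant, its effect allele
# and an effect weight. All values here are invented for illustration.
PRS_MODEL = [
    {"id": "rs100", "effect_allele": "A", "weight": 0.25},
    {"id": "rs200", "effect_allele": "T", "weight": -0.125},
    {"id": "rs300", "effect_allele": "G", "weight": 0.5},
]


def polygenic_score(genotypes):
    """Sum weight * effect-allele dosage over the variants in the model.

    genotypes maps a variant ID to a pair of alleles, e.g. ("A", "G").
    Variants without a genotype are skipped and reported separately.
    """
    score = 0.0
    missing = []
    for variant in PRS_MODEL:
        alleles = genotypes.get(variant["id"])
        if alleles is None:
            missing.append(variant["id"])
            continue
        # Dosage is the number of copies of the effect allele (0, 1 or 2).
        dosage = alleles.count(variant["effect_allele"])
        score += variant["weight"] * dosage
    return score, missing


sample = {"rs100": ("A", "A"), "rs200": ("T", "C")}
score, missing = polygenic_score(sample)
print(score, missing)
```

Here the sample contributes 2 × 0.25 from `rs100` and 1 × (−0.125) from `rs200`, while `rs300` is reported as missing. In a workflow with imputation, such gaps could be filled before scoring. Converting the raw score into a phenotype or risk category would need reference data and is not shown.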

A detailed description of the PGx and PRS workflow steps, along with their schematic representation, can be found on the corresponding posters. View the full PDF version of the PRS poster.

This work was supported by Mobigen project No. RPMP.01.02.01-12-0049/18.

Want to know more?

Get in touch with us.