Workflows for Whole Genome Sequencing (WGS) and Whole Exome Sequencing (WES) data require you to specify the genes to be analysed. This step is essential for a genomic analysis that answers a specific question about the patient’s phenotype. We use the Human Phenotype Ontology (HPO) database to identify candidate genes, which are then merged into the gene panel. HPO provides systematised categories of human phenotypic abnormalities together with their associated genes, and aggregates data from the OMIM (Online Mendelian Inheritance in Man) and Orphanet databases. Each human phenotype in the HPO database has its own unique identifier, called the HPO term.
In our workflows you can fill in several fields to add genes to the analysis. At least one of them must be filled in; the more information you provide, the more personalised the analysis will be.
The text description of the patient’s symptoms is processed with a natural language processing algorithm. The identified keywords are converted into HPO terms, which are then mapped to diseases and genes using a fuzzy logic algorithm. The HPO terms and all associated genes are added to the gene panel.
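The flow from free text to genes can be pictured with a minimal Python sketch. It is not the platform's implementation: plain phrase matching stands in for the natural language processing step, a direct lookup stands in for the fuzzy logic step, and the tables `PHRASE_TO_HPO` and `HPO_TO_GENES` as well as both function names are hypothetical, filled with toy data for illustration only.

```python
# Toy lookup tables: illustrative only, not real platform or HPO release data.
PHRASE_TO_HPO = {
    "seizure": "HP:0001250",
    "long fingers": "HP:0001166",
}
HPO_TO_GENES = {
    "HP:0001250": {"SCN1A", "KCNQ2"},
    "HP:0001166": {"FBN1"},
}


def extract_hpo_terms(description: str) -> list[str]:
    """Stand-in for the NLP step: naive case-insensitive phrase matching."""
    text = description.lower()
    return sorted(hpo for phrase, hpo in PHRASE_TO_HPO.items() if phrase in text)


def genes_for_terms(terms: list[str]) -> set[str]:
    """Stand-in for the term-to-gene step: union of genes linked to each term."""
    genes: set[str] = set()
    for term in terms:
        genes |= HPO_TO_GENES.get(term, set())
    return genes


terms = extract_hpo_terms("Recurrent seizures and long fingers")
panel_genes = genes_for_terms(terms)
```

The point of the sketch is the data flow only: description, then HPO terms, then the set of genes added to the panel.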
Disease – HPO (genes)
You can add diseases that you suspect may be associated with the patient’s phenotype and/or symptoms. The genes added to the gene panel are taken from the HPO database.
Ready-to-use gene panels
This option allows you to use a predefined set of genes that have been shown to be associated with a particular human disorder. These gene panels are based on the Genomics England PanelApp resources, which provide data collected during the 100,000 Genomes Project. There are currently 340 panels in the database, but within IntelliseqFlow we use broader categories (called “Level 2” or “Panel Type” in the Genomics England PanelApp), so that you can choose from around 20 panels. Our categories correspond to groups of panels in the Genomics England PanelApp. For example, if you select “Ciliopathies” on the IntelliseqFlow platform, the genes added to the panel are all of the unique genes from the 8 corresponding panels in the Genomics England PanelApp. The Rare Disease panel is the largest: it consists of ~5,000 genes and encompasses all panels in the Genomics England PanelApp that have Rare Disease as the Panel Type. In addition, the ACMG Incidental Findings panel gathers the genes that ACMG recommends reporting regardless of the purpose of the analysis. These are genes that can have an impact on the patient’s phenotype even if they are not directly connected with their disease [4, 5].
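Combining several source panels into one broader category amounts to taking the union of their gene lists. The sketch below assumes each source panel is available as a list of gene symbols; the function name `merge_panels`, the sub-panel names and their contents are hypothetical and do not reproduce actual PanelApp data.

```python
def merge_panels(panels: dict[str, list[str]]) -> list[str]:
    """Return the unique genes found across all source panels, sorted by symbol."""
    unique: set[str] = set()
    for genes in panels.values():
        unique.update(genes)
    return sorted(unique)


# Hypothetical sub-panels grouped under one broader category.
ciliopathy_subpanels = {
    "subpanel_a": ["BBS1", "CEP290"],
    "subpanel_b": ["CEP290", "NPHP1"],
    "subpanel_c": ["PKD1", "BBS1"],
}
merged = merge_panels(ciliopathy_subpanels)
```

A gene listed in more than one source panel, such as CEP290 here, appears only once in the merged panel.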
Manually added genes
This option allows the user to specify genes of interest that may not be included in the predefined gene panel for a given disorder. To add these, simply list the specific genes, e.g. HTT, FBN1.
You can list HPO terms to add the genes associated with the phenotype of interest. Use the same format as in the HPO database, e.g. HP:0004942.
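HPO identifiers consist of the prefix HP: followed by seven digits, so a list of terms can be checked before it is submitted. The following sketch assumes the terms are entered as a comma-separated string; the helper name `parse_hpo_terms` is hypothetical and is not part of the platform.

```python
import re

# HPO identifiers: the prefix "HP:" followed by exactly seven digits.
HPO_ID = re.compile(r"HP:\d{7}")


def parse_hpo_terms(raw: str) -> list[str]:
    """Split a comma-separated string and reject malformed identifiers."""
    terms = [t.strip() for t in raw.split(",") if t.strip()]
    invalid = [t for t in terms if not HPO_ID.fullmatch(t)]
    if invalid:
        raise ValueError(f"Not valid HPO identifiers: {invalid}")
    return terms


terms = parse_hpo_terms("HP:0004942, HP:0001250")
```

An entry such as HP:4942 or hp:0004942 would raise an error under this check, because it does not match the database format.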
To further customise the analysis, we score each gene in the gene panel for the strength of its association with the patient’s phenotype (phenotype match). For each HPO term and/or phenotype description provided by a user, the algorithm traverses the phenotype ontology tree, searches for a match with each gene, and assigns a score based on the distance between the original term and the term that matches the gene. Genes specified directly by a user are assigned a phenotype match of 75%, genes added through a disease are assigned 50%, and genes added from a gene panel are assigned 30%. If a gene is found in more than one category, only its highest score is considered. Genes with a score below 25% are excluded from the analysis.
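The combination rules above can be sketched in Python. The fixed scores (75, 50, 30) and the 25% cut-off come from the description; the distance-based formula is not specified here, so `distance_score` uses a hypothetical decay (100% for an exact match, halved with each step in the ontology) purely as a placeholder. All names in the sketch are illustrative.

```python
# Fixed scores taken from the description above (in percent).
SOURCE_SCORES = {"user_gene": 75.0, "disease": 50.0, "panel": 30.0}
MIN_SCORE = 25.0  # genes scoring below this are excluded


def distance_score(distance: int, decay: float = 0.5) -> float:
    """Hypothetical placeholder: 100% at distance 0, multiplied by `decay` per step."""
    return 100.0 * decay ** distance


def final_scores(candidates: list[tuple[str, float]]) -> dict[str, float]:
    """Keep the highest score per gene, then drop genes below the cut-off."""
    best: dict[str, float] = {}
    for gene, score in candidates:
        best[gene] = max(score, best.get(gene, 0.0))
    return {gene: score for gene, score in best.items() if score >= MIN_SCORE}


candidates = [
    ("FBN1", SOURCE_SCORES["user_gene"]),  # listed manually by the user
    ("FBN1", SOURCE_SCORES["panel"]),      # also present in a selected panel
    ("HTT", distance_score(1)),            # matched one step away in the ontology
    ("TTN", distance_score(3)),            # matched three steps away
]
scores = final_scores(candidates)
```

With these inputs FBN1 keeps its higher score of 75%, HTT scores 50%, and TTN scores 12.5% under the placeholder formula and is therefore excluded.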