The results obtained by applying the described 3-step methodology to the 4 selected exemplar cases, as discussed in phenotypic jamborees, are summarized in Table 1.
Table 1
Step A1 was intended to determine whether the solved case could receive the clinical diagnosis of the known RD caused by mutations in the gene identified as causative. Similarity algorithms found the candidate clinical diagnosis within the first 30 results in 3 out of 4 cases (at positions 11, 19 and 27 for the KIF5A-, CASQ1- and SPAST-related cases, respectively). The TBL1XR1-related ORPHAcode (Pierpont syndrome, ORPHA:487825) could not be found by the phenotypic similarity algorithm. After phenotype reannotation, Step A1 was re-run and the correct clinical diagnosis was found at rank 15.
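The rank check performed in Step A1 can be illustrated with a minimal sketch. It assumes that the similarity algorithm returns ORPHAcodes ordered from most to least similar to the case; the function name `diagnosis_rank`, the `cutoff` parameter and the toy ranking (with made-up filler codes) are hypothetical and are not part of the original pipeline.

```python
from typing import Optional, Sequence


def diagnosis_rank(ranked_orphacodes: Sequence[str],
                   expected: str,
                   cutoff: int = 30) -> Optional[int]:
    """Return the 1-based rank of `expected` within the first `cutoff`
    results, or None if it is not retrieved within that window."""
    for rank, code in enumerate(ranked_orphacodes[:cutoff], start=1):
        if code == expected:
            return rank
    return None


# Toy ranking: ten made-up filler codes, then the expected diagnosis.
toy_ranking = [f"ORPHA:FILLER{i}" for i in range(1, 11)] + ["ORPHA:100991"]

print(diagnosis_rank(toy_ranking, "ORPHA:100991"))  # found within the cutoff
print(diagnosis_rank(toy_ranking, "ORPHA:487825"))  # not retrieved
```

A result of `None` corresponds to the situation described for the TBL1XR1-related case before reannotation, where the expected ORPHAcode was not retrieved.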
In Step A2, no other (likely) pathogenic variants were identified in the KIF5A-related case, which was shown in Step A1 to be perfectly consistent with the Orphanet annotation of Autosomal dominant spastic paraplegia type 10 (ORPHA:100991)29. As the SPAST- and CASQ1-related cases were not completely consistent with the Orphanet disorders matching their causative gene, we hypothesized that a double anomaly could underlie the patients’ atypical phenotypic presentation. Step A2 performed for the SPAST-related case pointed to a homozygous deletion in LDHA, which causes Glycogen storage disease due to lactate dehydrogenase M-subunit deficiency (ORPHA:284426)30. This finding deserves further investigation, since the case cannot be diagnosed as Pure spastic paraplegia type 4 (ORPHA:100985).
Only one of the two possible CASQ1-associated disorders (ORPHA:88635) was found based on phenotypic similarity. Discussion with clinicians led to the conclusion that, because the case was not perfectly consistent with the possible diagnoses, other genes could be involved in the phenotype, which was therefore considered not completely solved. Step A2 performed for this case retrieved another variant in the TTN gene, classified in ClinVar as “conflicting interpretations of pathogenicity”, which is currently under further investigation.
In conclusion, running phenotypic similarity algorithms in Step A uncovered quality issues in the phenotypic descriptions and highlighted the need for more advanced analysis to provide a clinical diagnosis before considering cases fully solved.
Step B1 was intended to use the information from solved cases to help solve the surrounding unsolved ones. Candidate variants were found for three unsolved cases related to the SPAST-, TBL1XR1- and KIF5A-related triggering solved cases, respectively. Of those, after discussion with clinicians, only the KIF5A candidate variant (c.226G > C, p.Ala76Pro; Supplementary Table A) was considered a promising, likely pathogenic, missense variant. It is located in the part of the gene coding for the motor region of the protein and was found in a case at similarity rank 16, with a phenotypic description of pure spastic paraplegia. Although the parents’ DNA is not available and no functional analysis can be carried out, this finding is likely to solve the case.
The SPAST missense variant (c.134C > A, p.Pro45Gln; Supplementary Table A) identified in an unsolved case within the SPAST cluster has conflicting interpretations in reference databases and does not segregate in the other symptomatic siblings; therefore, this case remains unsolved.
According to the clinical expert, the variant found in the TBL1XR1 gene (c.1184T > A, p.Tyr395Phe) for the unsolved case in the TBL1XR1 phenotypic cluster did not explain the patient’s typical amyotrophic lateral sclerosis (ALS) phenotype, although it is described as likely pathogenic. This points to the need for variant reclassification.
Step B2 further examines the unsolved cases from Step B for prioritised genes involved in phenotypically related Orphanet disorders. The number of detected variants out of the total number of genes examined is given in Table 1. A variant of interest (c.136G > T, p.Asp46Tyr) was found in both the SPAST- and KIF5A-related clusters in the GALC gene, known to be causative of Krabbe disease (ORPHA:487)31. It partially explained the patient’s phenotype, but additional information on the patient’s clinical course was not available to confirm or refute the diagnostic hypothesis. All the other selected variants were ultimately discarded because they were either classified as likely benign or considered by the clinicians not to explain the phenotypes.
In conclusion, Step B results were modest but potentially explained two unsolved cases and prompted reconsideration of the phenotypes from a new perspective because of the unexpected variants found.
Step C was intended to use the similarity between solved cases and Orphanet disorders to expand the phenotypic clusters and therefore the set of genes proposed for re-examination. In Step C1, the causative gene for both the triggering solved case and its related Orphanet disorder was examined. Five SPAST pathogenic variants were found in 5 cases clustering around Autosomal dominant spastic paraplegia type 4 (ORPHA:100985) that had been solved in the meantime; these were therefore considered positive confirmations of our analysis. Five KIF5A candidate variants were found in four unsolved cases in a SPG10 (ORPHA:100991)-centred cluster (Fig. 3). Despite their classification as likely pathogenic in ClinVar, they were discarded because they are not located in the motor region of the KIF5A gene, suggesting variant misclassification in reference databases. In one case, already published as carrying a KIF5A variant but still unsolved32 and clinically consistent with the SPG10 phenotype, a variant (c.1373C > T, p.Ser458Phe) was identified and discussed. Clinicians agreed on the diagnostic hypothesis but suggested that further cases need to be studied before the pathogenicity of the variant can be established. No variants were found in the clusters centred on the CASQ1-related ORPHAcodes, and it was not possible to perform Step C starting from the TBL1XR1 case because, at the time of analysis, no result had been found in Step A1.
Figure 3
In Step C2, each unsolved case belonging to the reference ORPHAcode environment and not explained by Step C1 is reanalysed for the genes causative of its top 50 most similar ORPHAcodes. In the ORPHA:100991 (SPG10) environment, a pathogenic variant in VCP, known to be associated with amyotrophic lateral sclerosis33, was found for an unsolved case and appeared consistent with the patient’s clinical presentation. As the initial phenotypic description was limited, Step A was performed after case reannotation; ORPHA:803 (ALS) was then found at rank 24 by the similarity algorithm. Diagnostic confirmation for this case is expected after the patient’s re-examination.
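The gene selection described for Step C2 can be sketched as follows. This is only an illustration under stated assumptions: the function `genes_to_reexamine`, the dictionary `toy_disorder_genes` and the toy ranking are hypothetical, the mapping holds only the gene-disorder pairs mentioned in the text, and a real analysis would draw the associations from the Orphanet gene-disorder annotations.

```python
from typing import Dict, List, Sequence


def genes_to_reexamine(similar_orphacodes: Sequence[str],
                       disorder_genes: Dict[str, List[str]],
                       top_n: int = 50) -> List[str]:
    """Collect, in rank order and without duplicates, the genes linked
    to the `top_n` ORPHAcodes most similar to an unsolved case."""
    seen = set()
    panel: List[str] = []
    for code in similar_orphacodes[:top_n]:
        for gene in disorder_genes.get(code, []):
            if gene not in seen:
                seen.add(gene)
                panel.append(gene)
    return panel


# Toy gene-disorder mapping (illustrative, limited to pairs named in the text).
toy_disorder_genes = {
    "ORPHA:100991": ["KIF5A"],  # SPG10
    "ORPHA:100985": ["SPAST"],  # spastic paraplegia type 4
    "ORPHA:803": ["VCP"],       # ALS (one associated gene only)
}

# Hypothetical similarity ranking for one unsolved case.
toy_case_ranking = ["ORPHA:100991", "ORPHA:803", "ORPHA:100985"]

print(genes_to_reexamine(toy_case_ranking, toy_disorder_genes))
```

Keeping the genes in rank order lets variants in genes of the most similar disorders be reviewed first; this ordering is a design choice of the sketch, not something stated in the source.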
In conclusion, Step C identified a number of candidate variants that triggered re-investigation of patients from both a clinical and a molecular point of view.
In total, 725 cases (14.5% of the study population) were found by phenotypic similarity within the top 50 ranks in at least one of the pipeline steps and were further analysed for variant detection. Variants of interest (pathogenic or likely pathogenic) were found in 64 of the 725 cases (8.8%), leading to the formulation of diagnostic hypotheses. These hypotheses were validated for 27 of those 64 cases (42.2%), which are considered solved. In 7 cases (10.9%), the diagnostic hypotheses raised were considered relevant by the clinicians but require additional analyses.
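The proportions in this summary can be re-derived from the stated counts. The short sketch below only reproduces that arithmetic; the helper `percent` and the variable names are introduced here for illustration.

```python
def percent(part: int, whole: int) -> float:
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100 * part / whole, 1)


cases_analysed = 725       # cases retrieved within the top 50 ranks
cases_with_variants = 64   # cases with (likely) pathogenic variants
cases_validated = 27       # diagnostic hypotheses validated
cases_pending = 7          # hypotheses requiring additional analyses

print(percent(cases_with_variants, cases_analysed))  # share of analysed cases
print(percent(cases_validated, cases_with_variants))  # 27/64 = 42.1875 before rounding
print(percent(cases_pending, cases_with_variants))    # 7/64 = 10.9375 before rounding
```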