Exome and genome clinical testing have led to genetic diagnoses for many patients who were on a proverbial diagnostic odyssey. Despite this, the majority of patients receiving what can be considered the most comprehensive germline test available remain undiagnosed (Lee 2014, Yang 2014, Farwell 2015, Posey 2017). At the time of testing, our incomplete understanding of the genetic basis of disease limits the interpretation of most variants identified when sequencing a patient. At the same time, clinical genetics is experiencing tremendous growth in understanding the etiology of disease, driven by widespread patient testing. New genetic findings are continuously being indexed in the literature and submitted to structured databases, creating the opportunity to resolve previously unexplained cases through reannotation and reanalysis. Despite the obvious benefit of genetic test reanalysis for patients with unsolved disease, high costs, a paucity of methods, and limited resources have prevented widespread adoption of systematic genomic data review. We implemented the RENEW process to address this unmet need, providing an efficient strategy for automated reannotation and variant prioritization predicated on new data in ClinVar, HGMD, and OMIM. This approach provides information at both the variant and gene level, allowing efficient, systematic, and continuous review of unsolved cases.
The large number of new findings and subsequent database changes observed with each release of ClinVar, HGMD, and OMIM creates emergent opportunities for patient diagnoses. However, the high number of clinically irrelevant changes found in each database release creates a technical burden that limits the utility of any approach using unfiltered differential analyses. As an illustrative example, when we evaluated the changes between two OMIM releases (October 2021 to April 2022, Fig. 2C), 117 entries were added and 1,434 entries were modified, so that 9.7% of all database entries changed. Scrutiny of these changes reveals many modified entries that have no clinical significance, such as punctuation changes or edits that do not impact the variant/gene classification. It is therefore necessary to remove clinically irrelevant changes from the differential data. Furthermore, case-specific prioritization of the resulting variants is needed to highlight new information that may impact variant interpretation in light of the underlying case presentation. To illustrate this, of the 1,551 database entry changes noted in OMIM above, clinically meaningful information was found in only 59 genes. After case-specific prioritization, these entries translated into an average of 3 variants per case selected for review. In the absence of full reinterpretation automation, RENEW implements key steps to reduce the burden of systematic review by eliminating irrelevant changes and prioritizing new information that is most likely to lead to a clinical diagnosis.
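The filtering idea described above can be illustrated with a minimal sketch. It is not the RENEW implementation: the release snapshots, gene symbols, annotation strings, and the helper names `normalize`, `meaningful_changes`, and `prioritize_for_case` are all hypothetical, and the sketch assumes that an edit is clinically irrelevant when two entries differ only in case, punctuation, or whitespace.

```python
import string

# Hypothetical snapshots of a gene-level database at two release dates.
# Gene symbols and annotation text are invented for illustration only.
old_release = {
    "GENE_A": "Spinocerebellar ataxia, autosomal dominant",
    "GENE_B": "Developmental delay; autosomal recessive",
    "GENE_C": "Cardiomyopathy, dilated",
}
new_release = {
    "GENE_A": "Spinocerebellar ataxia, autosomal dominant; optic atrophy",
    "GENE_B": "Developmental delay, autosomal recessive.",
    "GENE_C": "Cardiomyopathy, dilated",
    "GENE_D": "Epileptic encephalopathy",
}

# Map every punctuation character to a space so that punctuation-only
# edits disappear after whitespace is collapsed.
_PUNCT_TO_SPACE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def normalize(text):
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    return " ".join(text.lower().translate(_PUNCT_TO_SPACE).split())


def meaningful_changes(old, new):
    """Return entries that were added or whose normalized text changed."""
    changes = {}
    for key, new_text in new.items():
        if key not in old:
            changes[key] = "added"
        elif normalize(old[key]) != normalize(new_text):
            changes[key] = "modified"
    return changes


def prioritize_for_case(changes, case_variant_genes):
    """Keep only changed genes in which this case carries a variant."""
    return sorted(gene for gene in case_variant_genes if gene in changes)


changes = meaningful_changes(old_release, new_release)
case_genes = {"GENE_B", "GENE_D", "GENE_X"}  # hypothetical genes with variants in one case
to_review = prioritize_for_case(changes, case_genes)
```

In this toy example the punctuation-only edit to GENE_B is discarded, GENE_A and GENE_D are retained as changed entries, and only GENE_D is carried forward for the hypothetical case. A production filter would need rules specific to each database, for instance comparing classification fields rather than free text.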
We evaluated RENEW with 25 case examples with an uninformative ES clinical report and subsequent clinical diagnostic result identified through in-depth manual review and research studies by our group (Table 2). The RENEW process successfully prioritized the causal genetic findings for all cases, independent of the sample set configuration or MOI expected for these cases. Neurological phenotypes were predominant in these cases. However, as RENEW does not use phenotypic information for variant filtering or prioritization, process performance should not be impacted by the phenotypic makeup of the test cases.
In a prospective analysis of ongoing clinical translational activities, RENEW processed a cohort of 1,066 unsolved cases, calculating the Ann-Δ for each case from either the date of the original clinical report (new cases) or the date of the last RENEW analysis 6 months earlier (prior cases) through April 2022. In total, 5,741 variants in 945 cases were selected for review and presented in a structured report including case-relevant information. Seven expert variant scientists completed the analysis in ~32 hours, with an average of 5.34 hours per person and 20.1 seconds per variant. This prospective study at scale illustrates a highly efficient strategy for reviewing large case cohorts for putative new diagnostic findings and suggests that the approach can be applied systematically at regular intervals. The prospective cohort analysis with RENEW resulted in 4 (0.4%) cases with a putative new diagnosis and high-value variant findings in 63 (6.6%) additional cases that were prioritized for additional study (Fig. 4D to F).
A higher number of variants was identified for review in new cases than in prior cases (Fig. 3A), likely because new cases have a larger Ann-Δ. However, there was no direct correlation between the time elapsed since the original clinical report and the number of variants (Fig. 3B). Similarly, we did not observe differences between reclassification groups according to the time since the clinical report (Fig. 4F). Therefore, we cannot establish an optimal time interval that maximizes the chances of obtaining a new diagnosis. This may reflect the limited sample size of the New Diagnosis group (4 new solves), which limits the ability to draw conclusions at this point, or it could reflect the serendipitous nature of new findings collected in public databases, which do not follow any specific pattern. This point is illustrated by the AFG3L2 example, where two cases with very different timelines were solved by one new clinical genetics database entry (Fig. 5). Follow-up studies will help clarify the relationship between the length of Ann-Δ and the likelihood of identifying a new causative finding.
The yield of new diagnoses via the RENEW approach is lower than that reported in previous studies based on manual data review, but several key differences may explain this discrepancy. RENEW was designed to facilitate systematic implementation of case review at scale, relying on the availability of new data in ClinVar, HGMD, and OMIM. Thus, if new clinically relevant information has not been submitted to or included in these resources (e.g. a publication not yet curated, or data held in non-publicly available institutional databases), it will not be detected by our method. RENEW is not designed to replace an in-depth manual case review as described in previous reports but is optimized to identify clinical genetic findings that would have been easily identified if the relevant data had been available at the time of the original clinical test.
Since the first use of the RENEW approach in our institution 2 years prior to the example described here, we have been able to continuously identify new diagnoses (Table 4) in a resource-efficient manner. Extrapolating from our example, where an average of 5 variants were prioritized per case with an average review time of 20 s per variant (100 s per case), we can estimate the cost and benefit of this method at a larger scale. Considering a health system with 10,000 patients analyzed through RENEW, it will take 10^6 s to review the resulting 50,000 variants every 6 months. This equates to approximately 272 total hours, or approximately one work week for a group of 7 experts. Based on the 2-year experience with RENEW at our institution, we expect ~0.9% newly identified causal findings, corresponding to 90 patients with newly resolved clinical genetics diagnoses in this example. Arguments can be made on either side as to whether dedicating just 7 variant curators for a week is cost-effective for resolving 90 patients’ conditions every 6 months.
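The extrapolation above can be reproduced with a short calculation. This is a sketch only: the function name `review_workload` and its parameters are illustrative, and the inputs are the rounded figures quoted in the text (5 variants per case, 20 s per variant, 7 reviewers, and an expected yield of about 0.9%).

```python
def review_workload(n_cases, variants_per_case, seconds_per_variant,
                    n_reviewers, expected_yield):
    """Estimate the review effort and expected new diagnoses for one cycle."""
    total_variants = n_cases * variants_per_case
    total_seconds = total_variants * seconds_per_variant
    total_hours = total_seconds / 3600
    return {
        "total_variants": total_variants,
        "total_seconds": total_seconds,
        "total_hours": total_hours,
        "hours_per_reviewer": total_hours / n_reviewers,
        "expected_new_diagnoses": n_cases * expected_yield,
    }


# Rounded inputs taken from the text; the yield is the approximate 0.9% figure.
estimate = review_workload(
    n_cases=10_000,
    variants_per_case=5,
    seconds_per_variant=20,
    n_reviewers=7,
    expected_yield=0.009,
)

for name, value in estimate.items():
    print(f"{name}: {value:,.1f}")
```

With these rounded inputs the total is 1,000,000 s, or roughly 278 hours (about 40 hours per reviewer), which is in line with the approximate figure quoted above and with the description of one work week for a group of 7 experts.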
Table 4
Historical series of RENEW analyses indicating the percentage of new diagnoses identified.
Date of RENEW reanalysis | Number of cases | New diagnoses | Percentage of new diagnoses | Total review time |
Oct-20* | 589 | 3 | 0.65% | 25.14 h |
Apr-21 | 461 | 7 | 1.52% | 15.49 h |
Oct-21 | 665 | 5 | 0.75% | 20.65 h |
Apr-22 | 1066 | 4 | 0.38% | 32.06 h |
*First use of RENEW. Additional changes to optimize the analysis were added in later iterations, reflecting differences in total time per case.
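The per-case review time mentioned in the Table 4 footnote can be derived directly from the table. The sketch below uses only the case counts, new diagnoses, and total review times listed in Table 4; the variable names are illustrative, and the summed totals count case reviews per iteration, since the same unsolved case may be reanalyzed in more than one iteration.

```python
# (date, number of cases, new diagnoses, total review time in hours), from Table 4
table4 = [
    ("Oct-20", 589, 3, 25.14),
    ("Apr-21", 461, 7, 15.49),
    ("Oct-21", 665, 5, 20.65),
    ("Apr-22", 1066, 4, 32.06),
]

# Average review time per case, in seconds, for each iteration.
seconds_per_case = {
    date: hours * 3600 / n_cases for date, n_cases, _, hours in table4
}

# Totals across all four iterations.
total_case_reviews = sum(n_cases for _, n_cases, _, _ in table4)
total_new_diagnoses = sum(n_new for _, _, n_new, _ in table4)
total_hours = sum(hours for _, _, _, hours in table4)
hours_per_new_diagnosis = total_hours / total_new_diagnoses

for date, seconds in seconds_per_case.items():
    print(f"{date}: {seconds:.0f} s per case")
print(f"Review hours per new diagnosis: {hours_per_new_diagnosis:.1f}")
```

On these figures the average review time per case falls from about 154 s in the first iteration to about 108 s in April 2022, and across the four iterations roughly 5 hours of expert review were spent per new diagnosis.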
In addition to identifying new diagnoses, it was recognized early in the development of RENEW that several of the prioritized variants were good candidates to explain the patient’s phenotype, but the evidence was still insufficient at that point to reach a confident conclusion (e.g. variants in emerging gene-phenotype associations). While beyond the scope of the RENEW process, these variants were collated as “Interesting Findings”; they represent rich targets for subsequent in-depth study and may ultimately lead to additional solved cases. In this example, 63 (6.6%) cases fell into this category and have been assigned to another study for extensive investigation and research profiling. In this sense, the RENEW approach could also be considered a screening tool for identifying candidate cases to which limited research resources can be allocated.
In summary, we present here a novel approach to reannotate and efficiently analyze genomic data by focusing on new information being introduced in gene and variant databases. RENEW combines automated annotation and prioritization steps with a rapid review process to efficiently identify relevant genetic findings that lead to new genetic diagnoses and variant reclassification. As clinical genetic testing increases in scale and scope, strategies such as RENEW will be critical to meeting the growing burden of data analysis and interpretation over time and to providing patients with timely, high-quality care.