Mendelian is a UK-based health data analytics company focused on shortening the diagnostic odyssey of rare and hard-to-diagnose diseases. Mendelian has developed a digital case-finding tool, “MendelScan”, that can analyse structured clinical vocabulary, such as SNOMED CT codes [17], from primary care electronic health records (EHR) and highlight patterns of data that correspond to an increased likelihood of the patient being affected by certain RDs. This enables the identification of those at risk and assists their clinician in accessing the correct diagnostic pathway. The “MendelScan” system is summarised in Figure 1.
The pilot study took place between January 2019 and October 2020. The primary objective was to assess the feasibility of applying “MendelScan”, with seventy-six rare disease algorithms (see Appendix 1), at a small scale in a primary care environment, the Lower Lea Valley (LLV) primary care GP Federation.
The process for delivering “MendelScan” into the selected primary care federation involved establishing agreements, deploying the algorithms on a pseudonymised data set, manually reviewing the EHR identified by the algorithms, delivering the reports to GPs and collecting their feedback. Figure 2 summarises the implementation process.
2.1 Primary care EHR access
2.1.1 Ethics and information governance
To enable data access and confidence in this study, an independent ethical analysis of this approach was commissioned [18]. Building on the outcome and recommendations of this report, and in compliance with information governance legislation, a data sharing agreement (DSA) was agreed between stakeholders (Mendelian, MedeAnalytics, East & North Hertfordshire NHS Trust and Lea Valley primary care network). The DSA is a contract that stipulates the rules regarding usage and handling of data. Finally, a Data Protection Impact Assessment (DPIA) was drafted, identifying and minimising the data protection risks of the project [19].
2.1.2 Data Transfer
Data transfer involved MedeAnalytics creating a data set of patients’ EHR with personal identifiers removed. Names and addresses were removed, and the records of individuals who had opted out of sharing through the national data opt-out were excluded. For each remaining EHR, a pseudonym with a unique numeric identifier was created. This pseudonymised dataset of records was sent to Mendelian for analysis.
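The pseudonymisation step described above can be sketched as follows. This is a minimal illustration only, not the data processor's actual pipeline: the field names (`patient_id`, `name`, `address`, `opted_out`), the 12-digit pseudonym range and the idea of a key table retained by the data controller are all assumptions made for the example.

```python
import secrets

# Hypothetical field names; the extract schema is not described in the text.
IDENTIFIER_FIELDS = ("name", "address")
INTERNAL_FIELDS = ("patient_id", "opted_out")


def pseudonymise(records):
    """Return (pseudonymised_records, key_table).

    Records of opted-out patients are dropped, direct identifiers are
    removed, and each remaining record receives a unique numeric pseudonym.
    The key table (pseudonym -> original patient id) is assumed to stay
    with the data controller and is never sent for analysis.
    """
    pseudonymised, key_table = [], {}
    for record in records:
        if record.get("opted_out"):
            continue  # national data opt-out: exclude the whole record
        pseudonym = secrets.randbelow(10**12)
        while pseudonym in key_table:  # re-draw in the unlikely event of a clash
            pseudonym = secrets.randbelow(10**12)
        key_table[pseudonym] = record["patient_id"]
        cleaned = {
            field: value
            for field, value in record.items()
            if field not in IDENTIFIER_FIELDS and field not in INTERNAL_FIELDS
        }
        cleaned["pseudonym"] = pseudonym
        pseudonymised.append(cleaned)
    return pseudonymised, key_table
```

In this sketch only the first return value would leave the data controller; the key table is what would later allow a flagged pseudonym to be matched back to a patient's full EHR.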
2.2 Algorithm deployment
Not all of the 7,000 to 9,000 rare diseases are appropriate for “MendelScan”. Mendelian developed an approach to stratify which rare diseases are more likely to be suitable for primary care records analysis, in the following steps:
1. Analysing the suitability of an RD using a scoring system based on features of the disease, the benefit of early diagnosis and the likelihood that relevant clinical characteristics would be captured in the primary care EHR.
- We developed a scoring system to identify rare diseases with a suitable profile for primary care electronic health records analysis. Scoring variables and metrics used are given in Table 1.
2. Performing a systematic literature review, searching for peer-reviewed screening or diagnostic criteria for the selected RD.
- We developed a scoring system for the identified criteria to assess their suitability for primary care electronic health records analysis. Scoring variables and metrics used are given in Table 2.
3. Digitising the selected criteria into a numeric algorithm using structured data (SNOMED CT codes), based on a combined score across several individual data points from the EHR. Data points are given in Table 3. We did not interrogate data held in unstructured formats (free text) such as letters or consultation notes.
The “MendelScan” case-finding tool checked the seventy-six disease algorithms against the pseudonymised EHR data extracts, flagging patients who met the algorithms’ criteria of being at risk of one of the RDs. The structured EHR of patients flagged by “MendelScan” as at risk of a rare disease were then reviewed by a clinician and returned to the GP if a plausible alternative explanation for the clinical features could not be found.
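As an illustration of what a combined score across several data points could look like, the sketch below sums a weight for each criterion matched by at least one code in a patient's record and flags patients at or above a threshold. It is not the “MendelScan” implementation: the `Criterion` structure, the weights, the threshold and the placeholder code strings are all hypothetical, and real algorithms would use the data points listed in Table 3.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    """One data point of a digitised criterion (hypothetical structure)."""
    codes: frozenset  # codes that satisfy this data point
    weight: float     # contribution to the combined score


def score_patient(patient_codes, criteria):
    """Sum the weights of all criteria matched by at least one patient code."""
    return sum(c.weight for c in criteria if c.codes & patient_codes)


def flag_patients(ehr_extract, criteria, threshold):
    """Return the pseudonyms whose combined score meets the threshold.

    ehr_extract maps pseudonym -> set of structured codes in that record.
    """
    return [
        pseudonym
        for pseudonym, codes in ehr_extract.items()
        if score_patient(codes, criteria) >= threshold
    ]


# Placeholder strings stand in for SNOMED CT concept identifiers.
example_criteria = [
    Criterion(frozenset({"CODE_A", "CODE_B"}), 2),
    Criterion(frozenset({"CODE_C"}), 1),
    Criterion(frozenset({"CODE_D"}), 1),
]
example_extract = {
    101: {"CODE_A", "CODE_C"},  # combined score 3
    102: {"CODE_D"},            # combined score 1
    103: {"CODE_B", "CODE_X"},  # combined score 2
}
flagged = flag_patients(example_extract, example_criteria, threshold=3)
```

A weighted sum with a threshold is only one way to combine data points; it is used here because it keeps the contribution of each data point easy to audit during manual review.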
2.3 Internal review of identified cases’ EHR
We performed an anonymous, two‐round manual review process for each EHR identified by any of the seventy-six algorithms deployed. In round one, a medical doctor reviewed each EHR and assigned to each case one of three outcomes:
- Rule-in: The medical doctor considers that there is enough clinical evidence to suspect the highlighted RD for this case.
- Rule-out: The medical doctor considers there are other diagnoses recorded in the EHR that explain the highlighted features.
- Already diagnosed: The highlighted EHR has a diagnostic code for the same RD it was screened for.
In round two, rule-in cases were further reviewed by a GP, geneticist or an expert in a particular rare disease and further assigned a rule-in or rule-out outcome. For each rule-in case, a patient report was generated and sent to their GP practice. The review process is summarised in Figure 3.
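The two-round review logic can be expressed compactly. The following is a sketch under the assumption that only round-one rule-in cases proceed to round two, as described above; the function and enum names are illustrative and not part of the study's tooling.

```python
from enum import Enum


class Outcome(Enum):
    RULE_IN = "rule-in"
    RULE_OUT = "rule-out"
    ALREADY_DIAGNOSED = "already diagnosed"


def final_outcome(round_one, round_two=None):
    """Combine the two review rounds into a single outcome.

    Only round-one rule-in cases go to round two, where the second
    reviewer assigns either rule-in or rule-out.
    """
    if round_one is not Outcome.RULE_IN:
        return round_one  # rule-out and already-diagnosed cases stop here
    if round_two not in (Outcome.RULE_IN, Outcome.RULE_OUT):
        raise ValueError("round two must be rule-in or rule-out")
    return round_two


def needs_report(round_one, round_two=None):
    """A patient report is generated only for cases ruled in at both rounds."""
    return final_outcome(round_one, round_two) is Outcome.RULE_IN
```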
2.4 Returning reports to GP
A report for each of the ‘rule-in’ patients was returned to their GP by email. The report included the unique patient identifier, to enable matching to the patient’s full EHR, an explanation of the condition, the reasons why this patient was flagged, and suggested next steps.
2.5 GP feedback on reports
Feedback from the GP was requested at two stages. The first, ‘patient report feedback’, was requested as soon as the GP completed evaluating the patient’s report and EHR. This consisted of an online questionnaire, accessed through a link on each patient report (Appendix 2).
The first question asked the GP for the main outcome of the report (see Table 4).
The second, ‘patient outcome feedback’, was requested 3 months later and asked for the results for those patients advanced for further evaluation (see Figure 4).