Diagnostic delay in rare diseases: a documented list of (296) rare diseases for which delayed diagnosis would be especially detrimental, based on the French situation

A timely diagnosis is a critical step in ensuring proper access to expert clinical management for patients. However, diagnosing rare diseases (RD) is a major challenge, as these diseases are extremely diverse in their expression, cause, semiology and nosology. Today, the development of digital technologies offers genuine opportunities for improving diagnosis and care in a sector with urgent needs. However, developing and testing digital solutions would only be possible for a limited number of RD. The approach presented in this article aims at proposing an ethical and rational way of defining a subset of “priority” rare diseases to focus on, based on pathologies for which an established and effective standard of care management is defined. Two types of management were considered: the existence of a medicinal product specifically targeting the disease; and / or the existence of authoritative clinical guidelines in France. Our work led to the establishment of a list of 251 RD for which a delayed diagnosis would be especially detrimental. It remains to be established whether the diagnosis of these RD is especially delayed before targeted initiatives are set up to reverse the situation. Clarifying choices when taking initiatives to develop solutions in a field with so many unmet needs is an element of an ethical approach.


Introduction
Diagnosing rare diseases is a major challenge. Rare diseases (RD), whose definition is based on prevalence, are not only numerous (more than 7000 are described, mostly with a genetic origin) but also extremely diverse in their expression, cause, semiology and nosology and, for most of them, the natural history is not firmly established. Many rare diseases share symptoms with "common" diseases. Despite increasing knowledge and new imaging, biological and molecular technologies, diagnosis remains difficult, even for the best experts.
As a consequence, the long lag time between first symptoms and diagnosis is identified as a key problem to be fixed, one that patient organizations in particular have pointed out (1). A timely diagnosis is a critical step in ensuring proper access to expert clinical management for patients suffering from rare diseases. Currently, this delay is thought to be unacceptably long, and amenable to improvement if appropriate measures are undertaken. The reasons for such a delay are diverse and cumulative.
A delayed diagnosis can occur because the patient did not consult early enough, or because the symptoms are too non-specific or uncommon for the disease to be considered. A delay can also occur because scientific knowledge is still limited, or because all investigations have been performed without any conclusive result. These delayed diagnoses are not avoidable at a given time point.
In contrast, the determinants of delay that lie within healthcare systems could be totally or partially addressed. These may include health professionals' lack of awareness of and experience with RD, difficulties in referring patients to expert centers, a lack of specialized centers or centers that are too distant, understaffed expert centers, or limited access to genomic services.
Many initiatives have been taken in the past to address these issues. In Europe, Orphanet was specifically established in 1997 to disseminate information on RD and expert resources. In 2004, the French government adopted the first Public Health Plan for rare diseases, including the establishment of a network of expert centers in academic hospitals and many other initiatives likely to contribute to a better diagnosis of RD (2). A recommendation of the Council of European Ministries was adopted in 2010, urging all European countries to set up a national plan or strategy for RD before 2014, a recommendation followed by most countries (3; 4).
With the progressive availability and affordability of Next-Generation Sequencing (NGS) technologies, the debate around solutions for the diagnosis of rare diseases focused on access to sequencing technologies and on accelerating the identification of disease-causing genes by involving all undiagnosed patients in research protocols (5; 6). Improving the diagnosis of RD is an enormous challenge for public and private actors, as it is a multifaceted phenomenon, encompassing all aspects of medicine. Today, however, the development of digital technologies offers genuine opportunities for progress: for patients and their caregivers, with new tools and options for dealing with their condition; for healthcare professionals, with tools supporting their daily administrative, medical and research duties; for healthcare systems, with tools to optimize care. The rare disease sector has urgent needs, and the community is organized and dedicated enough to quickly adopt innovations which could improve patients' quality of life.
In this context, developing eHealth solutions that are specifically suited to RD was identified by Sanofi France, in partnership with Orange Healthcare, as an obvious priority for its open innovation initiative, named UniR. A group of stakeholders, including patients, patient associations, healthcare professionals, researchers, administrators, healthcare and digital specialists, was invited, in 2018, to identify tangible eHealth solutions to reduce diagnostic delay. This open innovation initiative brought together 16 experts in rare diseases along with six representatives from national centers of expertise and four patient associations for rare diseases. After 30 individual interviews and three workshops, the group identified 13 obstacles that are sources of diagnostic delay, and suggested 14 digital-based solutions to reduce such delays. The outcome of this brainstorming exercise is published as a white book (7).
While deciding which potential solutions to develop in the short term, the idea emerged of defining, for practical reasons, a subset of rare diseases to focus on. As testing these solutions would only be possible for a limited number of RD, the most ethical approach would be to prioritize RD for which a delayed diagnosis would be especially detrimental, because an established and effective standard management is already defined (including clinical guidelines and / or a specific drug). It is this approach that is now described. This choice does not imply that an absence of diagnosis, or a very late diagnosis, is not detrimental in the context of other diseases. Of course, it is detrimental for all of them. The current approach simply aims at proposing a rational way of choosing the rare diseases on which pilot projects will be tested, thereby addressing the most urgent needs.

Materials And Methods
Definitions and sources. In an attempt to define a subset of "priority" rare diseases to focus on, it was decided to concentrate on missed opportunities for patients affected with a pathology for which an established and effective standard management is defined. This required the identification of established specific care management options. Two main types of care management have been retained in this work: the existence of a medicinal product specifically targeting the disease; and / or the existence of authoritative clinical guidelines.
A targeted medicinal product was defined as a medicinal product with a Marketing Authorization (MA) with a designation for one or more RD (Orphan drugs and non-Orphan drugs), or a product in development available as part of an Authorization for Temporary Use (ATU) in France. These authorizations are given, prior to the granting of an MA, for the exceptional use of experimental pharmaceutical products that do not have an MA, for patients who cannot be included in a clinical trial (8). Two open access sources of information were used: the list published by Orphanet of Orphan (OD) and non-Orphan (NON-OD) drugs intended for RD and with a Marketing Authorization in the European Union (EU) as of July 2017 (Source #1) (9); and the list of drugs with an Authorization for Temporary Use (ATU) in France with an OD designation as of November 2017 (Source #2) (10; 11; 12).
Regarding authoritative clinical guidelines, we considered the protocols elaborated either by the French National Authority for Health (Haute Autorité de Santé, HAS) or by the French Rare Disease networks (FSMR) following the methodology elaborated by the HAS.
These protocols are syntheses of published good practices about a rare disease, or a group of rare diseases, followed by recommendations. Their objective is to guide healthcare professionals (HCP) towards optimal diagnostic and therapeutic management. Two open access sources of information were used: the list of National Diagnosis and Care Protocols (NDCP) published by the HAS (Source #3) (13); and the list of NDCPs written or being drafted by the 23 FSMR according to their websites (Source #4) (14).
Finally, the identified pathologies were matched with the Orphanet nomenclature database (Source #5) (15). The details of the information sources used in this work are available in the Supplementary Information section.

Methodology.
The methodology designed to establish this list of pathologies was based on four main steps (Figure 1): #1: identification of RD for which a commercial drug with an MA is available; #2: identification of RD for which a drug is available as part of an ATU; #3: identification of RD with an NDCP that is published or being drafted; #4: merger, removal of duplicates and mapping of pathologies to the Orphanet nomenclature.
#1: Identification of RD for which a commercial drug with an MA is available. The "source #1" tables encompassed 256 drug entries: drugs with Orphan Drug (OD) designation (98 entries) and drugs without Orphan Drug (NON-OD) designation (158 entries) (9). For each drug entry, the Marketing Authorization description was manually processed to extract the names of the RD targeted, resulting in 371 "drug x RD" entries. Duplicates were removed using the Excel automatic tool, then manual processing (107 duplicates merged, 264 unique RD entries remaining). Rare cancers were discarded from the final table (97 cancer entries discarded, 167 RD entries remaining) as they are not considered for the production of clinical guidelines and are supported outside the rare disease networks. Conditions linked to the administration of medicinal products were also excluded: anthracycline extravasation, methotrexate toxicity and hepatitis B reinfection following liver transplantation (3 RD entries discarded, 164 RD entries remaining).
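The entry counts reported for step #1 can be re-derived with a few subtractions. The sketch below is only a bookkeeping check on the figures quoted above; the function and variable names are illustrative and are not part of the original workflow, which relied on Excel and manual processing.

```python
def remaining_after(start, discard_steps):
    """Return the running count of entries after each discard step."""
    counts = []
    current = start
    for label, discarded in discard_steps:
        current -= discarded
        counts.append((label, current))
    return counts


# Figures quoted in step #1 of the methodology.
drug_entries = 98 + 158  # OD entries + NON-OD entries in source #1
discard_steps = [
    ("duplicates merged", 107),
    ("rare cancers discarded", 97),
    ("conditions linked to medicinal products discarded", 3),
]
step1_counts = remaining_after(371, discard_steps)

print(f"{drug_entries} drug entries in source #1")
for label, count in step1_counts:
    print(f"{label}: {count} RD entries remaining")
```

Keeping each discard step as a labelled pair makes it easy to see where a reported total would stop adding up if one of the counts were changed.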
#2: Identification of RD for which a drug is available as part of an ATU. To ensure an exhaustive listing of drugs with an ATU available in France as of November 2017, two sources (10; 11) were merged (281 drug entries remaining). Drug products whose ATU already had a set end date were discarded (57 drug entries discarded, 224 drug entries remaining). The table was then compared with the EMA Orphan drug designation table (12), which included all products with an ongoing application for the "Orphan Drug" status by the EMA. Given the difference of language between the sources, the two tables were compared based on the "Active Substance" (66 drug entries matched: 41 automatic matches + 24 manual additional matches). Entries whose "Orphan Drug" designation had a "withdrawn" or "negative" status were excluded (11 drug entries discarded, 55 drug entries remaining). A search of the RD targeted by the 55 products was then carried out in the EMA Orphan drug designation table (column "Disease / condition") (12). Duplicates were manually removed. Finally, rare cancer entries were excluded (6 cancer entries discarded, 68 RD entries remaining).
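The split between automatic and manual matches on the "Active Substance" column can be illustrated as follows. This is a minimal sketch under stated assumptions, not the procedure actually used: the normalization rule (lowercasing, removing accents, collapsing whitespace) is an assumption, the function names are hypothetical, and the substance names are made-up placeholders rather than entries from the ATU or EMA lists.

```python
import unicodedata


def normalize(name):
    """Lowercase, strip accents and collapse whitespace in a substance name."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(no_accents.lower().split())


def match_on_substance(atu_substances, ema_substances):
    """Split ATU substances into automatic matches and a manual-review list."""
    ema_index = {normalize(s): s for s in ema_substances}
    matched, to_review = [], []
    for substance in atu_substances:
        key = normalize(substance)
        if key in ema_index:
            matched.append((substance, ema_index[key]))
        else:
            to_review.append(substance)
    return matched, to_review


# Made-up placeholder names, not taken from the ATU or EMA lists.
atu = ["Substance Alpha", "Substancé  Beta", "Substance Gamma"]
ema = ["substance alpha", "substance beta", "substance delta"]
matched, to_review = match_on_substance(atu, ema)
print(matched)
print(to_review)
```

Names that differ by more than accents or spacing, such as a French and an English spelling of the same substance, would land in the review list, which corresponds to the manual additional matches mentioned above.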
#3: Identification of RD with an NDCP that is published or being drafted. To ensure an exhaustive listing of RD with an NDCP, both sources (16; 14) were merged (104 NDCP entries remaining) and completed with the list of NDCP in the process of drafting and / or planned according to the FSMR websites (160 NDCP entries remaining). For each NDCP entry, the description was processed to extract the names of the targeted RD (160 RD entries).
#4: Merger, removal of duplicates and mapping of pathologies to the Orphanet nomenclature. The three RD tables previously obtained were merged (59 duplicate entries merged, 336 RD entries remaining). For the 336 RD entries, a search for correspondence with the Orphanet nomenclature was carried out. A confidence index was introduced to characterize the degree of certainty of the correspondence (High / Medium / Low): 248 matches with a "High" correspondence (74%), 39 matches with a "Medium" correspondence (12%) and 28 matches with a "Low" correspondence (14%) were found. The list was finally reviewed by an expert on rare diseases, with proposals for modification, grouping or removal of pathologies. An output table including 273 RD entries was finally produced.
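The merger in step #4 can be pictured as combining the three RD tables while remembering which criterion each disease came from, which is also what is needed to count diseases meeting more than one criterion. The sketch below is illustrative only: the function name is hypothetical, the disease names are made-up placeholders, and matching on a lowercased, trimmed name is an assumption that stands in for the manual review and Orphanet mapping described above.

```python
from collections import defaultdict


def merge_with_provenance(tables):
    """Merge per-criterion RD lists, keeping the set of criteria met by each RD."""
    merged = defaultdict(set)
    for criterion, diseases in tables.items():
        for disease in diseases:
            merged[disease.strip().lower()].add(criterion)
    return dict(merged)


# Made-up placeholder disease names; the keys mirror steps #1 to #3.
tables = {
    "MA": ["Disease A", "Disease B"],
    "ATU": ["disease b", "Disease C"],
    "NDCP": ["Disease A ", "Disease D"],
}
merged = merge_with_provenance(tables)
duplicates_merged = sum(len(rows) for rows in tables.values()) - len(merged)
print(len(merged), "unique RD entries;", duplicates_merged, "duplicates merged")
```

Storing a set of criteria per disease, instead of simply dropping duplicate rows, keeps the information needed later to distinguish RD with a drug only, guidelines only, or both.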
Information on each RD (ORPHA number, ICD 10 code, synonyms, inheritance, age of onset and prevalence) was then collected from the Orphanet database for the purpose of producing statistics. The inheritance codes were simplified into three categories: "Genetic origin" encompasses all diseases with a genetic origin, whatever the mode of inheritance. "Partially genetic" includes diseases with a mix of different possible origins, some being genetic, some being acquired. "Non genetic" includes all other diseases, although some of them may have genetic determinants as a minor co-factor. The pathologies were classified into broad categories, following the logic applied in the 11th edition of the International Classification of Diseases (17).
The detailed list of the RD identified in this work is available in the Supplementary Information section.

Results
A total of 273 rare diseases, disorders and conditions were identified as satisfying the criteria of being particularly sensitive to a delayed diagnosis, through the loss of the opportunity to benefit from an appropriate care management option. This list included some infectious diseases (11 RD), which were not considered further as they pose very different problems. It also included isolated major malformations (9 RD), which are quite obvious at birth, as well as trisomy 21, which is now easily diagnosed, and familial patent arterial duct, which does not pose a diagnostic issue. These conditions were excluded from the analysis as irrelevant in the framework of this project.
The final list includes 251 conditions, classified in broad categories (Table 1; Table 2). Inborn errors of metabolism rank high because of the large number of marketed drugs, despite a small number of clinical guidelines. In all categories, the number of RD with both a marketed drug and clinical guidelines is very small (15%).

Table 1: List of rare diseases for which a delayed diagnosis would be especially detrimental, in the context of the study.

The proportion of RD with a genetic origin is 68.9% in this list, comparable to the 75% for the whole set of RD in the Orphanet database (Table 3). Most of the 251 conditions are pediatric disorders (Figure 3), which is similar to what is generally described in RD. Most of the 251 conditions are very rare (74 RD, 41.8%) or ultra-rare (58 RD, 32.7%), as shown by the distribution of prevalence (Figure 4).

Discussion
This work is based on open access data sources. Even if the study gives relevant results, these sources have de facto several limitations. They may suffer from non-completeness. They are not fully interoperable. The Orphanet prevalence data are estimates only and cannot be considered as established. Thus, it is possible that the prevalence is overestimated, as epidemiological studies are generally based on hospital data in regions with higher prevalence (18).
Regarding data processing, the main bias comes from the many manual processing operations performed, particularly because of non-interoperability between sources, as well as their differences in language and in the denomination of pathologies. This bias is particularly important for the manual extraction of RD from MA (source #1) and in the cross-matching between the drugs with an ATU (France) and the OD designations list from the EMA (source #2).
The grouping of conditions is largely arbitrary. The rationale depends on the purpose of the classification, as the same disease can be considered from several angles, such as the main affected function, the medical specialty caring for patients, the pathophysiology at stake, the etiology, etc. (19; 20). For this project, it was decided to stay as close as possible to the ICD 11 classification system, as it is the most recent attempt to establish an international consensus (21). However, many choices are disputable. For example, Neurofibromatosis type 1 is classified as a dermatological disease when it could also be in the developmental anomaly group. Turner and Klinefelter syndromes are considered here as endocrine disorders, when they could also be considered as developmental anomalies. Glucose-6-phosphate dehydrogenase deficiency is in the group of hematological conditions when it could be in the inborn errors of metabolism group. Alpha-1-antitrypsin deficiency is listed here as a hepatological disease but could equally be considered a pneumological disease.
In addition, the dataset is a snapshot of the situation as of January 2018, based on information sources from July 2017 to January 2018. It may not be representative of the current situation.
The work was based on the French situation, in particular concerning the RD for which a drug with an ATU and clinical guidelines are available. An extension of the scope could have been considered to make the list more complete. However, this work was done in order to define a list of RD on which pilots could be tested, starting at national level, because each healthcare system is specific. In particular, the choice of considering only French NDCP is questionable. However, NDCP production was a measure of the first French National Plan for RD (2), assigned to the French National Authority for Health (HAS). At the time of the second national plan for RD, the assignment switched to the national centers of reference, to speed up production, with an obligation to follow the well-defined HAS procedure (22; 23). The limited resources and the burden of the process obliged the RD experts to define priorities, based on the existence of heterogeneous practices and on the existence of specific measures to be taken, an approach very close to the one of this project. This justifies the choice of this criterion in the current study. However, at international level, other organizations produce relevant clinical practice guidelines, such as learned societies and, in Europe, European Reference Networks and other national agencies, such as Highly Specialized Services in the UK (24). They could have been considered as well. If this pilot is successful, other similar projects could emerge in Europe, based on other definitions of authoritative clinical guidelines. An extension to medical products in clinical trials at European and / or international level could also have been considered.
Despite these limitations, this study supports the choice of the two indicators (drugs / clinical guidelines) used for selecting RD to focus on for the development of eHealth diagnostic solutions. The two indicators are very differently distributed among the RD groups (Table 3). In general, most diseases have either a specific drug or clinical guidelines, while only 39 of them benefit from both (Table 2). The existence of clinical practice guidelines for RD is, therefore, a criterion independent of the existence of a targeted new therapy, as half of the prioritized RD in the study were selected due to the existence of clinical guidelines only.

Conclusion
This study led to the establishment of a tentative list of 251 RD for which a delayed diagnosis would be particularly detrimental. It remains to be established whether the diagnosis of these RD can be considered as especially delayed, or not, before setting up targeted initiatives to reverse the situation. The next step could be, for instance, to validate this list using the data from the French national RD database (BNDMR) (25). The time to diagnosis for these diseases will show whether this delay is acceptable, or not, according to the opinion of RD experts. If not, this will clearly indicate that eHealth solutions should be considered in priority for those RD. Clarifying choices when taking initiatives to develop solutions in a field with so many unmet needs is an element of an ethical approach.

Availability of data and materials
All data generated or analyzed during this study are included in this published article and its supplementary information files.

Competing interests
Anne-Sophie Chalandon and Christian Deleuze are Sanofi employees and may own stocks or stock options. Ségolène Aymé and Pierre-Etienne Chazal are external and may own stocks or stock options.

Funding
Sanofi Aventis France will pay the publication fees.

Authors' contributions
PEC: conception of the project, identification of information sources, data extraction and analysis, and writing of the manuscript.
ASC: conception of the project, identification of information sources, and reviewing of the manuscript.
SA: data analysis, review of the RD list from a medical perspective, and writing of the manuscript.
CD: sponsorship for this study provided by Sanofi.

Figure 1
Decision-tree to identify rare diseases for which a delayed diagnosis would be especially detrimental, using existing open access sources of information, in France, on drugs intended for rare diseases and on clinical management guidelines.

Figure 2
Intersections between the criteria used to select rare diseases for which a delayed diagnosis would be especially detrimental, in the context of the study.

Figure 3
Distribution of the age of onset of the rare diseases for which a delayed diagnosis would be especially detrimental, in the context of the study.

Figure 4
Distribution of the classes of prevalence of the rare diseases for which a delayed diagnosis would be especially detrimental, in the context of the study.

Supplementary Files