Large-scale data (big data) analysis, which in the case of a rare disease must be adapted to the number of cases available, is a new tool that has recently been incorporated into biomedical activity. This methodology is especially useful for obtaining pooled information on the diversity of outcomes and for identifying prognostic factors potentially related to disease complications (32). In rare disease research, this is of particular interest because the data are scarce and dispersed across different centers (33). Various approaches have been applied in the area of rare diseases, especially in the search for genetic associations (34) and in correlating genotype with phenotype (35).
Registries play an important role in this kind of analysis because they include complete information about patients, which is particularly valuable in rare disease research. The information collected helps in diagnosis, patient management, treatment strategy planning, health care planning and follow-up. It accelerates research and opens new pathways for personalized medicine (36, 37).
This study is the first attempt to establish a correlation network among different biochemical and clinical characteristics in a nationwide cohort. We aimed to analyse diagnostic data and to relate them to long-term complications such as bone crises and the development of neoplasia or PD, which are the most common and disabling complications (38–41).
Two observations already accepted in Gaucher research were also confirmed in this machine-learning study: first, splenectomized patients have a higher risk of presenting more serious and extensive bone disease; second, almost all patients with a new bone crisis, despite having received long-term ERT, had previous bone lesions, which reminds us that the most feared complication in GD1 is not resolved merely by starting ERT. These two findings confirm previous reports and support the validity of our analysis (39, 42–44). In addition, genotypes other than homozygous NM_000175.4:c.1226A > G are significantly correlated with bone disease (p = 0.05). This last observation is in line with the observation that the c.1226A > G variant confers a mild phenotype (45, 46).
It is a priority to identify accurate risk factors for bone crisis in order to improve treatment dosage and to avoid this complication. The standard biomarkers related to GD (ChT activity, CCL18/PARC and GluSph concentrations) have been discarded as risk factors for bone complications (47, 48), even though their concentrations increase during a bone crisis because of the acute inflammatory event (42, 46). This reminds us of the importance of continuing to search for other biomarkers. Our results confirm the lack of association between these biomarkers and disease outcomes. Other biomarkers, such as high levels of ferritin, show a trend in patients with advanced bone disease, although it was not statistically significant.
Surprisingly, a high serum IgA concentration correlates with the degree of bone involvement and with the development of bone crisis (p = 0.001). The age at onset of treatment (mean 30.6 years) is also clearly relevant to the occurrence of bone crises (p = 0.01).
In this study, the development of malignancies appears strongly correlated with a later age at the start of ERT (p < 0.01) and with an increased concentration of IgG (p = 0.01). Many aspects of the complexity of the immune system remain to be unraveled, but aging is an important factor clearly related to humoral immune dysfunction and the appearance of malignancies (49). Polyclonal and monoclonal gammopathies are common in GD patients (50), and we observed a significant correlation between high levels of IgG and the appearance of neoplasia (51). However, the origin of these alterations is not fully understood. They have been attributed to the chronic inflammatory state and related to an increase in the levels of inflammatory cytokines such as interleukins (IL-6, IL-10), which could lead to an overproduction of immunoglobulins (50, 51). Another hypothesis is that B lymphocytes are activated by specific type II natural killer T lymphocytes with a T follicular helper profile, and that the clonal immunoglobulin in GD patients and in mouse models of GD is reactive against GluSph (52).
The identification of IgA levels as a risk factor for complications was a surprising finding; IgA levels have not previously been reported to be related to severe bone disease or to the presence of repeated bone crises in GD.
The SGDR only includes GD patients from Spain; thus, the main limitation of the study is the absence of a larger data set. Despite this, the included data reflect the characteristics of the disease in this country. It would be of interest to validate these findings in other populations with a greater number of patients; however, taking into account the homogeneity of the series and its single-country origin, the data are solid.