Mathematical model validation
Descriptive statistics and frequency analyses of the allelic scores of premutated samples are shown in Table 1. The median allelic scores of PM alleles did not differ significantly between sets (Kruskal-Wallis test: H = 1.45; df = 2; p = 0.484) (Supplementary Fig. 1d), whereas the median allelic scores of normal-sized alleles did (Kruskal-Wallis test: H = 33.20; df = 2; p < 0.001) (Supplementary Fig. 1b). To compare the behavior of PM samples with that of samples previously analyzed using the same mathematical model, the samples from that publication were used as a reference (Rodrigues et al. 2020). A distribution into four quadrants, separated by an allelic score of 150, was obtained in all sets, similar to that obtained in the reference set and revealing the same distinct behaviors (Fig. 1): samples whose alleles show a similar complexity (equivalent group, quadrants 1 and 3) (Fig. 1a) and samples whose alleles differ in complexity (dissimilar group, quadrants 2 and 4) (Fig. 1b). The correlation between the allelic scores of the two alleles (Fig. 1a and b) was therefore described by a logarithmic model. Significant correlations were found in both groups in all sets: reference set – equivalent group: r = 0.551; df = 71; p < 0.0001 and dissimilar group: r = 0.466; df = 54; p < 0.0001 (Fig. 1a and b, represented by circles); set 1 – equivalent group: r = 0.994; df = 8; p < 0.0001 and dissimilar group: r = 0.991; df = 6; p < 0.0001 (Fig. 1a and b, represented by squares); set 2 – equivalent group: r = 0.933; df = 26; p < 0.0001 and dissimilar group: r = 0.938; df = 27; p < 0.0001 (Fig. 1a and b, represented by lozenges); and set 3 – equivalent group: r = 0.912; df = 187; p < 0.0001 and dissimilar group: r = 0.882; df = 297; p < 0.0001 (Fig. 1a and b, represented by triangles).
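The grouping by the allelic score of 150 can be illustrated with a minimal Python sketch. It reflects one reading of the text, namely that a sample belongs to the equivalent group when both allelic scores fall on the same side of 150 and to the dissimilar group otherwise. The function name, the treatment of a score of exactly 150, and the example pairs are assumptions made for illustration only; they are not taken from the study data or methods.

```python
THRESHOLD = 150  # allelic score separating the quadrants, as stated in the text


def complexity_group(score_1, score_2, threshold=THRESHOLD):
    """Return 'equivalent' if both scores lie on the same side of the threshold,
    otherwise 'dissimilar'."""
    # Assumption: a score exactly equal to the threshold counts as "high".
    high_1 = score_1 >= threshold
    high_2 = score_2 >= threshold
    return "equivalent" if high_1 == high_2 else "dissimilar"


# Hypothetical (score 1, score 2) pairs, for illustration only.
pairs = [(205, 217), (49, 63), (205, 83), (16, 231)]
for s1, s2 in pairs:
    print(s1, s2, complexity_group(s1, s2))
```

The sketch deliberately returns only the group and not a quadrant number, since the numbering of the individual quadrants is defined by Fig. 1 rather than by the text.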
The allelic score grew exponentially, particularly in alleles with more than two AGGs (Supplementary Table 3), owing to the weight the mathematical model gives to the AGG number. For instance, samples with three AGGs show scores above 700 (n = 6, reference set, and n = 8, set 3; represented by a grey circle in Fig. 1a and b). To validate the mathematical model in expanded alleles, a covariance analysis comparing the logarithmic models of the reference set and the PM sample sets was performed separately for each group (Supplementary Fig. 2). Supplementary Table 4 shows the individual models resulting from each set. Coincident regression lines demonstrate the absence of statistically significant differences between the equivalent and dissimilar groups of the PM sample sets and those of the reference set. This result supports a more robust model including observations from the four sets: equivalent group – F (6, 300) = 1.8278; p = 0.0934: Score 2 = -238.3 + 87.4 × ln(score 1); and dissimilar group – F (6, 392) = 1.0679; p = 0.3812: Score 2 = 573.9 - 88.4 × ln(score 1).
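To make the two pooled models concrete, the following Python sketch evaluates Score 2 from Score 1 using the coefficients reported above. Only the coefficients come from the text; the function name, the dictionary layout, and the example input of 205 are illustrative assumptions, and the sketch is not the software used for the analysis.

```python
import math

# Pooled coefficients reported in the text: score 2 = a + b * ln(score 1)
MODELS = {
    "equivalent": (-238.3, 87.4),
    "dissimilar": (573.9, -88.4),
}


def predict_score_2(score_1, group):
    """Evaluate the pooled logarithmic model for the given group."""
    if score_1 <= 0:
        raise ValueError("score 1 must be positive to take its logarithm")
    intercept, slope = MODELS[group]
    return intercept + slope * math.log(score_1)


# Hypothetical input: a score 1 of 205, chosen for illustration only.
for group in MODELS:
    print(group, round(predict_score_2(205, group), 1))
```

For a score 1 of 205, the equivalent-group model gives a score 2 of about 227 and the dissimilar-group model about 103, which shows the opposite directions of the two fitted relationships.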
Table 1
Summary of the FMR1 allelic complexity (allelic score) results
| | A1 - Shorter CGG repeat length allele | | | A2 - Longer CGG repeat length allele | | |
| | Set 1 | Set 2 | Set 3 | Set 1 | Set 2 | Set 3 |
| Number of alleles | 40 | 118 | 996 | 40 | 118 | 996 |
| Allelic score: mean (± S.D.) | 159.0 ± 66.5 | 165.8 ± 79.8 | 159.7 ± 105.5 | 152.7 ± 81.8 | 150.2 ± 81.7 | 131.0 ± 61.6 |
| Allelic score: median | 205 | 207 | 193 | 217 | 214 | 109 |
| Allelic score: range | 49–206 | 16–234 | 9–829 | 56–242 | 63–288 | 55–313 |
| Allelic score: most frequent (%, n) | 205 (30%, n = 6) | 223 (27.1%, n = 16) | 205 (34.9%, n = 174) | 231 (15%, n = 3) | 133 (5.1%, n = 3) | 103 (2.4%, n = 12) |
| | 206 (25%, n = 5) | 207 (18.6%, n = 11) | 189 (12.7%, n = 63) | 217 (15%, n = 3) | 83 (5.1%, n = 3) | 100 (2.4%, n = 12) |
| Allelic scores Kruskal-Wallis test | H = 33.20; df = 2; p < 0.001 | | | H = 1.45; df = 2; p = 0.484 | | |
S.D. = Standard deviation; % = Frequency; n = Number of alleles
p values represent significance levels between the allelic scores of sets 1, 2 and 3; Multiple Comparison (Dunn's Method) results in Supplementary Fig. 1b and d
Data published in Villate et al. 2020 (set 1), Allen et al. 2018 (set 2) and Yrigollen et al. 2014a (set 3)
In summary, the validation of our mathematical model in females with FMR1 expansions showed that this model can be applied in populations that exhibit varied genotypic characteristics, namely premutations.
FMR1 allelic scores and age at amenorrhea association
To understand the impact of the FMR1 allelic score on the age at amenorrhea (set 2 samples; n = 58), normal and PM alleles were analyzed separately. No significant correlation was observed between the normal allelic score and age at amenorrhea (p > 0.05) (Supplementary Fig. 3a, c, e and g). The same was observed for the PM allelic score (p > 0.05) (Supplementary Fig. 3b, d, f and h). The lack of statistical significance is probably linked to the reduced number of samples, since few samples remain in each quadrant after they are separated according to the distinct behaviors (Bonett and Wright 2000). A nearly significant trend (p = 0.058) is apparent between the normal allelic score and age at amenorrhea in samples with normal allelic scores of 206–234 (Supplementary Fig. 3a) and 16–68 (Supplementary Fig. 3e) (quadrants 1 and 3, respectively). The influence of normal-sized FMR1 alleles on ovarian reserve is controversial. Gleicher et al. demonstrated a negative effect of alleles with fewer than 26 CGGs on ovarian reserve, evidenced by low levels of anti-Müllerian hormone (AMH) (Gleicher et al. 2009, 2015; Maslow et al. 2016). Wang et al. demonstrated a reduction in FMR1 mRNA levels in granulosa cells of females with low repeats (CGGs < 26), misregulating the expression of steroidogenic enzymes and hormone receptors and leading to ovarian dysfunction and, ultimately, infertility (Wang et al. 2017). Rehnitz et al. showed that females carrying alleles with < 26 CGGs had a poor response to ovarian stimulation and elevated expression of FMR1 mRNA in granulosa cells when compared with samples with different FMR1 gene sub-genotypes (Rehnitz et al. 2018). Two distinct behaviors were observed: age at amenorrhea rises with increasing allelic score (above 200, quadrant 1, mean age at amenorrhea 40 ± 8.5 years, Supplementary Fig. 3a), and age at amenorrhea decreases with increasing allelic score (below 70, quadrant 3, mean age at amenorrhea 38 ± 8 years, Supplementary Fig. 3e).
Interestingly, the majority of these samples have alleles with fewer than 26 CGGs (78.6%, n = 11), with one or no AGGs (71.4%, n = 10, and 28.6%, n = 4, respectively), whereas those with higher allelic scores have alleles ranging from 29 to 32 CGGs, with two AGGs. Lekovich et al. demonstrated that premutated females with no or one AGG had a poorer ovarian reserve than those with two AGG interruptions, suggesting a protective effect (Lekovich et al. 2018). It is thus tempting to speculate that, by a similar mechanism, the absence of AGGs in normal alleles correlates with ovarian dysfunction.
Age of amenorrhea assessment by allelic scores combination
PM alleles within the range of 70–100 CGGs are known to carry an increased risk of FXPOI (Ennis et al. 2006; Mailick et al. 2014; Allen et al. 2021); however, not all carriers develop the disease, and the underlying mechanisms remain poorly understood. This led us to ask whether FXPOI development could be associated with a combined effect of FMR1 allelic complexity. To analyze the joint effect of normal and premutated allelic scores on the age at amenorrhea, a contour plot was generated. Overall, different trends were observed: menopause age approaches normal (mean 51 years, range 40 to 60 years) when the allelic scores of both alleles increase or decrease, showing that balanced allelic scores have minimal impact on early amenorrhea. Deeper analysis of samples with a mean age at amenorrhea below 40 years and premutated allelic scores between 70 and 123 shows that age decreases with increasing normal allelic scores (Fig. 2, normal allelic scores between 50–55). Although statistical significance was not reached, the trend towards a correlation with the allelic score of the normal allele suggests the need for larger-scale investigations to assess the impact of the combination of allelic scores on the age at amenorrhea. Furthermore, the age at amenorrhea alone may not provide a comprehensive assessment of FXPOI development. It is therefore important to test other clinical indicators of FXPOI, such as FSH levels, to gain a deeper understanding of the impact of combining allelic scores on disease development.