Although the expanded number of HTT CAG repeats is considered the major determinant of the AO of HD symptoms, other factors act together with the HTT CAG expansions. These factors can be stochastic, environmental or genetic. Although all three may contribute to some extent to the determination of the AO, the genetic factors are more easily studied owing to advances in molecular techniques (Gusella and MacDonald 2009).
Variants in the CAA-CAG sequence downstream of the HTT (CAG)n repeat can influence the AO. The exchange of one adenine nucleotide in the CAA codon (CAA>CAG), which turns the region into an uninterrupted CAG tract, is associated with a dramatically earlier AO than in individuals who carry the interrupting penultimate CAA codon, despite the same polyglutamine length. In addition, another variant in this region, in which the CAA-CAG sequence is duplicated, was associated with later AO. The identification of these cis-acting modifiers has potentially important implications for genetic counselling in HD-affected families (Wright et al. 2019).
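The distinction between polyglutamine length and uninterrupted CAG length can be illustrated with a minimal sketch. The two alleles below are hypothetical toy sequences, not patient data, and the helper names (`split_codons`, `polyq_length`, `longest_pure_cag`) are introduced here only for illustration.

```python
def split_codons(seq):
    """Split an in-frame DNA string into triplets; a trailing partial codon is dropped."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]

GLN_CODONS = {"CAG", "CAA"}  # both codons encode glutamine

def polyq_length(codons):
    """Number of consecutive glutamine codons at the start of the tract."""
    n = 0
    for codon in codons:
        if codon not in GLN_CODONS:
            break
        n += 1
    return n

def longest_pure_cag(codons):
    """Length of the longest run of CAG codons with no interruption."""
    best = run = 0
    for codon in codons:
        run = run + 1 if codon == "CAG" else 0
        best = max(best, run)
    return best

# Hypothetical toy alleles (assumption, not patient data): 42 CAGs followed by
# the penultimate CAA-CAG, or by CAG-CAG when the CAA interruption is lost.
canonical = "CAG" * 42 + "CAACAG" + "CCGCCA"
loss_of_interruption = "CAG" * 42 + "CAGCAG" + "CCGCCA"

for name, allele in [("canonical", canonical), ("CAA>CAG", loss_of_interruption)]:
    codons = split_codons(allele)
    print(name, polyq_length(codons), longest_pure_cag(codons))
```

In this toy case both alleles encode the same number of glutamines, whereas the uninterrupted CAG run is two repeats longer in the allele that has lost the CAA codon.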
One study showed that increased expression of FAN1, a nuclease involved in DNA interstrand cross-link repair, is significantly associated with delayed AO and slower progression of HD, suggesting that FAN1 is protective in the context of an expanded HTT CAG repeat (Goold et al. 2019).
In this investigation, we found that expanded HTT alone accounts for 65.7% of the variation in the AO. Other studies have shown that additional mechanisms can also be involved, such as epigenetic-chromatin deregulation, RNA toxicity and transcription aberrations (Marti 2016; Nalavade et al. 2013). The microtubule-associated protein tau (MAPT), which is involved in several neurodegenerative disorders, has also been implicated in HD (Vuono et al. 2015). It is also important to mention that the expanded huntingtin (mHTT) shows a pleiotropic effect, as it is broadly present in different cellular compartments (e.g. cytosol, nucleus, mitochondria) as well as in all cell types of the human body at all developmental stages (Bassi et al. 2017).
In addition, we investigated the number of TBP CAG / CAA repeats and its influence on the AO of motor symptoms in 72 HD patients. TBP has been suggested to play an important role in HD pathogenesis. The normal form of TBP has been found, along with mHTT, at higher levels in the brains of HD patients than in those of a control group (van Roon-Mom et al. 2002). We found TBP heterozygosity of 84% among individuals who were molecularly negative for HD and 81% among molecularly positive patients. A similar value, 78.5% heterozygosity for the TBP alleles, has been described previously (Tomiuk et al. 2007).
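As a minimal sketch of how such summary values can be obtained, observed heterozygosity is the fraction of individuals whose two TBP alleles differ in repeat number, and the most frequent allele is the mode over all alleles. The genotypes below are hypothetical toy values, not the data of this study, and the function names are assumptions introduced for illustration.

```python
from collections import Counter

def observed_heterozygosity(genotypes):
    """Fraction of individuals carrying two alleles of different repeat length."""
    if not genotypes:
        raise ValueError("at least one genotype is required")
    heterozygous = sum(1 for a, b in genotypes if a != b)
    return heterozygous / len(genotypes)

def most_frequent_allele(genotypes):
    """Most common repeat length over all alleles (ties resolved by first occurrence)."""
    counts = Counter(allele for pair in genotypes for allele in pair)
    return counts.most_common(1)[0][0]

# Hypothetical toy genotypes (assumption): each tuple holds the TBP CAG/CAA
# repeat numbers of the two alleles of one individual.
toy_genotypes = [(36, 38), (37, 37), (36, 36), (35, 38), (38, 38)]

print(observed_heterozygosity(toy_genotypes))
print(most_frequent_allele(toy_genotypes))
```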
One of the first studies to investigate the allelic frequency of TBP CAG / CAA repeats showed that, in Algeria, these repeats varied from 32 to 39, and the most frequent allele had 38 repeats. Black South Africans and Sub-Saharan Africans had alleles with 33 to 39 repeats, and their most frequent allele had 35 repeats. Indian people had CAG / CAA repeats ranging from 27 to 39, and their most frequent allele had 38 repeats (Rubinsztein et al. 1996).
An investigation of the number of TBP CAG / CAA repeats in different diseases (Alzheimer's disease, Parkinson's disease, Huntington's disease, schizophrenia and different ataxias) revealed that the TBP repeats in the group of patients varied from 27 to 46 (mean 36 ± 2; median 36). In the control group, the TBP repeats ranged from 30 to 43 (mean 37 ± 1; median 36) (Wu et al. 2005). In our study, by contrast, molecularly tested HD-negative individuals had 29 to 39 TBP CAG / CAA repeats and the most frequent TBP allele had 36 repeats, whereas molecularly positive HD individuals had 25 to 40 TBP CAG / CAA repeats and the most frequent allele had 38 repeats.
De novo expansions of CAG repeats in HD and SCA17 patients have been reported to arise from intermediate alleles, which contain uninterrupted pure CAG repeats (Myers et al. 1993). The TBP CAG/CAA repeat tract is known to be highly variable: the number of glutamines ranges from 25 to 42 in the American population. The CAA interruptions in TBP contribute to the stability of the CAG polymorphic regions (II and IV), as suggested by Gostout et al. (1993) (Table 2).
The role of new candidate modifier genes, such as the TBP (CAG/CAA) polymorphism, in modulating the AO was investigated in Tunisian HD patients (Hmida-Ben Brahim et al. 2014). The authors considered the expanded HTT (CAG) polymorphism together with TBP (CAG/CAA), as well as the expanded HTT CAG polymorphism on its own. They observed an increase of only 0.2% (∆R2) in the explained variation of the AO in Tunisian HD patients when TBP was included (Hmida-Ben Brahim et al. 2014). We investigated the same influence of TBP on the HD AO and observed a similarly weak association (∆R2 = 0.1%). It is important to mention that these two studies comprise Brazilian or Tunisian individuals who have been living in different environments and have been exposed to different mechanisms of selection. Furthermore, the two studies differ in sample size: a small sample (n=15) in Hmida-Ben Brahim et al. (2014) and our sample of 144 TBP alleles (n=72 HD patients) (Table 3).
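The ∆R2 comparison can be sketched as the difference in R2 between two nested least-squares models, one with the HTT repeat number alone and one with HTT and TBP together. The values below are hypothetical toy data, not the data of either study; the text does not state whether AO was transformed before fitting, so this sketch assumes raw AO, and the helper names are introduced only for illustration.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def r_squared(X, y):
    """R^2 of an ordinary least-squares fit of y on the columns of X plus an intercept."""
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    # Normal equations: (X'X) beta = X'y
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(XtX, Xty)
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
                 for r, yi in zip(rows, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

# Hypothetical toy data (assumption): expanded HTT CAG number, TBP CAG/CAA
# number and age at onset in years for six imaginary patients.
htt = [40, 42, 44, 46, 48, 50]
tbp = [36, 38, 35, 37, 38, 36]
ao = [60, 52, 47, 40, 33, 30]

r2_htt = r_squared([[h] for h in htt], ao)
r2_full = r_squared([[h, t] for h, t in zip(htt, tbp)], ao)
delta_r2 = r2_full - r2_htt
print(r2_htt, r2_full, delta_r2)
```

Because the models are nested, ∆R2 cannot be negative; a value close to zero indicates that the added predictor explains little variation beyond what HTT already explains.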
The normal number of TBP CAG / CAA repeats was described as a modulator of the AO of spinocerebellar ataxia 7 (SCA7), with evidence that a larger number of CAG / CAA repeats leads to an earlier age of SCA7 onset. This ataxia is caused by the ATXN7 gene, which also codes for a tract of glutamine repeats (Tezenas du Montcel et al. 2014). Furthermore, one study indicated that genotypes with more than 35 TBP CAG / CAA repeats are associated with risk of schizophrenia, AO of the disease and prefrontal cortical function in the Japanese population (Ohi et al. 2009).
It is worth mentioning that the first polymorphic region of the TBP structure (region II) shows sequences with CAG numbers ranging from 6 to 47 (Table 2). The NCBI reference (GRCh37.p13) reports region II only as a sequence of eight repeats, (CAG)8, and the CAG variations are reported as in-frame deletions; however, the ClinVar database gives no clinical significance for them.
We suggest that subjects bearing TBP structure 7.2 could carry two SNPs (rs55736770 and rs62430309) in region III, which would lead to an uninterrupted CAG tract of 45 to 47 repeats. Although this region is considered to be one of higher instability (Imbert et al. 1994), variant rs55736770 is reported in the ClinVar database as likely benign. The clinical significance of rs62430309 is not reported in the ClinVar database.
The alternative sequence (*1) located between regions III and IV (rs112083427), present in structure 7.2, carries a mutation (CAG CA(G>A) CAG17) that is classified as a variant of uncertain significance (VUS) by ClinVar. This TBP variant has a population frequency of 24.9% in Europe, 23.4% in East Asia, 12% in Africa and 34% in the American continent according to the Genome Aggregation Database (gnomAD).
The second polymorphic TBP region (region IV), according to GRCh37.p13, may vary from 10 to 24 repeats with benign or likely benign clinical significance. Imbert et al. (1994) consider normal TBP alleles to have a maximum of 21 triplets in region IV. Table 2 shows that the second polymorphic region varies from 9 to 31 CAGs, except in structure 7.2, which has an expansion of >40 CAGs. Our patient, who bore TBP structure 8, had 17 uninterrupted CAGs between regions I and V. Variants with more than 24 CAG repeats are not reported in the NCBI database.
The alternative region (*2), located between regions IV and V, was described in only a single study (structure number 3.2, Table 2), in which five G>A substitutions led to a CAG sequence interrupted by five CAA codons (synonymous variants). As a consequence of its presence, region IV has a smaller number of CAG repeats (9 repeats). Although the variant sequence is reported in the NCBI database, it has no clinical significance assigned in the ClinVar database. Its frequency is less than 1% in the world population (1000 Genome Bank).
With regard to TBP as a candidate modifier gene, neither the TBP variants observed in the Brazilian HD patients nor those reported by Gostout et al. (1993) are available in the ClinVar database. Consequently, we could not compare these different structures in relation to their clinical significance.
In conclusion, the rate-limiting mechanisms of AO in HD remain elusive: many different processes are commonly disrupted in HD cell lines and animal models, as well as in HD patient cells (Bassi et al. 2017). It would be important for future studies to investigate further the association of the size of TBP CAG/CAA expansions, with or without CAA interruptions in the CAG polymorphic region of this gene, since long CAG expansions have already been associated with loss of genetic stability.