Is There Any Influence of TBP CAG / CAA Repeats on Huntington's Disease Age at Onset?

Huntington's disease (HD) is a genetic, neurodegenerative, progressive and fatal disease characterized by motor disorder, cognitive impairment and behavioral problems, caused by expanded repeats of CAG trinucleotides in the HTT gene. The aim of this study was to investigate the influence of TBP gene CAG/CAA repeats, in conjunction with HTT gene CAG repeats, on the age of HD onset in Brazilian individuals. Individuals diagnosed as molecularly negative for HD presented 29-39 TBP CAG/CAA repeats (mean = 36 ± 2; median = 36). The most frequent allele had 36 repeats. The heterozygosity was 84%. In individuals diagnosed as molecularly positive for HD, a range of 25-40 TBP CAG/CAA repeats was found (mean = 36 ± 2; median = 36). The most frequent TBP allele had 38 repeats and the heterozygosity was 81%. We also conducted direct Sanger sequencing of TBP in some samples, which revealed TBP structures different from the wild-type. The HTT expanded CAG and TBP CAG/CAA repeat sizes jointly explained 66% of the variation in age at onset (AO) in our HD patients. The variable in the model most strongly associated with AO was the number of expanded HTT CAG repeats. The difference between the association of HD AO with HTT expanded CAG together with TBP CAG/CAA and the association of HD AO with HTT expanded CAG alone was 0.001 (ΔR²). Therefore, we found a weak association (0.1%), if any, of TBP CAG/CAA repeats with HD AO.


Introduction
Huntington's disease (HD) (OMIM: 143100) is a genetic, neurodegenerative, progressive, and fatal disease, characterized by motor disorder, cognitive impairment and behavioral problems. It is caused by a CAG trinucleotide repeat expansion in the first exon of the HTT gene (Gene ID: 3064) located on chromosome 4p16.3 (Group 1993).
It is known that the most critical determinant of HD age at onset (AO) is the number of CAG repeats in the HTT gene, which accounts for about 70% of the AO (Djousse et al. 2003; Rubinsztein et al. 1997; Wexler et al. 2004). The other 30% is attributed to modifier genes and/or environmental factors (Rubinsztein et al. 1997).
A disease modifier gene is a gene whose structure or expression alters the expression of phenotypes associated with the primary mutation that causes the disease. The main strategy used to search for modifier genes has been the investigation of genes linked to metabolic processes or to molecular pathways allegedly involved in HD (Gusella and MacDonald 2009).
There are two main reasons for searching for modifier genes in association with Huntington's disease. Knowing the genetic modifiers and how they act would provide a better understanding of the disease, as well as better genetic counseling. Knowledge about genetic modifiers is also important for selecting a more homogeneous population, allowing better clinical trials for testing candidate therapeutic drugs (Gusella et al. 2014).
The TBP gene (Gene ID: 6908) is a putative modifier gene which encodes the TATA box binding protein. This protein is a transcription factor required for transcription initiation. The aim of this study was to investigate the influence of a candidate modifier gene (TBP) in conjunction with the HTT gene on the age at onset of HD symptoms in Brazilian individuals.

Individuals
The subjects were selected from the Clinical Genetics Service held at the University Hospital Gaffrée and Guinle (HUGG/UNIRIO) and from families enrolled in the Brazilian Huntington's Association -Brazil. They were all Brazilian by birth. One hundred and four individuals from 19 unrelated families were investigated: 51 women and 53 men. All subjects were from the following Brazilian states: 48% from Rio de Janeiro, 38% from Minas Gerais, 9% from Espírito Santo, 2% from São Paulo, 1% from Bahia, 1% from Pará and 1% from Maranhão.
The AO of the disease (when the manifestation of motor symptoms began) was self-reported by the affected individual, by his/her caregiver, or by the family. All the participants in this investigation signed the "Informed Consent" form. This study was approved by the HUGG Research Ethics Committee, Rio de Janeiro, Brazil, under the number CAAE 26387113.1.0000.5258.

DNA extraction
DNA samples were obtained from 1-3 mL peripheral blood (in EDTA tube), swabs or scrapings from oral mucosa.
DNA extraction was performed according to the extraction kit protocol (Illustra Blood Genomic Prep Mini Spin, GE Healthcare, Buckinghamshire, UK).

Analysis of HTT CAG region
The number of CAG repeats in the HTT gene was determined according to the protocol suggested by Agostinho et al. (2012). The following primers were used: HD1 (forward, 6-FAM 5'-TGGCGACCCTGGAAAAGCTGAT-3') and HD3 (reverse, 5'-GCGGTGGCGGCTGTTGCTGCT-3') at a concentration of 10 pmol/µL. The PCR mixture was prepared with 1 µL of each primer, 6.25 µL of GoTaq® Green Master Mix (containing 1.5 mM MgCl2 and 200 µM of each dNTP) (Promega, Wisconsin, USA), and 4.25 µL of DNA (20-100 ng/µL), in a final volume of 12.5 µL. The PCR conditions were: 1 cycle at 94°C for 5 min; 35 cycles of 94°C for 1 min, 59.1°C for 1 min and 72°C for 2 min; followed by a final cycle of 72°C for 50 min. Some samples had the CAG region sequenced, as suggested by Andrew et al. (1994), and were used as size standards for the fragment analysis. In order to validate this assay, three samples had their respective CAG / CAA regions sequenced. For that, the following primers were used: forward (5'-AGCCAGCCTAACCTGTTTTTC-3') and reverse (5'-TGCGGTACAATCCCAGAACT-3'). The sequenced fragments were used as size standards for fragment analysis.
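As a quick consistency check on the HTT reaction mixture described above, the short Python sketch below sums the per-reaction volumes and derives the final concentration of each primer. The variable names are illustrative only and are not part of the original protocol; the volumes and the primer stock concentration are the ones stated in the text.

```python
# Per-reaction volumes in µL, as stated in the text.
components_ul = {
    "forward primer (HD1)": 1.0,
    "reverse primer (HD3)": 1.0,
    "GoTaq Green Master Mix": 6.25,
    "DNA template": 4.25,
}

# Primer stock concentration from the text (10 pmol/µL, numerically equal to 10 µM).
PRIMER_STOCK_UM = 10.0

# Total reaction volume is the sum of all components.
total_volume_ul = sum(components_ul.values())

# Final primer concentration after dilution into the full reaction volume.
primer_final_um = PRIMER_STOCK_UM * components_ul["forward primer (HD1)"] / total_volume_ul

print(f"Total reaction volume: {total_volume_ul} µL")
print(f"Final concentration of each primer: {primer_final_um} µM")
```

The summed volume matches the 12.5 µL final volume given in the protocol.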

Results
Out of 104 subjects, 72 showed positive molecular result for HD (≥ 36 CAG in the HTT gene) and 32 showed negative molecular results (<36 CAG repeats).
Sixty-four of the 72 molecularly positive individuals reported their HD AO, which ranged from 18 to 67 years (mean 42 ± 10).
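The molecular classification used above (positive when an individual carries at least 36 HTT CAG repeats, negative otherwise) can be sketched as follows. This is a minimal illustration: the function name and the example genotypes are hypothetical and are not taken from the study's data.

```python
# Threshold stated in the text: >= 36 CAG repeats in HTT is molecularly positive.
HD_POSITIVE_THRESHOLD = 36


def classify_htt(allele_cag_counts):
    """Classify an individual from the CAG counts of both HTT alleles.

    The longer allele decides the result, since one expanded allele is
    enough for a positive molecular diagnosis.
    """
    longest = max(allele_cag_counts)
    return "positive" if longest >= HD_POSITIVE_THRESHOLD else "negative"


# Hypothetical genotypes (illustrative only, not study data).
examples = {
    "individual_A": (17, 48),
    "individual_B": (17, 17),
    "individual_C": (19, 30),
    "individual_D": (20, 36),
}

results = {name: classify_htt(alleles) for name, alleles in examples.items()}
for name, status in results.items():
    print(name, status)
```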

Number of HTT CAG repeats
Among the 32 subjects diagnosed as molecularly negative for HD, 26 were heterozygous and six homozygous for the CAG region. The number of repeats in this group ranged from 12 to 30 (mean: 19 ± 4; median: 17). All homozygotes showed 17 CAG repeats in both alleles.
There was no statistically significant difference between the number of TBP CAG / CAA repeats of the HD affected (expanded HTT) and non-affected individuals (normal HTT).
The sequence of the TBP gene was determined in three samples by direct sequencing (Sanger method). The TBP structures were different when compared with the wild-type allele according to reference GRCh37.p13 found in the NCBI database, as well as when compared with the basic structure proposed by Gostout et al. (1993), who categorized the TBP structures according to five regions: I, II, III, IV and V (Table 2).
We found four different TBP structures in our Brazilian sample: two HD patients had structures number 8, 9 and 10, and one individual who bore an intermediate HTT allele (with 30 uninterrupted CAG repeats) had structure number 11 (Table 2).
One HD patient harboring 48 HTT CAG repeats was homozygous for the TBP structure shown as number 9 (Table 2). Another HD patient with 48 HTT CAGs had two different TBP structures: represented by number 8 (normal allele) and number 10 (expanded allele).

Influence of TBP CAG / CAA repeats associated with the HTT CAG repeats on the HD AO
The HTT expanded CAG and TBP CAG/CAA repeat sizes jointly explained 66% of the variation in AO in our HD patients (Table 3). The variable in the model most strongly associated with AO was the number of expanded HTT CAG repeats. Table 3 shows the influence of HTT expanded CAG repeats together with TBP CAG / CAA repeats on the HD AO. The difference between the association of HD AO with HTT expanded CAG together with TBP CAG / CAA and the association of HD AO with HTT expanded CAG alone was 0.001 (ΔR²). Therefore, we found a weak association (0.1%), if any, of TBP CAG/CAA repeats with HD AO.
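The ΔR² comparison described above can be illustrated with a minimal Python sketch that fits two nested linear models and subtracts their R² values. Several things here are assumptions rather than the study's actual method: the use of ordinary least squares on untransformed AO, the helper name `r_squared`, and above all the data, which are synthetic and generated only to make the example runnable. They are not the study's measurements.

```python
import numpy as np


def r_squared(X, y):
    """R² of an ordinary least-squares fit of y on X, with an intercept added."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    residuals = y - A @ coef
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot


# Synthetic, illustrative data only (NOT the study's measurements).
rng = np.random.default_rng(0)
n = 64
htt_cag = rng.integers(40, 56, size=n).astype(float)  # expanded HTT CAG repeats
tbp_cag = rng.integers(32, 40, size=n).astype(float)  # TBP CAG/CAA repeats
age_at_onset = 140.0 - 2.2 * htt_cag + rng.normal(0.0, 5.0, size=n)

# Model 1: AO explained by HTT repeats alone.
r2_htt = r_squared(htt_cag.reshape(-1, 1), age_at_onset)

# Model 2: AO explained by HTT and TBP repeats together.
r2_both = r_squared(np.column_stack([htt_cag, tbp_cag]), age_at_onset)

# Additional variance explained by adding TBP to the model.
delta_r2 = r2_both - r2_htt

print(f"R² (HTT only):  {r2_htt:.3f}")
print(f"R² (HTT + TBP): {r2_both:.3f}")
print(f"ΔR²:            {delta_r2:.3f}")
```

Because the second model contains the first, ΔR² cannot be negative in an ordinary least-squares fit; a value close to zero, as reported in the text, means the added predictor explains almost no additional variance.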

Discussion
Although the expanded number of HTT CAG repeats is considered to be the major determinant of the AO of HD symptoms, other factors act together with the HTT CAG expansions. These factors can be stochastic, environmental or genetic. Although all three factors may be involved to some extent in the determination of the AO, the genetic factors can be more easily studied due to the advances in [...]. One study shows that increased expression of FAN1, a nuclease involved in DNA interstrand cross-link repair, is significantly associated with delayed AO and slower progression of HD, suggesting that FAN1 is protective in the context of an expanded HTT CAG repeat (Goold et al. 2019).
In this investigation, we found that expanded HTT alone accounts for 65.7% of the variation in AO. Furthermore, other studies have shown that other mechanisms can also be involved, such as epigenetic-chromatin deregulation, as well as RNA toxicity and transcription aberrations (Marti 2016; Nalavade et al. 2013). The microtubule associated protein tau (MAPT), which is involved in several neurodegenerative disorders, has also been implicated in HD (Vuono et al. 2015). It is also important to mention that the expanded huntingtin (mHTT) shows a pleiotropic effect, as it is broadly present in different cellular compartments (e.g. cytosol, nucleus, mitochondria) as well as in all cell types of the human body at all developmental stages (Bassi et al. 2017).
In addition, we examined the number of TBP CAG / CAA repeats and its influence on the AO of motor symptoms in 72 HD patients. TBP has been suggested as a protein that plays an important role in HD pathogenesis. The normal form of TBP has been found, along with mHTT, at higher levels in the brains of HD patients than in the control group (van Roon-Mom et al. 2002). We found 84% TBP heterozygosity among individuals who were molecularly negative for HD and 81% among molecularly positive patients. A similar result, 78.5% heterozygosity for the TBP alleles, has been described previously (Tomiuk et al. 2007).
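For readers unfamiliar with the measure, heterozygosity as reported above can be read as the proportion of individuals whose two TBP alleles differ in repeat number. The sketch below computes this observed heterozygosity; it assumes the observed (rather than expected) form was used, which the text does not specify, and the genotypes are hypothetical, not study data.

```python
def observed_heterozygosity(genotypes):
    """Fraction of individuals whose two alleles have different repeat counts."""
    if not genotypes:
        raise ValueError("at least one genotype is required")
    heterozygotes = sum(1 for allele_1, allele_2 in genotypes if allele_1 != allele_2)
    return heterozygotes / len(genotypes)


# Hypothetical TBP CAG/CAA genotypes (illustrative only, not study data).
genotypes = [(36, 38), (36, 36), (35, 38), (37, 38), (38, 38)]

heterozygosity = observed_heterozygosity(genotypes)
print(f"Observed heterozygosity: {heterozygosity:.0%}")
```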
One of the first studies to investigate the allelic frequency of TBP CAG / CAA repeats showed that, in Algeria, these repeats varied from 32 to 39, and the most frequent allele had 38 repeats. Black South Africans and Sub-Saharan Africans had alleles with 33-39 repeats, and their most frequent allele had 35 repeats. Indian people had CAG / CAA repeats ranging from 27 to 39 and their most frequent allele had 38 repeats (Rubinsztein et al. 1996). We investigated the same influence of TBP on the HD AO as a previous study (Hmida-Ben Brahim et al. 2014) and observed a similar weak association (ΔR² = 0.1%). It is important to mention that these two studies comprise Brazilian and Tunisian individuals, respectively, who have been living in different environments and exposed to different mechanisms of selection. Furthermore, the two studies also differ in sample size: a small sample (n=15) (Hmida-Ben Brahim et al. 2014) versus our sample of 144 TBP alleles (n=72 HD patients) (Table 3).
The normal number of TBP CAG / CAA repeats was described as a modulator of the AO of spinocerebellar ataxia 7 (SCA7), with evidence that a larger number of CAG / CAA repeats leads to a decrease in the age of SCA7 onset. This ataxia is caused by the ATXN7 gene, which also codes for a tract of glutamine repeats (Tezenas du Montcel et al. 2014). Furthermore, one study indicates that genotypes with more than 35 TBP CAG / CAA repeats are associated with risk of schizophrenia, AO of the disease and prefrontal cortical function in the Japanese population (Ohi et al. 2009).
It is worth mentioning that the first polymorphic region of the TBP structure (Region II) shows sequences with CAG numbers ranging from 6 to 47 (Table 2). The NCBI reference (GRCh37.p13) only reports region II as a sequence of eight repeats, (CAG)8, and the CAG variations were reported as in-frame deletions; however, they have no clinical significance reported in the ClinVar database.
We suggest that subjects bearing TBP structure 7.2 could have two SNPs (rs55736770 and rs62430309) in region III, which would lead to an uninterrupted CAG tract with 45 to 47 repeats. Although this is considered a region of higher instability (Imbert et al. 1994), variant rs55736770 is reported by the ClinVar database as likely benign. In addition, the clinical significance of rs62430309 is not reported in the ClinVar database.
The alternative sequence (*1) located between regions III and IV (rs112083427), present in structure 7.2, has a mutation (CAG CA(G>A) CAG 17) which is classified as a variant of uncertain significance (VUS) by ClinVar. This TBP variant has a population frequency of 24.9% in Europe, 23.4% in East Asia, 12% in Africa and 34% in the American continent according to the Genome Aggregation Database (gnomAD).
The second polymorphic TBP region (region IV), according to GRCh37.p13, may vary from 10 to 24 repeats with benign or likely benign clinical significance. Imbert et al. (1994) consider normal TBP alleles to have a maximum of 21 triplets in region IV. Table 2 shows that the second polymorphic region varies from 9 to 31 CAGs, except for structure 7.2, which has an expansion of >40 CAGs. Our patient who bore TBP structure 8 had 17 uninterrupted CAGs between regions I and V. Variants with more than 24 CAG repeats are not reported in the NCBI database.
The alternative region (*2), located between regions IV and V, was only described in a single study (structure number 3.2, Table 2), where five G>A substitutions led to a CAG sequence interrupted by five CAA (synonymous variant). As a consequence of its presence, region IV has a smaller number of CAG repeats (9 repeats). Although the variant sequence is reported in the NCBI database, it does not have any clinical significance in the ClinVar database. Its frequency is less than 1% in the world population (1000 Genomes Project).
Considering TBP as a candidate modifier gene, neither the TBP variants observed in the Brazilian HD patients nor those reported by Gostout et al. (1993) are available in the ClinVar database. Consequently, we could not compare these different structures in relation to their clinical significance.
In conclusion, the rate-limiting mechanisms of AO in HD still remain elusive: many different processes are commonly disrupted in HD cell lines and animal models, as well as in HD patient cells (Bassi et al. 2017).
Future studies should further investigate the association of the size of TBP CAG/CAA expansions, with or without CAA interruptions in the CAG polymorphic region of this gene, with HD AO, since long CAG expansions have already been associated with loss of genetic stability.