- Genetic variants associated with the increased risk of BC in the population under study
Of the fifteen shortlisted variants, four, namely rs1051266, rs12190287, rs2229080, and rs2298881 in the SLC19A1, TCF21, DCC, and ERCC1 genes, respectively, were found to be significantly associated with BC in the studied cohort. The variant rs1051266 is a missense variant located in the second exon of SLC19A1. It showed a significant allelic association with BC, with OR 1.745 (1.321-2.304 at 95% CI) and p-value = 7.98E-05. A significant association was also observed in the dominant model, with OR 3.461 (2.136-5.609 at 95% CI) and p-value = 4.66E-07 (Table 1). The variant thus conferred risk for BC in the studied cohort. The variant rs12190287 is a 3'UTR variant located in the third exon of TCF21. It showed a weak allelic association with BC, with OR 1.306 (0.995-1.713 at 95% CI) and p-value = 0.0491. To assess the maximum effect of allele C, the dominant model was evaluated; interestingly, the OR observed was 1.713 (1.08-2.716 at 95% CI) with p-value = 0.022 (Table S2). The variant thus conferred risk for BC under the dominant model in the studied cohort. The variant rs2298881 is located in an intron of the ERCC1 gene. It was found to be significantly associated with BC in the allelic model, with OR 0.6981 (0.36-0.71 at 95% CI) and p-value = 0.01169. A significant association was also observed in the additive model, with OR 0.669 (0.46-0.973 at 95% CI) and p-value = 0.035 (Table S2). The variant rs2229080 is a missense variant located in the third exon of the DCC gene. It was protective against BC in the allelic model, with OR 0.6867 (0.5123-0.9205 at 95% CI) and p-value = 0.011.
However, no significant association was observed for rs2229080 under the dominant model (OR 0.797, 0.517-1.229 at 95% CI; p-value = 0.305), the recessive model (OR 0.505, 0.252-1.014 at 95% CI; p-value = 0.05), or the additive model (OR 0.758, 0.551-1.042 at 95% CI; p-value = 0.088) (Table S2).
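The allelic, dominant, and recessive comparisons reported above can be illustrated with a minimal sketch. It assumes an unadjusted analysis in which each genetic model collapses the three genotype counts into a 2x2 table, and it uses the Wald (Woolf) interval for the odds ratio. The genotype counts, function names, and the absence of covariates are assumptions made for illustration; they are not taken from the study data. The additive model, which needs a trend test or a regression on allele dosage, is not covered here.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald (Woolf) confidence interval for a 2x2 table.

    a, b: exposed / unexposed counts among cases
    c, d: exposed / unexposed counts among controls
    z:    normal quantile (1.96 gives an approximate 95% interval)
    """
    or_ = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(or_) - z * se_log_or)
    upper = math.exp(math.log(or_) + z * se_log_or)
    return or_, lower, upper


def collapse(genotypes, model):
    """Collapse (ref_hom, het, alt_hom) counts into (exposed, unexposed)."""
    rr, ra, aa = genotypes
    if model == "allelic":      # count alleles instead of individuals
        return ra + 2 * aa, 2 * rr + ra
    if model == "dominant":     # carriers of at least one altered allele
        return ra + aa, rr
    if model == "recessive":    # homozygous for the altered allele only
        return aa, rr + ra
    raise ValueError(f"unsupported model: {model}")


def model_odds_ratio(cases, controls, model):
    """Odds ratio and CI for one genetic model."""
    a, b = collapse(cases, model)
    c, d = collapse(controls, model)
    return odds_ratio_ci(a, b, c, d)


if __name__ == "__main__":
    # Hypothetical genotype counts, NOT the study data.
    cases = (30, 50, 20)
    controls = (45, 45, 10)
    for model in ("allelic", "dominant", "recessive"):
        or_, lo, hi = model_odds_ratio(cases, controls, model)
        print(f"{model:9s} OR {or_:.3f} ({lo:.3f}-{hi:.3f} at 95% CI)")
```

An odds ratio above 1 with an interval that excludes 1 indicates risk, whereas a value below 1 with an interval that excludes 1 indicates protection, which is how the four variants above are classified.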
Table 1: Logistic regression analysis of the variants in the studied population group
| S.No. | Gene | SNP | Cases | Controls | H.W.E. | OR (95% CI) | p-value |
|---|---|---|---|---|---|---|---|
| 1 | TCF21 | rs12190287 | C=0.45; G=0.55 | C=0.39; G=0.61 | 0.456 | 1.306 (0.995-1.713) | 0.0491 |
| 2 | DCC | rs2229080 | C=0.38; G=0.62 | C=0.47; G=0.53 | 0.111 | 0.6867 (0.5123-0.9205) | 0.01169 |
| 3 | SLC19A1 | rs1051266 | T=0.46; C=0.54 | T=0.33; C=0.66 | 0.053 | 1.745 (1.321-2.304) | 7.98E-05 |
| 4 | ERCC1 | rs2298881 | A=0.21; C=0.79 | A=0.27; C=0.73 | 0.124 | 0.6981 (0.36-0.71) | 0.03043 |
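The H.W.E. column of Table 1 reports a Hardy-Weinberg equilibrium check for each variant. The text does not state which test was used or in which group it was applied, so the following is only a sketch of one common choice: a one-degree-of-freedom chi-square goodness-of-fit test on genotype counts. The function name and the example counts are hypothetical.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test (1 df) for Hardy-Weinberg equilibrium.

    n_aa, n_ab, n_bb: observed counts of the three genotypes.
    Returns (chi2, p_value).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1.0 - p                       # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 df:
    # P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical genotype counts, NOT the study data.
    chi2, p_value = hwe_chi_square(30, 50, 20)
    print(f"chi2 = {chi2:.3f}, p = {p_value:.3f}")
```

Under this convention, a p-value above 0.05 means the genotype distribution does not deviate significantly from Hardy-Weinberg expectations.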
- Genetic variants not associated with BC in the population under study
The intronic variant rs249954 in the PALB2 gene has been found to be associated with breast cancer risk in the Chinese population; however, it showed no association with breast cancer in the population under study (23). The variant rs664677 in the DNA damage response gene ATM, though not associated in our population, has previously been reported to be associated with breast cancer and lung cancer risk in Asian populations (24). Another variant, rs2981582 in the FGFR2 gene, which was not associated in the population under study, has been found to be associated with breast cancer in Asian and Caucasian population groups (25). The variant rs4973768 in the SLC4A7 gene, although not significantly associated with breast cancer in our population, has previously been found to be associated with an increased risk of breast cancer in the Chinese population (26). The variant rs2363956 of the ANKLE-1 gene has previously been reported to be associated with breast cancer risk in the Chinese population (27); however, its role in the population under study remains ambiguous, since only variants with a call rate above 90 percent were considered in this study. The variant rs10046 in the CYP19A1 gene has previously been reported to be associated with estradiol levels and postmenopausal breast cancer in the European population (28). The two variants in the TERT gene, rs2736100 and rs2735940, have been reported to be associated with multiple cancers, including breast cancer, in Asian and Caucasian populations (29) and with lung cancer risk, especially in the Caucasian population (30), respectively. The variant rs2975843 in the TERF1 gene was not found to be associated with breast cancer in the population under study; however, it has been reported to be associated with colorectal cancer in a population of European descent (31). Studies have shown the BRIP1 gene variant rs4986764 to be associated with breast cancer in the Chinese population (32); however, no significant association was found in the studied population.
The variant rs3792152 in the REV1 gene, which was not associated with breast cancer in the population under study, has previously been reported to be associated with the development of epithelial ovarian cancer in the European population (33).
- Prediction of mRNA secondary structure
The minimum free energy (MFE) secondary structure and the centroid secondary structure of the variants were studied (Figure 1). The RNA secondary structures of the rs2229080, rs1051266, and rs2298881 polymorphisms revealed a slight variation in energy between the wild-type allele and the variant allele. The variant rs1051266 had an MFE of -461.1 kcal/mol for the ancestral allele C; however, we observed an elevation in the MFE for the altered allele T, at -459.6 kcal/mol. There was also an increase in the MFE of the centroid structure for this variant, from -372.57 kcal/mol for allele C to -372.57 kcal/mol. For the variant rs12190287, no change was observed in the MFE value between the ancestral (G) and altered (C) allele structures, although there was an increase in the MFE of the centroid structure from -234.28 kcal/mol (G) to -222.46 kcal/mol (C). The variant rs2298881 showed a decrease in the MFE from -372.3 kcal/mol for the ancestral allele C to -373.5 kcal/mol for the altered allele A. The MFE of the centroid structure was also lower for the altered allele A, at -308.95 kcal/mol, than for the wild-type allele C, at -277.15 kcal/mol. A notable change in the MFE value was observed for the variant rs2229080, with the wild-type allele G at -228.6 kcal/mol and the altered allele C at -231.1 kcal/mol. The centroid structure MFE values followed the same pattern, varying from -141.7 kcal/mol for the wild-type allele to -164.3 kcal/mol for the altered allele. A lower free energy corresponds to a more stable structure. The lower MFE of the wild-type allele for rs1051266, and the lower centroid structure energy of the wild-type allele for rs12190287, therefore correspond to greater stability of the wild-type structure in both cases. The altered alleles of both variants were observed to confer risk.
The decrease in the stability of the altered structure might be a factor contributing to the increased risk. The altered alleles of the variants rs2298881 and rs2229080 have a lower MFE than the wild-type allele and were found to be protective against BC. The lower MFE of the altered alleles confers greater stability on these structures than on the wild-type alleles, which have a higher MFE value. The free energy values are summarized in Table S4.
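The stability argument above reduces to the sign of the difference between the altered-allele and wild-type MFE values. A minimal sketch of that arithmetic, using the MFE values quoted in this section, is given below. The dictionary layout and function names are illustrative, and the altered-allele value for rs2229080 is taken as negative (-231.1 kcal/mol), which is an assumption based on the stated direction of change.

```python
# MFE values in kcal/mol as quoted in the text (wild-type vs altered allele).
MFE = {
    "rs1051266": {"wild": -461.1, "altered": -459.6},
    "rs2298881": {"wild": -372.3, "altered": -373.5},
    # Assumption: the altered-allele value is negative (-231.1).
    "rs2229080": {"wild": -228.6, "altered": -231.1},
}


def delta_mfe(entry):
    """Altered minus wild-type MFE; negative means the altered allele
    folds into a lower-energy (more stable) structure."""
    return round(entry["altered"] - entry["wild"], 2)


def stability_shift(entry):
    """Describe the direction of the stability change for the altered allele."""
    d = delta_mfe(entry)
    if d < 0:
        return "altered allele more stable"
    if d > 0:
        return "altered allele less stable"
    return "no change"


if __name__ == "__main__":
    for snp, entry in MFE.items():
        print(f"{snp}: delta MFE = {delta_mfe(entry):+.1f} kcal/mol, "
              f"{stability_shift(entry)}")
```

With these values, the risk variant rs1051266 has a positive difference (less stable altered structure), while the protective variants rs2298881 and rs2229080 have negative differences, matching the interpretation given in the text.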
- Network Analysis
The potential involvement of the relevant genes DCC, ERCC1, SLC19A1, and TCF21 was examined by querying them in GeneMANIA (20) (Figure 2). This showed that the expression of the genes is correlated with that of DCC. Further network analysis revealed that the DCC gene displays a protein-protein interaction with CASP9, which is associated with the risk of multiple cancers (34). An interaction, however weak, is seen between the DCC-interacting proteins and TCF21. No network of interacting proteins was found for SLC19A1.