Cattle male reproductive tissue sample collection and RLM-RACE assay to amplify BBD129 mRNA
Sample collection and processing: The matured male reproductive tract (MRT) of Indian cattle (n=3 biological replicates, and age 3 to 5 years old) were collected from the Kolkata cattle abattoir, Kolkata, India. MRT samples were collected and washed with 1X PBS saline buffer solution and to maintain RNA quality, samples were frozen in the dry ice within 10-15 min of slaughter. In the laboratory, MRT tissues viz. Rete testes (RT), seminiferous tubule (ST), caput, corpus, cauda, and vas-deferens (supplementary fig 1) were dissected with a sterile scalpel into the 3 to 5 mm size pieces under the sterile condition and dipped into RNA later solution (Sigma, USA) and stored into the liquid nitrogen for further experimentations.
RNA extraction and DNase I treatment
Based on prior literature information [23, 25], the caudal tissues were selected for the total RNA extraction in cross-bred cattle. Total RNA extraction was done by TRIzol (Qiazol, USA) methodology (supplementary file 1). To remove genomic DNA contamination, DNase I (ThermoFisher, catalog # 18047019) treatment was done as per company manufactured protocol. RNA quantification was done by NanoDrop ND-1000 spectrophotometer (Thermo NanoDrop Technologies, USA). The absorbance ratio A260/280 and A260/230 were approximate 2.0 for samples used in this work (supplementary fig 2). The quality of the total RNA was analyzed on non-denaturing (1.2%) agarose (Lonza, Switzerland).
RLM-RACE primer designing strategy, cDNA synthesis and RACE PCR
To amplify the cattle BBD129 mRNA sequence, two sets of primers were designed from the coding region of the BBD129 gene as suggested by the company procedure. The BBD129 mRNA sequence of Bos taurus cattle was retrieved from the ENSEMBL database (transcript ID: DEFB129-201 ENSBTAT00000064705.2). All primers were designed by using NCBI primer-Blast tool [39] and the primers parameters were checked by Oligocal [40]. The sequences of primers are given in supplementary table 1.
RLM-RACE kit (Ambion, catalog #1700, USA) was used for the amplification of 5’ and 3’ends of BBD129 mRNA and the procedure has been described in supplementary file 1. Reverse transcription: SuperScript™ IV First-Strand Synthesis kit (ThermoFisher Scientific, catalog #18091050, USA) was used for cDNA synthesis as per the company manufacturer’s protocol. The 5’ adapter-ligated RNA (10µl) was used as a template and negative Tobacco Acid Pyrophosphatase control and non-template control (NTC) were run to check any gDNA contamination. Briefly, 200 ng of total RNA was mixed with random hexamer & oligo_(dT)n primers (1µl each), and DEPC-treated water into a sterile 200 µl PCR tube followed by incubation at 65°C for 5 minutes and immediate ice incubation for 2 minutes. After that, the 5X SuperScript IV reaction buffer, RNase inhibitor, dNTP mix, and RNase inhibitor, SuperScript IV reverse transcriptase was added to make final volume 20 μl PCR reaction followed by incubation at 55°C for 30 min in a PCR thermal cycler (BioMetra T-Professional Basic Gradient PCR Machine, Germany). The reverse transcriptase was inactivated as per the kit suggested protocol (heating at 90°C for 10 min). 5’ RLM-RACE PCR: The 5’ RLM-RACE outer and inner PCR were performed using primers sets mentioned in supplementary table 1. A 50µl of PCR reaction composition was consisting a 5’ adapter-ligated cDNA template, 5X Phusion HF buffer, dNTPs, primers, Phusion DNA polymerase, and nuclease-free water. The PCR thermal profile was: initial denaturation (98°C, for 30 sec) denaturation (98°C for 30 sec), annealing (63°C for 30 sec) and extension (72°C for 45 seconds) followed by 35 cycles and a final extension at 72°C for 10 minutes. Two negative control reactions viz. negative Tobacco Acid Pyrophosphatase treated cDNA and non-template reaction were run to confirm full-length RNA integrity and gDNA contamination, respectively. The amplified products were run on 2% agarose gel. 3’ RACE PCR: Total RNA extraction and reverse transcription reaction were done as described above. The 3’ RACE adapter (5'-GCGAGCACAGAATTAATACGACTCACTATAGGT12VN-3') was used as primer in reverse transcription. The 3’ RLM-RACE outer and inner PCR were performed similarly to the 5’ RLM-RACE PCR. One cDNA PCR was also run with cDNA primers with PCR reaction composition and thermal profile as mentioned in 5’ RACE PCR for amplification of the internal region of BBD129 mRNA. The list of primers used in this study is given in supplementary table 1.
Cloning and sequencing of amplified BBD129 mRNA.
Cloning: The cloning was done by using CloneJET 1.2 kit (ThermoFisher Scientific, #K1231). Ligation reactions of 20 µl volume was consisting pJET1.2 blunt cloning vector, T4 DNA ligase, T4 DNA ligase buffer, PCR eluted 3’ and 5’ RLM-RACE products, and nuclease-free water. Reactions were mixed gently followed by incubation at room temperature (RT) for 1 hour and subsequently used for the bacterial transformation. Transformation: The competent XL-1 Blue strain of E.coli cells (50 µl) was taken from -80°C and thawed on ice for 15 minutes. A 10 µl of ligated reaction was added to competent XL-1 Blue E.coli cells and incubated on ice for 30 min after that cells were challenged for heat transformation at 48°C in water-bath for 90 sec followed by immediate ice incubation for 10 min. A 900 µl of Lysogeny broth (LB) media (Hi-Media, India) and 100 µl of 1M glucose (Sigma-Aldrich, USA) was added to the tube and incubated at 37°C for 1 hour in a shaker incubator followed by plating on LB agar medium containing ampicillin antibiotic (50 µg/ml) and plates were incubated at 37°C for overnight. Colony PCR confirmation: Overnight grown colonies were processed for colony PCR. The colony PCR reaction compositions and thermal profile were similar as mentioned in the RLM-RACE PCR. The positive clones processed further for plasmid isolation. Plasmid isolation and plasmid PCR: It was performed manually by the alkaline lysis method as mentioned in supplementary file 1. The quantity and quality of isolated plasmids were evaluated by NanoDrop ND1000 spectrophotometer and 1% agarose gel electrophoresis (AGE).
Characterization of cattle BBD129 gene and phylogenetic analysis
Genomic characterization: Sequencing result were BLAST (BLASTn and BLASTp) against cattle genome (Bos taurus x Bos indicus) on NCBI (UOA_Brahman_1 GCF_003369695.1) and USSC-Genome Browser (Apr. 2018 ARS-UCD1.2/bosTau9). Cysteine bond pairing prediction was done by DiANNA Web server (41). Secondary structure prediction was done by PSIPRED and SOPMA server [42-43]. Signal peptide prediction was done by Signal IP 5.0 server [44]. Biological function prediction was done by Argot2 server [45]. The complete sequence of cattle BBD129 gene was submitted on NCBI-BankIt [46].
Evolutionary phylogenetic tree analysis: The protein sequences of the DEFB129 gene from ruminants and non-ruminants mammalian species were retrieved from the NCBI database by using the “beta-defensin 129” keyword (supplementary file 2). To find the evolutionary relationships between cattle BBD129 (Bovidae family) identified by RLM-RACE and its neighbor a family, a phylogenetic evolutionary analysis was performed by the Maximum Likelihood method and Tamura-Nei model by using the MEGA 10 bioinformatics tool [47]. This analysis involved 29 protein sequences of DEFB129 genes. The bootstrap consensus tree inferred from 1000 replicates was taken to represent the evolutionary history of the taxa analyzed.
Amplification of BBD129 gene from sperm gDNA
Collection of frozen semen and sperm genomic DNA extraction
Sample collection: Eleven distinct fertility cattle bulls were selected on the basis of their first and second artificial inseminations conception rate (CR) data available at the Artificial Breeding Research Center, NDRI, India. Cattle bulls with CR less than 31% were classified in the low fertile group and bulls with CR with more than 50% were classified in the high fertile group (Table 1). Frozen semen samples of selected cattle bulls of distinct fertility (5 bulls from high and six from low) were collected into liquid nitrogen. Genomic DNA isolation: Frozen semen samples were thawed by immersing them into the pre-warmed 37˚C distills water for 30 sec and contents were collected into the 15ml falcon tubes containing 8ml of washing buffer-A (150mM NaCl and 10mM EDTA) and centrifuged at 800g for 10 min. Pellets were dissolved into the 300 µl of lysis buffer-B (100mM TRIS-Cl, 10mM EDTA, 0.5M NaCl and 1% SDS) and 100µl of 1M DTT (Dithiothreitol) followed by incubation for 15 min at RT. After that, 100µl of Proteinase-K enzyme (0.2 mg/ml) was added to the tube followed by overnight incubation at 55ºC in the dry bath. The sperm gDNA was isolated by the Phenol:Chloroform: Isoamyl alcohol (PCI) method [48]. The quality and quantity of gDNA were measured by NanoDrop ND1000 spectrophotometer and 1% AGE.
DEFB129 genomic DNA amplification and cloning
Two sets of primers were designed by using the NCBI primer tool to amplify both the exons of the BBD129 gene. PCR reactions of 50µl contained 2µl of each primers [exon 1 (Forward = 5’GAGGTCTTCCTTCTTGTTACACTGA3’, Reverse = 5’AACCCAGATCCAAGGCTTCTT-C3’) and BBD129 exon 2 (Forward = 5’CCATCAACGTC-CCTAACCACT3’, Reverse = 5’CT-GTGAGGCTCCTGTGAGAAA3’)], dNTPs, Phusion buffer 5X, Phusion high fidelity polymerase, genomic DNA template and remaining nuclease-free water. PCR thermal reaction profile was: initial denaturation at 98°C for 5 min, denaturation 98°C for 30 sec, annealing 58.2°C for BBD129 exon 1 and 60°C for BBD129 exon 2, and polymerization at 72°C for 40 sec. The 35 cycles were run followed by a final extension at 72°C for 10 min. NTC was included as a negative control to check nucleic acid contaminations. Bacterial cloning and PCR was done as mentioned above (Cloning and sequencing of amplified BBD129 mRNA section) with the following modifications: exon 1 primer (Forward = 5’GAGGTCTTCCTTCTTGTTACACTGA3’, Reverse = 5’AACCCAGATCCAAGGCTTCTTC3’), exon 2 primers (Forward = 5’CCATCAACGTCCCTAACCACT3’, Reverse = 5’CTGTGAGGCTCCTGTGAGAAA3’) and annealing (58.2°C for BBD129 exon 1 and 60°C for BBD129 exon 2). Positive plasmids were processed for sequencing.
Absolute quantification of BBD129 gene copy number
The protocols and procedures used for the optimization of primer concentration, gDNA template, and generation of the standard curve are provided in supplementary file 1. To determine the absolute copy number of BBD129 gene in genomic DNA of distinct fertility cattle bulls, the RT-qPCR reaction was set up with the following components and thermal profile: SYBR (5 µl), forward primer (1.5µl), reverse primer (1.5 µl), nuclease-free water, and gDNA template (2µl of 1.25ng/µl). The thermal profile was: initial denaturation (95°C for 3 min), denaturation (95°C for 10 sec), annealing (49°C for 15 sec), extension (72°C for 15 min), and repeated run for 35 cycles. After the amplification, a melting peak analysis with a temperature gradient of 0.5°C/sec from 65°C to 95°C was performed.
The expressional analysis of Indian cattle BBD129 by RT-qPCR assay
Reverse transcription and Primer designing
Total RNA extraction from frozen cattle MRT tissues and cDNA synthesis was done as mentioned above (RNA extraction & DNase I treatment and RLM-RACE primer designing strategy, cDNA synthesis, and RACE PCR sections). The primers were designed from our RLM-RACE BBD129 product (supplementary table 1). GAPDH (glyceraldehyde-3-phosphate dehydrogenase) and eEF-2 (eukaryotic elongation factor 2) reference genes were chosen based on the literature [18, 49]. RT-qPCR optimization and validation of reference genes are provided in supplementary file 1.
Annealing temperature (AT) optimization was done by the conventional gradient PCR. The best AT was selected based on the brightest and sharpest band obtained on 2% agarose gel. PCR chemical composition involves cDNA template, dNTPs, gene-specific primers, Taq DNA polymerase, reaction buffer, and Nuclease-free water in a 25µl reaction volume. The PCR thermal profile was: Initial denaturation at 95°C for 5 minutes, denaturation at 95°C for 30 seconds, annealing-variable between 45-55°C C for 30 seconds, extension at 72°C for 15 seconds and a final extension for 5 minutes at 72°C.
The selection of best primer concentration was done by RT-qPCR assay with the following reaction components: SYBR (5 µl), forward primer (0.25µl/0.5µl/1µl/1.5µl), reverse primer (0.25µl/0.5µl/1µl/1.5µl), nuclease-free water, and cDNA template (2µl of 10ng/µl). The following RT-qPCR thermal profile was used: initial denaturation (95°C for 3 min), denaturation (95°C for 10 sec), annealing (49°C for 15 sec, extension (72°C for 15 min), and 35 cycles were run on Bio-Rad CFX96 Maestro qPCR machine by using SYBR Green Master Mix (Bio-Rad, USA). After the amplification, a melting peak analysis with a temperature gradient of 0.5°C/sec from 65°C to 95°C was performed to ensure specificity and primer-dimer formation. RT-qPCR reaction efficiency was done by using a serial dilution by using BBD129 gene-specific primers for the MRT tissues. RT-qPCR efficiency was calculated according to the E = 10(-1/slope) [50].
Cattle BBD129 RT-qPCR assay
The relative expression of the cattle BBD129 gene was done as duplicate in a 10 µl reaction volume with RT-qPCR thermal profile: initial denaturation (95°C for 3 min), denaturation (95°C for 10 sec), annealing (49°C for 15 sec, extension (72°C for 15 min), and 35 cycles were run. After the amplification, a melting peak analysis with a temperature gradient of 0.5°C/sec from 65°C to 95°C was performed to ensure specificity and primer-dimer formation. A non-template control was run to cross-confirmation of reaction component nucleic acid contamination.
Bioinformatics analysis of cattle BBD129 polymorphism
BLASTn and BLASTp alignment analysis was used for the nucleotide and protein search alignment, respectively [51].
Prediction of functional impacts of nsSNPs on mRNA secondary structure
To predict mRNA secondary structure, FASTA formats of native and mutated mRNA were submitted to the RNAfold server [52]. It provides three structural information i) conventional secondary structure graph, ii) Dot plot, iii) Mountain plot. A mountain plot is an x-y graph and shows minimum free energy structural curve, centroid structural curve, partition function curve, and positional entropy resulting from the pair probabilities.
Prediction of functional impacts of nsSNPs on protein stability and functionalities
PROVEN (Protein Variation Effect Analyzer) [53], SIFT (Sorting Intolerant from Tolerant) [54], Polyphen-I (Polymorphism Phenotyping-I) & Polyphen-2 [29], SNPs & GO [55], and PhD-SNP [56] are sequenced and SVM based bio-computational tools used for prediction of the impact of observed amino acid substitutions (S57A & N110S) on BBD129 protein structural information and functions. The impact may neutral (tolerant) and disease-causing (non-tolerant) depending on the alignment and physicochemical properties of amino acids. Polyphen algorithm score denotes substitution benign if the score is <0.4; possibly damaging (0.4-0.9), and probably damaging (>0.9). SIFT alignment score denotes deleterious (<0.05) and neutral (>0.05). PROVEN algorithm score denotes deleterious (<−2.5) and neutral (>−2.5) [57]; SNAP2 predict the disease-related variants [58]; I-Mutant2.0 predict protein stability depending on SVM and ProTherm databases, helped to evaluate AAS free energy changes values (DDG = DGmutant – DG wild type) [56, 59]; MAPP (multivariate analysis of protein polymorphisms) was used for the interpretation of missense variant causing deleterious effect [60]. PSIPRED and SOPMA servers were used for the prediction of the impact of nsSNPs on protein secondary structure. Biological function prediction was done by the Argot2 server [45]. The translated amino acid sequences of native and mutated BBD129 gene were submitted above servers.
Prediction of impact of snSNPs on protein physiochemical properties and post-translational modifications
ProtParam [61] and Mupro [62] bioinformatics tools were used to computes various physicochemical properties such as molecular weight, theoretical pI, amino-acids composition, atomic compositions, aliphatic index, instability, extinction coefficient, protein stability, and estimated half-life. Post-translational modifications: i) O-glycosylation site prediction was done by using NetOGlyc 4 server [63] and GlycoEP standard predictor under the composition profile of pattern [64]. ii) N-glycosylation site prediction was done by using NetNGlyc 1.0 server [65] which is ANN (artificial neural networks) based that examine the sequence motif of Asn-Xaa-Ser/Thr amino-acid sequence. iii) Phosphorylation site prediction was done by using NetPhos 3.1 server [66] which works based on neural networks. The translated amino acid sequences of native and mutated BBD129 gene were submitted above servers.
Statistical analysis
GraphPad Prism 5.0 [67] unpaired T-test was used for the analysis of distributions of SNPs and absolute quantification of copy number variations in the distinct fertility bulls. A P-value <0.05 was set to be statistically significant. The geometrical mean sample Cq (cycle of quantification) values for cross-bred BBD129 transcript in triplicate samples and relative expression ratio of BBD129 gene expression was computed using the 2-∆∆Ct (Threshold cycle) method [68] and implemented by the GraphPad Prism 5.0 software. GAPDH and eEF2 were used as reference genes in the expression experiments. The BBD129 gene expression in the seminiferous tubule region of MRT was used to normalize Cq values obtained from other MRT regions. The differential BBD129 gene expression level was analyzed by ANOVA (Tukey posthoc correction test) in GraphPad Prism 5.0.
Complete coding BBD129 gene in Indian cattle
RLM-RACE was performed on the corpus-testicular RNA for the complete amplification of bovine BBD129 mRNA. The 5’ RLM-RACE inner PCR had given a band of approximately 190bp long product while 3’ RACE PCR had given an approximately 300 bp long amplification product (supplementary fig 3). The BBD129 5’, 3’, and cDNA PCR sequences obtained were aligned to each other to generate a complete coding sequence of BBD129 mRNA. The FASTA sequences of amplified produced are provided in supplementary file 3. The BBD129 gene is located on chromosome 13 between the nucleotide coordinates chr13: 22660780-22799850 (NCBI: UOA_Brahman_1 GCF_003369695.1 and USSC: Apr. 2018 ARS-UCD1.2/bosTau9) and flanked by the LOC11390270 gene on upstream and the C13H20orf96 gene on the downstream region. BBD129 gene having two exons separated by a ~l.6 kb non-coding intron. The first exon code for the 34 bp long 5’ UTR and signal peptide while the second exon code for functional protein and 71 bp long 3’ UTR regions (Fig 1). Gene ORF starting from nucleotide 47th and end with 560th nucleotide carrying a 511bp long coding mRNA which codes for 171 amino acids long peptide protein. The detail of genomic characterization was submitted to NCBI with accession number 2448405 and is provided in figure 1 and supplementary table 1. Cattle BBD129 conserve cysteine pairing rule (C1-C5, C2-C4, and C3-C6) and secondary structure with a conserved N-terminal signal peptide of 18 amino acids. The beta-defensins have the properties to exogenously secretion and signal peptide cleavage to give mature functional protein. Bovine BBD129 protein bear protease cleavage site between the amino acids 19 and 20 viz. VNT-EY and has been predicted with a cleavage probability of 0.998 (supplementary fig 4). BBD129 was predicted for conserved three beta-sheets secondary structure, immune defensive roles, efficiency to bind with CCR6 chemokine receptor and lipopolysaccharides binding, and extensive post-translation modifications especially O-glycosylations and phosphorylations (fig 1 and supplementary table 1).
Figure 1
The multiple sequence alignment (MSA) and phylogenetic analyses of the BBD129 gene across the twenty-nine mammalian species indicate a strong evolutionary relationship between the Indian cattle BBD129 gene and the Bovidae family BBD129 (fig 2; supplementary fig 5). The Camelidae and Equidae families show relationships next to the Bovidae family. MSA analysis found Indian cattle are structurally similar to Bos taurus except for few amino acids.
Figure 2
Association of cattle BBD129 sequence variations with bull fertility
The BBD129 gene was amplified from sperm gDNA of distinct fertility bulls and the full length of exon first (507bp) and exon second (607bp) was achieved by amplification of intronic flanking regions of exons (Fig 3). To determine the polymorphisms in the BBD129 gene in the distinct fertility bulls, we sequenced 254 clones from 11 bulls. There was no polymorphism observed in the exon first. However, in exon second, we observed two missense or non-synonymous SNPs at positions 169th (T169G cDNA position, rs378737321) and 329th (A329G cDNA position, rs383285978) in the gDNA from distinct fertility bulls (Fig 4; Supplementary fig 6A). After translating nucleotide to amino acid sequence, we have also observed substitutions at the amino acids level. In the BBD129 protein, the serine amino acid at the 57th position is replaced by alanine (S57A/srs378737321) while asparagine amino acid is replaced by serine at the 110th position (N110S/rs383285978) (Fig 4; supplementary fig 6B). Based on polymorphism position and linkage, BBD129 haplotypes were categorized into four groups: TA haplotype (169T & 329A), GA haplotype (T169G polymorphism), TG haplotype (A329G polymorphism), and GG haplotype (when T169G & A329G polymorphisms present together). The frequencies distributions of haplotypes in the high fertile group (n=105 clones) were: TA (71.42%), GA (1.90%), TG (2.8%), and GG (24.76%), while in the case of low fertile group bulls, the frequencies distributions of observed haplotypes (n=149 clones) were: TA (36.24%), GA (0%), TG (2.68%), and GG (61.07%) (Fig 5; Table 1). There was no significant difference in TA haplotype distribution in the high fertile and low fertile groups (P=0.5256) but there was a 35.15% difference in BBD129 TA haplotype in both the group of high fertile and low fertile bulls. There was a significant difference in GG haplotypes frequency distribution in the low fertile bulls as compared to high fertile bulls (P=0.0002). The distribution of double mutated GG haplotype was found an association with bull’s low conception rate (table 1).
Table 1
Figure 3
Figure 4
Figure 5
Copy number variation in BBD129 gene in distinct fertility cattle bulls
In order to investigate the fertility-related association in the copy number of the bovine beta-defensin 129 (BBD129) gene, we examined the genomic copy number of the BBD129 gene in two groups of distinct fertility cross-bred bulls (n=11). Standard curve-based quantification of the BBD129 gene was carried out by absolute real-time quantitative PCR. The known concentration (copy number) of the plasmids carrying the gene of BBD129 fragment was used to construct the standard curve for the BBD129 gene by plotting the log concentration values against the crossing point (Cq) values (Supplementary fig 7 & 8). Crossing point (Cq) values were determined from the RT-qPCR run of equal concentration of genomic DNA (1.25 ng/ul) of distinct fertility categorized bulls. The universal bovine CSN2 (casein beta) gene was used as a single-copy number reference gene. The absolute Cq values and their log transformation for BBD129 gene and CSN2 control are provided in supplementary table 1. It was observed that the mean copy numbers of CSN2 and BBD129 genes did not vary significantly between bulls of distinct fertility (Fig 6) suggesting a single copy of BBD129 gene in the bovine genome and figure 10B suggesting no significant difference of BBD129 gene CNV in high fertile and low fertile bulls.
Figure 6:
Expression of beta-defensin BBD129 gene in male reproductive tract
The relative distribution of BBD129 expression was analyzed in adult cattle MRT. The expression dynamics of BBD129 were slightly different from the other mammalian species reported earlier, the higher expression of BBD129 was observed in the corpus segment of the epididymis as compared to others MRT tissues (viz. ST, RT, caput, cauda, and VD) (Fig 7). The corpus regions show 14.2 fold higher expression of the BBD129 gene than the normalizer rete testis region. The order of reducing expression of BBD129 expression was corpus, cauda, VD, RT, and caput region. The details of mean difference and P-value have provided in supplementary table 1.
Figure 7
The possible impact of nsSNPs on BBD129 protein
The possible impact of nsSNPs on mRNA structure
Polymorphism in the BBD129 gene can alter base-pairing probability in the mRNA secondary structure (Supplementary fig 9). The optimal mfe of native BBD129 mRNA (no polymorphism) was predicted at -93.77 kcal/mol while in double mutated BBD129 minimum free energy was -98.20 kcal/mol. The free energy of the native BBD129 thermodynamic ensemble was free -102.28 kcal/mol and the ensemble diversity value was 142.05 while in the case of the double mutated BBD129 thermodynamic ensemble free energy was -107.88 kcal/mol and ensemble diversity value was 119.47. The minimum free energy of native BBD129 centroid secondary structure was 46.40 kcal/mol while in double mutated the value was predicted 61.78 kcal/mol. The MFE structural curve, the thermodynamic ensemble of RNA structural curve, the centroid structural curve, and entropy changes in the mutated BBD129 mRNA can be seen in the mountain plot (Fig 8
Figure: 8
The possible impact of nsSNPs on protein stability
There are many tools used for the predictions of the deleterious effect of observed nsSNPs on BBD129 protein. Polyphen-I, PROVEN, SNPs & GO PhD-SNP, Meta-SNP, and PredictSNP have been predicted as the neutral and non-deleterious effects of observed SNPs on BBD129 protein. SNPs to be benign in both HumDIv and HumVar models analyzing. The deleterious effective score of observed rs378737321 SNP was 0.002 with 0.99 sensitivity and 0.30 specificity and the deleterious effective score of rs383285978 SNP was predicted as 0.004 with 0.97 sensitivity and 0.59 specificity (supplementary fig 10). Similarly, SIFT, SNAP, and MAPP predicted S57A mutation as deleterious or disease-causing polymorphism and N110S mutation as neutral. I-Mutant2.0 server predicted both nsSNPs as structure distorters. The result details of the above bioinformatics servers are provided in table 2.
Table 2
Possible impact of nsSNPs on physiochemical properties of BBD129 protein
Comparative ProtParam results between native BBD129 protein sequence and double mutated protein sequence have shown major alterations in the molecular weight, polar and non-polar amino acids, instability index, aliphatic index, and GRAVY (supplementary table 1). PSI-PRED and SOPMA servers predicted alterations in the protein secondary structure especially in the helix and coils (Fig 4). SOPMA predicted native protein sequence with 33.77 % alpha-helix, 1.32 % beta-turn, 10.60 % extended strand, and 54.30 % random coil, however, in the mutated BBD129 protein the alfa helix increase by 5.3 % while the beta-turn and random coil regions decrease about 1.32 % and 3.97 %, respectively. Alterations in BBD129 protein stability of missense variants were examined by MUpro software. MUpro predicted that both the substitutions negatively affect the protein stability of BBD129 protein. The substitution of serine to alanine amino acid at 57th position (S57A_rs378737321) decreased the protein stability by confidence score -0.83258649 and -0.998569279497182 in Support Vector Machine and Neural Network Machine, respectively. The substitution of asparagine to serine amino acid at 110th position (N110S_rs378737321) decreased the protein stability by confidence score -0.63001616 and -0.65131013529855 in Support Vector Machine and Neural Network Machine, respectively (supplementary table 1).
The possible impact of nsSNPs on BBD129 protein post-translation modification
NetOglyc 4.0 and GlycoEP servers predicted one deletion of the O-glycosylation site at the 57th amino acid position and one insertion of new O-glycosylation site at the 110th amino acid position in the double mutated BBD129 protein (Fig 4; supplementary table 1). SNPs have no effect on N-glycosylation. NetPhos 3.1 predicted fifteen sites for phosphorylations in the BBD129 TA haplotype protein (supplementary table 1) while two possible additional phosphorylation sites were predicted in the double mutated BBD129 GG haplotype protein. The motif CKKKTCCIR (52nd amino acid) was predicted as a potential candidate for threonine phosphorylation with 0.545/0.503 scores for Phosphokinase-G/Phosphokinase-A enzymes. Another motif IKSASAFAK (110th amino acid) was predicted as a new potential candidate for serine phosphorylation for the PKC enzyme with a 0.617 score. These new phosphorylation sites were the result of S57A and N110S substitutions, respectively (Fig 4 and supplementary table 1).
The possible impact of nsSNPs on BBD129 protein biological functions
As predicted above that nsSNPs decrease protein stability and influence the protein secondary structure and post-translation modifications, here we have predicted possible impacts observed nsSNPs on the biological functioning of BBD129 protein by Argot 2 server. The prediction found double mutated BBD129 protein has decreased scores for all predicted biological functions compared to native BBD129 protein suggesting nsSNPs negatively impact the protein biological functioning (Table 3).
Table 3
The summarized results of bioinformatics tools used for the predictions of various physiochemical and structural changes in the mutated BBD129 are provided in the table 4.
Table 4