Genome-Wide Identification and Codon Bias of NBS-LRR Gene Family in Banana

doi:10.21203/rs.3.rs-3249224/v1

Research Article

This work is licensed under a CC BY 4.0 License

Version 1

As the largest family of plant resistance (R) proteins, nucleotide binding site-leucine-rich repeat (NBS-LRR) proteins play an important role in pathogen defense. To identify the NBS-LRR gene family in banana and characterize its codon usage bias, we analyzed genome-wide banana data with MEGA11, TBtools and CodonW, examining codon preference and the factors that influence it. The 74 NBS-LRR genes were divided into 6 subfamilies, and 5 conserved motifs and 14 domains were identified. Domain structures were more similar within a phylogenetic subfamily and less consistent between subfamilies. Codons with G or C at the third position had a higher usage rate. We identified 16 optimal codons, including UCC and CCC. All 16 optimal codons ended with G or C, which indicated that banana NBS-LRR genes prefer G- or C-terminated codons. Most of the gene points in the GC3s-ENC plot fell near the expected curve, suggesting that both mutation and natural selection affected codon selection. The PR2-plot showed that most of the genes fell in the upper right of the plane, and the neutrality plot showed no significant correlation between GC12 and GC3, which indicated that natural selection was the main factor shaping codon preference. The results provide a scientific basis for codon optimization of exogenous genes and improvement of their expression efficiency.

Banana

NBS-LRR

Genome-wide identification

Codon usage bias

Banana (Musa spp.) is a monocotyledonous perennial plant of the genus Musa in the family Musaceae. It is the largest herbaceous flowering plant in the world, and its fruit is edible. Banana grows in tropical and subtropical regions and, in some countries and regions, is the fourth largest food crop after rice, wheat, and corn (Saravanan et al. 2003). Banana production is severely threatened by Fusarium wilt (Panama disease), which is caused by Fusarium oxysporum f. sp. cubense (Foc). Two Foc races, race 1 (Foc 1) and race 4 (Foc 4), are the predominant agents threatening global banana production (Ghag et al. 2015).

Revealing the interactions between banana pathogens and their hosts, and the structure, function and mechanism of banana disease resistance genes, have been active topics in banana disease resistance research. However, the regulatory pathways and signaling mechanisms of disease resistance genes are still incompletely understood (Jiaman et al. 2019). The majority of the resistance genes identified in plants are nucleotide binding site (NBS) resistance genes, which encode proteins containing an NBS and a leucine-rich repeat (LRR) (Jia et al. 2013). The NBS region contains several highly conserved motifs, and genes encoding proteins with a conserved NBS are resistance genes that are widely present in plants (Jones and Dangl 2006). Threatened by a variety of environmental pathogens, plants have evolved effective strategies to combat infection. As a first line of defense, plants induce PAMP-triggered immunity (PTI) by recognizing pathogen-associated molecular patterns (PAMPs) (Dodds and Rathjen 2010). However, adapted pathogens can suppress PTI by secreting effector molecules into host plant cells. In the second line of defense, plants in turn activate resistance genes to suppress the pathogens. According to their conserved structures, R proteins are divided into five categories (Tameling and Takken 2008), of which the NBS-LRR proteins constitute the largest family. Since 2007, NBS-LRR genes have been studied in wild and cultivated bananas (Chang et al. 2020; Pei et al. 2007; Peraza-Echeverria et al. 2008; Sutanto et al. 2014). However, because banana genome data were lacking, these studies relied on degenerate primers designed from the conserved NBS domain sequence, which greatly limited the identification of NBS-LRRs (Martin et al. 2016).

The codon is the basic unit for the transmission of genetic information from nucleotide sequence to amino acid sequence, and it is the bridge connecting DNA and protein. In general, an amino acid can be encoded by several different codons, a property called codon degeneracy (Arella et al. 2021). Codons that encode the same amino acid are called synonymous codons. Synonymous codons are not used equally in the protein-coding sequences of different organisms: certain codons, called optimal codons, are used in preference to others, a pattern known as codon usage bias (CUB) (Liu 2006). The analysis of pathogen codon usage patterns can provide important insights into the regulation of pathogen molecular evolution, gene expression and protein synthesis.

In recent years, codon usage bias has been studied in gene families or single functional genes of rice (Liu 2012), cotton (Wang et al. 2018), tomato (Zhang et al. 2018) and other crops. Although the whole genome sequence of banana has been published, its codon usage characteristics have not been studied, and the codon bias of banana NBS-LRR genes has not been reported. A genetic transformation system for banana has been established (Arvanitoyannis et al. 2008), but limiting factors remain, such as low transformation efficiency and dependence on variety genotype (Ansarypour and Shahpiri 2017). Taking codon preference into account, the codons of key genes can be modified and an appropriate expression system selected, so as to improve the expression level of exogenous genes in banana.

The main features and contributions of this paper are as follows: (1) The NBS-LRR gene family in banana was identified, and the codon usage preference of NBS-LRR genes across the banana genome was analyzed for the first time. (2) We identified 16 optimal codons, including UCC and CCC; banana NBS-LRR genes prefer G- or C-terminated codons, and natural selection had a strong influence on the codon preference.

In this study, NBS-LRR disease resistance genes were screened from the high-confidence protein-coding gene annotation of the banana genome, and their codon usage and codon usage preference were analyzed. We hope that the expression of disease resistance genes in banana can be improved through codon optimization and transgenic technology. The results provide a theoretical basis for codon optimization of exogenous genes and improvement of their expression efficiency.

Data sources

The Arabidopsis and rice NBS-LRR protein sequences were downloaded from the TAIR (http://www.arabidopsis.org/) and RGAP (http://rice.plantbiology.msu.edu/) databases, respectively, and were used to build Hidden Markov Model (HMM) profiles. The NB-ARC domain (PF00931) profile and the Arabidopsis and rice HMM profiles were then searched against the banana protein database obtained from the Banana Genome Hub (https://banana-genome-hub.southgreen.fr). The searches were run with HMMER 3.2.1 using the default parameter settings. Hits shared by all three HMM searches were selected and further validated with InterProScan (Zdobnov and Rolf 2001) (http://www.ebi.ac.uk/interpro/search/sequence-search) and MEME (Bailey et al. 2015) (http://meme.nbcr.net/meme/cgi-bin/meme.cgi).
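The selection of hits shared by the three HMM searches amounts to a set intersection over the reported target names. The following is a minimal sketch of that step, assuming the search results are available as text in the tabular style written by `hmmsearch --tblout` (comment lines start with `#`, and the first whitespace-separated column is the target name). The function names and the protein identifiers are hypothetical and are not taken from the study.

```python
def read_tblout_targets(text):
    """Collect target sequence names from hmmsearch --tblout style text."""
    targets = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        targets.add(line.split()[0])  # first column holds the target name
    return targets


def shared_hits(*tblout_texts):
    """Return the targets reported by every search (set intersection)."""
    hit_sets = [read_tblout_targets(text) for text in tblout_texts]
    return set.intersection(*hit_sets) if hit_sets else set()


# Toy inputs; the identifiers protA, protB and protC are made up.
pfam = "# target name\nprotA - NB-ARC PF00931 1e-50\nprotB - NB-ARC PF00931 1e-20\n"
ath = "protA - ath_profile - 1e-40\nprotC - ath_profile - 1e-10\n"
osa = "protA - osa_profile - 1e-45\nprotB - osa_profile - 1e-12\n"
print(sorted(shared_hits(pfam, ath, osa)))  # ['protA']
```

Only `protA` is reported by all three toy searches, so it is the only candidate that would be passed on to domain validation.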

After removing sequences containing only the NB-ARC domain or only the LRR domain, proteins harboring both NBS and LRR domains were selected as candidate banana NBS-LRR proteins. To minimize sampling error, a Java program was used to screen for coding sequences (CDS) that met the following conditions (Yue et al. 2008): (1) the total number of bases is an integer multiple of 3; (2) ATG is the start codon; and (3) TAA, TAG or TGA is the stop codon. Finally, 74 CDS that met these conditions were selected.
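The three screening conditions can be expressed as a short predicate. The sketch below is written in Python for illustration only; it is not the Java program used in the study, and the function name `is_valid_cds` and the example sequences are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_valid_cds(seq):
    """Apply the three screening conditions to one coding sequence."""
    seq = seq.upper()
    return (
        len(seq) % 3 == 0            # (1) length is a multiple of 3
        and seq[:3] == "ATG"         # (2) starts with the ATG start codon
        and seq[-3:] in STOP_CODONS  # (3) ends with TAA, TAG or TGA
    )


# Toy sequences, made up for illustration.
candidates = {
    "ok": "ATGGCCTAA",
    "bad_length": "ATGGCCATAA",
    "no_start": "CTGGCCTAA",
    "no_stop": "ATGGCCGCC",
}
kept = [name for name, seq in candidates.items() if is_valid_cds(seq)]
print(kept)  # ['ok']
```

The predicate checks only the conditions listed in the text; it does not, for example, look for in-frame internal stop codons.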

Conserved motifs, domains and gene structures of the banana NBS-LRR proteins were identified with the MEME 5.5.2 program (Bailey et al. 2009), CD-Search (Aron et al. 2017) and TBtools (Chen et al. 2020). A neighbor-joining phylogenetic tree was constructed from the conserved NBS-LRR domain sequences using MEGA 11.0 (Tamura et al. 2013) with 1000 bootstrap replicates.

Analysis of synonymous codon usage bias

The GC content and the percentages of A, T, G, C and G + C at the third codon position were calculated with CodonW 1.4.2 (http://codonw.sourceforge.net/). Relative Synonymous Codon Usage (RSCU) is the ratio of the observed frequency of a codon to the frequency expected if all synonymous codons for that amino acid were used equally. The RSCU is calculated as:

$${\text{RSC}}{{\text{U}}_{{\text{ij}}}}=\frac{{{X_{ij}}}}{{\sum\nolimits_{{j=1}}^{{{n_i}}} {{X_{ij}}} }}{n_i}$$

where ${X_{ij}}$ is the number of occurrences of the jth codon for the ith amino acid, which is encoded by ${n_i}$ synonymous codons. An RSCU value of 1 indicates that the codon is used as often as expected under equal synonymous usage, that is, with no obvious preference. RSCU > 1 or RSCU < 1 indicates that a codon is used more or less often, respectively, than expected (Liu et al. 2004).
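As an illustration of the RSCU formula, the sketch below counts codons in a list of coding sequences and applies $X_{ij} n_i / \sum_j X_{ij}$ within each synonymous family. It assumes the standard genetic code and treats the three stop codons as one family, which may differ from how CodonW reports them; the function name `rscu` is hypothetical.

```python
from collections import Counter, defaultdict

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TO_AA = dict(zip(CODONS, AMINO))


def rscu(cds_list):
    """Return {codon: RSCU} pooled over all sequences in cds_list."""
    counts = Counter()
    for seq in cds_list:
        seq = seq.upper().replace("U", "T")
        for i in range(0, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if codon in CODON_TO_AA:
                counts[codon] += 1
    families = defaultdict(list)
    for codon, aa in CODON_TO_AA.items():
        families[aa].append(codon)
    values = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid absent from the data
        n_i = len(codons)  # number of synonymous codons in this family
        for c in codons:
            values[c] = counts[c] * n_i / total
    return values


# Check against the Phe row of Table 2 (UUU = 1079, UUC = 1299).
phe = rscu(["TTT" * 1079 + "TTC" * 1299])
print(round(phe["TTT"], 2), round(phe["TTC"], 2))  # 0.91 1.09
```

With the Phe counts from Table 2, the sketch reproduces the tabulated RSCU values of 0.91 and 1.09.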

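The Effective Number of Codons (ENC) measures how far the codon usage of a gene departs from equal use of synonymous codons. It ranges from 20 (only one codon is used for each amino acid) to 61 (all synonymous codons are used equally).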

$$ENC=2+\frac{9}{{\overline {{{F_2}}} }}+\frac{1}{{\overline {{{F_3}}} }}+\frac{5}{{\overline {{{F_4}}} }}+\frac{3}{{\overline {{{F_6}}} }}$$

where $\overline{F_i}$ (i = 2, 3, 4, 6) is the average of the $F$ values of the amino acids in the i-fold degenerate codon families. For each amino acid, $F$ is calculated as:

$$F=\frac{n\sum\nolimits_{j=1}^{i}{\left(\frac{n_j}{n}\right)}^{2}-1}{n-1}$$

where n is the total number of occurrences of all codons for that amino acid and ${n_j}$ is the number of occurrences of the jth codon for that amino acid. The smaller the ENC, the stronger the codon usage preference of the gene (Gupta et al. 2004).
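The ENC calculation can be sketched directly from the two formulas above. This is a simplified illustration under stated assumptions, not the CodonW implementation: it uses the standard genetic code, skips amino acids that occur fewer than twice, caps the result at 61, and raises an error instead of estimating a missing degeneracy class. The function name `enc` is hypothetical.

```python
from collections import Counter, defaultdict

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TO_AA = dict(zip(CODONS, AMINO))


def enc(seq):
    """Effective number of codons of one coding sequence."""
    seq = seq.upper().replace("U", "T")
    counts = Counter(seq[i:i + 3] for i in range(0, len(seq) - 2, 3))
    by_aa = defaultdict(list)
    for codon, aa in CODON_TO_AA.items():
        if aa != "*":
            by_aa[aa].append(counts[codon])
    f_by_class = defaultdict(list)
    for observed in by_aa.values():
        k, n = len(observed), sum(observed)
        if k == 1 or n < 2:
            continue  # Met/Trp, or too few codons to estimate F
        f = (n * sum((x / n) ** 2 for x in observed) - 1) / (n - 1)
        f_by_class[k].append(f)
    total = 2.0  # Met and Trp each contribute one codon
    # (degeneracy class, number of amino acids in that class)
    for k, n_families in ((2, 9), (3, 1), (4, 5), (6, 3)):
        values = f_by_class.get(k)
        if not values:
            raise ValueError(f"no usable {k}-fold amino acid in sequence")
        mean_f = sum(values) / len(values)
        if mean_f <= 0:
            raise ValueError(f"average F is not positive for class {k}")
        total += n_families / mean_f
    return min(total, 61.0)


# Toy sequence using a single codon for every amino acid.
first_codon = {}
for codon, aa in CODON_TO_AA.items():
    first_codon.setdefault(aa, codon)
biased = "".join(c * 5 for aa, c in first_codon.items() if aa != "*")
print(enc(biased))  # 20.0
```

A sequence that uses one codon per amino acid gives every $F$ equal to 1, so the sum reduces to 2 + 9 + 1 + 5 + 3 = 20, the lower bound stated in the text.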

The Codon Adaptation Index (CAI) measures how closely the synonymous codon usage of a coding region matches the optimal codon usage, and it is used to predict the expression level of genes within a species.

$$CAI=\frac{CAI_{obs}}{CAI_{max}}=\frac{\sqrt[L]{\prod\nolimits_{k=1}^{L} RSCU_{k}}}{\sqrt[L]{\prod\nolimits_{k=1}^{L} RSCU_{k\max}}}=\sqrt[L]{\prod\nolimits_{k=1}^{L}\frac{RSCU_{k}}{RSCU_{k\max}}}$$

where $RSCU_{k}$ is the RSCU value of the kth codon in the sequence, $RSCU_{k\max}$ is the RSCU value of the optimal codon for the amino acid encoded by the kth codon in highly expressed proteins, and L is the total number of codons in the nucleotide sequence of the studied protein. The value of CAI is between 0 and 1, and the larger the value, the stronger the preference (Peixoto et al. 2003).
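Because CAI is a geometric mean of the ratios $RSCU_k / RSCU_{k\max}$, it is usually computed as the exponential of the mean log ratio. The sketch below assumes the ratios (often called relative adaptiveness weights) have already been derived from a reference set of highly expressed genes and are passed in as a dictionary. The function name `cai` and the toy weights are hypothetical, and codons without a weight (for example Met, Trp and stop codons) are skipped.

```python
import math


def cai(seq, weights):
    """Geometric mean of the codon weights RSCU_k / RSCU_kmax over seq."""
    seq = seq.upper().replace("U", "T")
    logs = []
    for i in range(0, len(seq) - 2, 3):
        w = weights.get(seq[i:i + 3])
        if w is not None and w > 0:
            logs.append(math.log(w))  # codons without a weight are skipped
    if not logs:
        raise ValueError("no codon in the sequence has a weight")
    return math.exp(sum(logs) / len(logs))


# Toy weights for the Phe family: TTC is optimal, TTT is used half as often.
toy_weights = {"TTC": 1.0, "TTT": 0.5}
print(round(cai("TTTTTC", toy_weights), 4))  # 0.7071
```

For the toy sequence the two weights are 0.5 and 1.0, so the geometric mean is the square root of 0.5.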

Frequency of Optimal Codons (FOP) is the proportion of optimal codons among all codons of a gene. A3s, T3s, G3s, C3s, and GC3s represent the percentages of A, T, G, C, and G + C at the third position of synonymous codons, respectively. SPSS 26.0 statistical software was used to analyze the correlations among the codon composition and preference parameters (A3s, T3s, G3s, C3s, CAI, FOP, ENC, GC3s, GC).

Optimal codon determination

The optimal codon refers to the preferred codon. First, the ENC values of all genes were calculated and sorted, and the 5% of genes at each end of the ranking were selected to form a high expression subset and a low expression subset. The RSCU value of each codon was then calculated in both subsets. A codon was defined as optimal when its RSCU difference between the two subsets exceeded 0.3, its RSCU was greater than 1 in the high expression subset, and its RSCU was less than 1 in the low expression subset (dos Reis et al. 2003).
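The selection rule can be sketched in two small steps: split the genes by ENC, then compare RSCU values between the two subsets. The sketch assumes that the genes with the lowest ENC form the high expression subset, which is the usual convention but is not stated explicitly in the text. The function names, gene identifiers and RSCU values are hypothetical.

```python
def split_by_enc(enc_by_gene, fraction=0.05):
    """Return (lowest-ENC genes, highest-ENC genes), each about `fraction` of all genes."""
    ranked = sorted(enc_by_gene, key=enc_by_gene.get)
    k = max(1, round(len(ranked) * fraction))
    return ranked[:k], ranked[-k:]


def optimal_codons(rscu_high, rscu_low, min_delta=0.3):
    """Codons with delta RSCU > min_delta, RSCU > 1 (high) and RSCU < 1 (low)."""
    chosen = []
    for codon in sorted(rscu_high):
        high = rscu_high[codon]
        low = rscu_low.get(codon)
        if low is None:
            continue  # codon not observed in the low expression subset
        if high - low > min_delta and high > 1 and low < 1:
            chosen.append(codon)
    return chosen


# Toy data, made up for illustration.
enc_values = {f"g{i}": 40 + i for i in range(20)}
high_genes, low_genes = split_by_enc(enc_values)
print(high_genes, low_genes)  # ['g0'] ['g19']

rscu_high = {"CUG": 1.8, "CUA": 0.3, "GCC": 1.2}
rscu_low = {"CUG": 0.9, "CUA": 0.8, "GCC": 1.1}
print(optimal_codons(rscu_high, rscu_low))  # ['CUG']
```

In the toy data only CUG satisfies all three conditions; GCC is above 1 in both subsets and differs by only 0.1, so it is rejected.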

ENC-plot and neutral plot analysis

The expected ENC value in the ENC-plot analysis was calculated by formula (2), and the standard curve was drawn with the expected ENC value as the vertical coordinate and GC3s as the horizontal coordinate (Wang et al. 2020).
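The text does not reproduce the expression for the expected curve. A commonly used form, due to Wright (1990), is $ENC_{exp} = 2 + s + 29 / (s^2 + (1 - s)^2)$, where $s$ is GC3s expressed as a fraction. The sketch below assumes this is the curve meant here; the function name `expected_enc` is hypothetical.

```python
def expected_enc(gc3s):
    """Expected ENC when codon usage is determined by GC3s alone."""
    s = gc3s  # GC3s as a fraction between 0 and 1
    return 2 + s + 29 / (s ** 2 + (1 - s) ** 2)


# Points for drawing the standard curve.
curve = [(s / 100, expected_enc(s / 100)) for s in range(0, 101, 5)]
print(expected_enc(0.5))  # 60.5
```

Under this assumption, the mean GC3s of 53.22% reported in Table 1 corresponds to an expected ENC of roughly 60.3, compared with the observed mean ENC of 55.79.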

PR2-plot analysis was used to examine the base composition at the third codon position. G3/(G3 + C3) and A3/(A3 + T3) were used as the horizontal and vertical coordinates, respectively (Sueoka 1999). The distribution of points around the center of the plot (A = T, C = G) shows the degree and direction of the base bias. If codon usage is shaped by mutation pressure alone, A and T, and G and C, are used in equal proportions at the degenerate positions. In contrast, unbalanced usage indicates that codon preference is influenced by natural selection and other factors (Xiang et al. 2015).

The factors influencing codon usage preference can be preliminarily assessed by neutrality plot analysis. The neutrality plot of the banana NBS-LRR genes was constructed with GC3 as the horizontal coordinate and GC12 as the vertical coordinate.
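Both plots reduce to per-gene coordinates computed from codon positions, plus an ordinary least-squares fit for the neutrality plot. The sketch below makes simplifying assumptions: it uses every codon of a sequence rather than only degenerate codons, and it assumes each sequence contains at least one G or C and one A or T at the third position. All function names are hypothetical.

```python
def position_bases(seq, pos):
    """Bases at codon position pos (0, 1 or 2) of an in-frame sequence."""
    seq = seq.upper().replace("U", "T")
    return [seq[i + pos] for i in range(0, len(seq) - 2, 3)]


def pr2_point(seq):
    """PR2-plot coordinates: (G3 / (G3 + C3), A3 / (A3 + T3))."""
    third = position_bases(seq, 2)
    a, t, g, c = (third.count(b) for b in "ATGC")
    return g / (g + c), a / (a + t)


def gc12_gc3(seq):
    """Neutrality plot coordinates: (GC12, GC3) as fractions."""
    def gc(bases):
        return sum(b in "GC" for b in bases) / len(bases)
    first_two = position_bases(seq, 0) + position_bases(seq, 1)
    return gc(first_two), gc(position_bases(seq, 2))


def linear_fit(xs, ys):
    """Least-squares slope, intercept and R squared of ys on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx, sxy ** 2 / (sxx * syy)


# Toy sequence with codons ATG, GCC, AAA, TTA.
print(pr2_point("ATGGCCAAATTA"))  # (0.5, 1.0)
print(gc12_gc3("ATGGCCAAATTA"))   # (0.25, 0.5)
```

For a set of genes, the GC3 values would be passed as `xs` and the GC12 values as `ys` to `linear_fit`. A slope near 0 is usually read as evidence that selection dominates, and a slope near 1 as evidence that mutation pressure dominates.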

Identification of banana NBS-LRR protein family

HMM searches with profiles built from the NB-ARC domain and from Arabidopsis and rice NBS-LRR family proteins identified a total of 74 NBS-LRR proteins in the banana genome.

Conserved domain detection confirmed that all the identified candidates contained both the NBS and LRR domains and belonged to the NBS-LRR protein family. To investigate the evolutionary relationships among members of the banana NBS-LRR protein family, a phylogenetic tree was constructed with MEGA 11.0. The 74 NBS-LRR proteins were divided into 6 subfamilies, the largest containing 25 members and the smallest only one (Fig. 1).

To depict the structural divergence of the NBS-LRR proteins, 5 conserved motifs and 14 domains were identified with the MEME program and annotated with the CD-Search tool. Domain structures were more similar within a phylogenetic subfamily and less consistent between subfamilies, and members with similar motifs clustered in the same clade of the phylogenetic tree. Each protein sequence contained an NB-ARC and an LRR domain (Fig. 1).

Analysis of codon composition

The CodonW analysis of the 74 CDS showed that the average CAI was 0.18 and the average ENC was 55.79, indicating that the codon usage bias was weak and the gene expression level was low (Table 1). The average proportion of G or C at the third position of synonymous codons (GC3s) was 53.22%, slightly higher than that of A and T, and the overall GC content ranged from 40.40% to 62.30%. The average frequencies of C3s and G3s were 32.20% and 36.14%, respectively, whereas those of A3s and T3s were 29.01% and 28.87%, respectively. Thus, the contents of C3s and G3s were higher than those of A3s and T3s.

Table 1

The composition and usage parameters of codons of NBS-LRR protein
Parameters of codons	Variation range	$\overline {x} \pm SD$
T3s(%)	9.67–40.61%	28.87 ± 6.23%
C3s(%)	22.10–50.51%	32.20 ± 6.22%
A3s(%)	11.11–39.74%	29.01 ± 6.12%
G3s(%)	26.14–54.05%	36.14 ± 5.30%
CAI	0.15–0.22	0.18 ± 0.02
Fop	0.33–0.47	0.39 ± 0.03
ENC	41.61–60.54	55.79 ± 3.40
GC3s(%)	36.40–82.70%	53.22 ± 9.39%
G + C(%)	40.40–62.30%	48.79 ± 4.43%
Aro(%)	5.46–8.89%	7.05 ± 0.70%

Results of codon preference analysis

CodonW was used to analyze the codon usage of the 74 banana NBS-LRR coding sequences in detail. The usage frequency of each synonymous codon for each amino acid is shown in Table 2. Among the codons with an RSCU value greater than 1, those with A, U, G and C at the third position accounted for 2.34%, 37.49%, 13.48% and 46.69%, respectively. Codons ending with C were the most frequent and codons ending with A were the least frequent, which indicated that banana NBS-LRR genes preferentially use codons ending with C.

Table 2

Statistical analysis of synonymous codon usage of NBS-LRR protein
Amino acid	Codon	Number	RSCU	Amino acid	Codon	Number	RSCU
Phe	UUU	1079	0.91	Thr	ACU	655	0.84
	UUC	1299	1.09		ACG	686	0.88
Leu	UUA	923	0.52		ACC	909	1.17
	UUG	2396	1.35		ACA	857	1.10
	CUU	1733	0.98	Gln	CAA	1322	1.04
	CUG	2468	1.39		CAG	1216	0.96
	CUC	2068	1.17	Tyr	UAU	687	0.88
	CUA	1056	0.60		UAC	879	1.12
Val	GUU	957	0.89	Arg	CGU	486	0.81
	GUG	1592	1.47		CGG	626	1.05
	GUC	1087	1.01		CGC	533	0.89
	GUA	686	0.63		CGA	743	1.24
Ile	AUU	1162	0.84	Gly	GGU	935	0.88
	AUC	1969	1.42		GGG	904	0.85
	AUA	1029	0.74		GGC	1116	1.05
Ser	UCU	944	1.09		GGA	1292	1.22
	UCG	717	0.83	Cys	UGU	719	0.82
	UCC	992	1.15		UGC	1041	1.18
	UCA	812	0.94	Arg	AGA	1395	1.07
Ser	AGU	794	0.80		AGG	1214	0.93
	AGC	1192	1.20	His	CAU	1176	1.17
Ala	GCU	992	1.03		CAC	830	0.83
	GCG	651	0.68	Asp	GAU	2384	1.12
	GCC	1003	1.05		GAC	1859	0.88
	GCA	1192	1.24	Asn	AAU	1473	1.05
Pro	CCU	858	1.10		AAC	1332	0.95
	CCG	661	0.85	Lys	AAA	1686	0.81
	CCC	662	0.85		AAG	2488	1.19
	CCA	937	1.20	Glu	GAA	2440	0.87
Met	AUG	1721	1.00		GAG	3157	1.13
Trp	UGG	1277	1.00	TER	UAA	29	1.35
TER	UGA	32	1.00		UAG	14	0.65
__: Optimal codon

Optimal codon analysis results

The 74 NBS-LRR genes were sorted by ENC value from largest to smallest, and the first 5% and the last 5% of the genes were taken to form two groups. The RSCU values of the two groups were then calculated and compared. Codons with an RSCU difference greater than 0.3, RSCU > 1 in the high expression group and RSCU < 1 in the low expression group were defined as optimal codons. A total of 16 codons met these conditions; 10 of them end with C and none end with A. The optimal codons are marked in Table 2.

ENC-plot analysis result

A scatter plot of ENC against GC3s helps to explain codon usage bias, because the expected curve reflects the relationship between the two in the absence of selection pressure. The ENC-plot of the NBS-LRR genes is shown in Fig. 2. Most of the gene points fall near or below the expected curve, a small number lie above it, and some deviate markedly from it. This result shows that, in addition to nucleotide composition, factors such as mutation and natural selection also affect codon selection. Points below the expected curve have relatively low ENC values and a more pronounced codon usage preference, whereas points above the curve have a weaker preference.

PR2-plot and neutral plot analysis result

The PR2-plot (Fig. 3) showed that the points were unevenly distributed among the four quadrants. Most of the genes fell in the upper right of the plane, and the four bases were used unequally at the third codon position, with A > T and G > C. This suggests that the codon preference of these protein-coding genes may be affected by selection.

The factors influencing codon usage preference can be preliminarily assessed by neutrality plot analysis. The neutrality plot of the NBS-LRR genes was constructed with GC3 as the abscissa and GC12 as the ordinate. As shown in Fig. 4, the regression coefficient of GC12 on GC3 was 0.2288 (R² = 0.6962), indicating that there was no significant correlation between them. The base composition at the third codon position therefore differed considerably from that at the first two positions, and natural selection was the main factor shaping codon preference (Chakraborty et al. 2019).

As the largest class of plant R genes, NBS-LRRs play vital roles in plant defense against various environmental pathogens. The completion of whole-genome sequencing of banana has enabled systematic and in-depth studies of the banana NBS-LRR gene family. Each of the final 74 members contained the typical NB-ARC and LRR structural domains at the N-terminus and C-terminus, respectively. NBS-LRR genes have been studied in several banana species (Musa sp.) over the past decade; however, all NBS-LRR genes in these studies were obtained with degenerate primers designed from the conserved NBS region. As a result, the number and sequence integrity of the identified NBS-LRRs were largely limited.

The selective use of codons during gene expression varies from species to species. Different organisms have different preferences for the synonymous codons encoding the same amino acid, which is closely related to their own genetic characteristics. Therefore, when exogenous genes are transferred into an organism, their expression is affected by codon selectivity and they are prone to methylation, which may cause transgene silencing or low expression levels. When the codon usage of an exogenous gene is similar to that of the host organism's own genes, the exogenous gene can be expressed more easily (Shi et al. 2006). All three bases of a codon can mutate during evolution, but the consequences differ greatly. Mutations of the first two bases are generally non-synonymous and may change the function of the gene product, whereas mutations of the third base are mostly synonymous and have no effect on the encoded protein. Therefore, analysis of the factors influencing the third base may reveal the causes of codon preference.

The 74 NBS-LRR genes were divided into 6 subfamilies, and 5 conserved motifs and 14 domains were identified with the MEME program and annotated with the CD-Search tool. Domain structures were more similar within a phylogenetic subfamily and less consistent between subfamilies, and each protein sequence contained an NB-ARC and an LRR domain. Codons with G or C at the third position had a higher usage rate. All 16 optimal codons ended with G or C, indicating that banana NBS-LRR genes prefer G- or C-terminated codons, in agreement with the genomic codon preference of species such as soybean, grape, and straw mushroom. Most of the gene points in the GC3s-ENC plot fell near the expected curve, suggesting that both mutation and natural selection affect codon selection. The PR2-plot showed that most of the genes fell in the upper right of the plane, and the neutrality plot showed no significant correlation between GC12 and GC3, which indicated that natural selection was the main factor shaping codon preference.

NBS-LRR genes are the most important class of disease resistance genes in plants, and a variety of diseases, especially banana wilt, seriously threaten the banana industry, so the banana NBS-LRR disease resistance genes are of great interest. In transgenic breeding, the codon preference of an NBS-LRR gene can be compared with that of the banana genome, and the gene can be modified accordingly to optimize its codons and improve its expression level.

In this study, the NBS-LRR gene family in banana was identified, and the codon usage preference of NBS-LRR genes across the banana genome was analyzed for the first time. The 74 NBS-LRR genes were divided into 6 subfamilies, and 5 conserved motifs and 14 domains were identified. Banana NBS-LRR genes prefer G- or C-terminated codons, and all 16 optimal codons end with G or C. Natural selection had a strong influence on the codon preference. The results provide a scientific basis for codon optimization of exogenous genes and improvement of their expression efficiency.

R, resistance; NBS-LRR, nucleotide binding site-leucine-rich repeat; NB-ARC, nucleotide binding-APAF-1 and CED-4; Foc, Fusarium oxysporum f. sp. cubense; PTI, PAMP-triggered immunity; PAMPs, pathogen-associated molecular patterns; HMM, Hidden Markov Model; CUB, codon usage bias; CDS, coding sequence; RSCU, relative synonymous codon usage; ENC, effective number of codons; CAI, codon adaptation index; FOP, frequency of optimal codons.

Author contributions HF and BM conceived and designed the research, interpreted the results, and wrote the manuscript; ML and SY prepared the materials and conducted the experiments; HF and JS contributed to the data analysis and preparation of figures. All authors read and approved the final manuscript.

Funding This work was supported by the Natural Science Foundation of Guangxi under Grant No. 2020GXNSFAA259004, 2021GXNSFAA196014. Funding support was also provided by Guangxi Science and Technology Major Special Project (GuiKeAA20108003-5) and Guangxi Science and Technology Base and Talent Project (GuiKeAA21196005). We also thank the funding support by Guangxi Academy of Agricultural Sciences (GuiNongKe2020YM106).

Data availability All data generated or analyzed during this study are included in this published article. The data and materials generated during and analyzed during the current study are available from the corresponding author upon reasonable request.

Conflicts of interest None of the authors have any potential conflicts of interest associated with this research.

Ansarypour Z, Shahpiri A (2017) Heterologous expression of a rice metallothionein isoform (OsMTI-1b) in Saccharomyces cerevisiae enhances cadmium, hydrogen peroxide and ethanol tolerance. Braz J Microbiol 48(3):537–543
Arella D, Dilucca M, Giansanti A (2021) Codon usage bias and environmental adaptation in microbial organisms. Mol Genet Genomics 296(3):751–762
Aron MB, Yu B, Lianyi H, Jane H, Lanczycki CJ, Shennan L, Farideh C, Derbyshire MK, Geer RC, Gonzales NR (2017) CDD/SPARCLE: Functional classification of proteins via subfamily domain architectures. Nucleic Acids Res (D1):D200–D203
Arvanitoyannis IS, Mavromatis AG, Grammatikaki-Avgeli G, Sakellariou M (2008) Banana: Cultivars, biotechnological approaches and genetic transformation. Int J Food Sci Technol 43(10):1871–1879
Bailey TL, Johnson J, Grant CE, Noble WS (2015) The MEME Suite. Nucleic Acids Res 43(W1):W39–W49
Bailey TL, Mikael B, Buske FA, Martin F, Grant CE, Luca C, Jingyuan R, Li WW, Noble WS (2009) MEME Suite: Tools for motif discovery and searching. Nucleic Acids Res 37(Web Server issue):W202–W208
Chakraborty S, Deb B, Barbhuiya PA, Uddin A (2019) Analysis of codon usage patterns and influencing factors in Nipah virus. Virus Res 263:129–138
Chang WJ, Li H, Chen HQ, Qiao F, Zeng HC (2020) NBS-LRR gene family in banana (Musa acuminata): Genome-wide identification and responses to Fusarium oxysporum f. sp. cubense race 1 and tropical race 4. Eur J Plant Pathol 157(3):549–563
Chen C, Chen H, Zhang Y, Thomas HR, Xia R (2020) TBtools: An integrative toolkit developed for interactive analyses of big biological data. Mol Plant 13(8)
Dodds PN, Rathjen JP (2010) Plant immunity: Towards an integrated view of plant-pathogen interactions. Nat Rev Genet 11(8):539–548
dos Reis M, Wernisch L, Savva R (2003) Unexpected correlations between gene expression and codon usage bias from microarray data for the whole Escherichia coli K-12 genome. Nucleic Acids Res 31(23):6976–6985
Ghag SB, Shekhawat UKS, Ganapathi TR (2015) Fusarium wilt of banana: Biology, epidemiology and management. Int J Pest Manage 61(3):250–263
Gupta SK, Bhattacharyya TK, Ghosh TC (2004) Synonymous codon usage in Lactococcus lactis: Mutational bias versus translational selection. J Biomol Struct Dyn 21(4):527–535
Jia RZ, Ming R, Zhu YJ (2013) Genome-wide analysis of nucleotide-binding site (NBS) disease resistance (R) genes in sacred lotus (Nelumbo nucifera Gaertn.) reveals their transition role during early evolution of land plants. Trop Plant Biology 6(2–3):98–116
Jiaman S, Jinzhong Z, Hui F, Liyun P, Shaolong W, Chaosheng L, Sijun Z, Jiang L (2019) Comparative transcriptome analysis reveals resistance-related genes and pathways in Musa acuminata banana 'Guijiao 9' in response to Fusarium wilt. Plant Physiol Biochem 141:83–94
Jones JDG, Dangl JL (2006) The plant immune system. Nature 444(7117):323–329
Liu QP (2006) Analysis of codon usage pattern in the radioresistant bacterium Deinococcus radiodurans. BioSystems 85(2):99–106
Liu QP (2012) Mutational bias and translational selection shaping the codon usage pattern of tissue-specific genes in rice. PLoS ONE 7(10):7
Liu QP, Feng Y, Xue QZ (2004) Analysis of factors shaping codon usage in the mitochondrion genome of Oryza sativa. Mitochondrion 4(4):313–320
Martin G, Baurens FC, Droc G, Rouard M, Cenci A, Kilian A, Hastie A, Dolezel J, Aury JM, Alberti A et al (2016) Improvement of the banana "Musa acuminata" reference sequence using NGS data and semi-automated bioinformatics methods. BMC Genomics 17:12
Pei XW, Li SJ, Jiang Y, Zhang YQ, Wang ZX, Jia SR (2007) Isolation, characterization and phylogenetic analysis of the resistance gene analogues (RGAs) in banana (Musa spp). Plant Sci 172(6):1166–1174
Peixoto L, Zavala A, Romero H, Musto H (2003) The strength of translational selection for codon usage varies in the three replicons of Sinorhizobium meliloti. Gene 320:109–116
Peraza-Echeverria S, Dale JL, Harding RM, Smith MK, Collet C (2008) Characterization of disease resistance gene candidates of the nucleotide binding site (NBS) type from banana and correlation of a transcriptional polymorphism with resistance to Fusarium oxysporum f. sp. cubense race 4. Mol Breed 22(4):565–579
Saravanan T, Muthusamy M, Marimuthu T (2003) Development of integrated approach to manage the fusarial wilt of banana. Crop Prot 22(9):1117–1123
Shi XL, Wang XY, Li Z, Zhu QH, Tang W, Ge S, Luo JC (2006) Nucleotide substitution pattern in rice paralogues: Implication for negative correlation between the synonymous substitution rate and codon usage bias. Gene 376(2):199–206
Sueoka N (1999) Translation-coupled violation of parity rule 2 in human genes is not the cause of heterogeneity of the DNA G + C content of third codon position. Gene 238(1):53–58
Sutanto A, Sukma D, Hermanto C, Sudarsono (2014) Isolation and characterization of resistance gene analogue (RGA) from Fusarium resistant banana cultivars. Emir J Food Agric 26(6):508–518
Tameling WIL, Takken FLW (2008) Resistance proteins: Scouts of the plant innate immune system. Eur J Plant Pathol 121(3):243–255
Tamura K, Stecher G, Peterson D, Filipski A, Kumar S (2013) MEGA6: Molecular Evolutionary Genetics Analysis version 6.0. Mol Biol Evol 30(12):2725–2729
Wang LY, Xing HX, Yuan YC, Wang XL, Saeed M, Tao JC, Feng W, Zhang GH, Song XL, Sun XZ (2018) Genome-wide analysis of codon usage bias in four sequenced cotton species. PLoS ONE 13(3):17
Wang Z, Xu B, Li B, Zhou Q, Xu Z (2020) Comparative analysis of codon usage patterns in chloroplast genomes of six Euphorbiaceae species. PeerJ 8(1):e8251
Xiang H, Zhang R, Butler RR, Liu T, Li Z, Jean-Fran P, Zhou Z, Ling E (2015) Comparative analysis of codon usage bias patterns in microsporidian genomes. PLoS ONE 10(6):e0129223
Yue J, Fei D, Wang H, Hu Z (2008) An extensive analysis on the global codon usage pattern of baculoviruses. Arch Virol 153(12):2273–2282
Zdobnov EM, Rolf A (2001) InterProScan–an integration platform for the signature-recognition methods in InterPro. Bioinformatics (9):847–848
Zhang RZ, Zhang L, Wang W, Zhang Z, Du HH, Qu Z, Li XQ, Xiang H (2018) Differences in codon usage bias between photosynthesis-related genes and genetic system-related genes of chloroplast genomes in cultivated and wild Solanum species. Int J Mol Sci 19(10):24

No competing interests reported.
