Transcriptome analysis of Nelumbo nucifera in response to antimony stress

To explore the probable mechanism of antimony stress resistance by Nelumbo nucifera and screen out relevant antimony-resistant genes, we conducted transcriptome sequencing of Nelumbo nucifera seeds treated by antimony at low, medium, and high concentrations (100, 500, 1000 mg/L respectively), screened differentially expressed genes (DEGs), and analyzed gene ontology classication and KEGG enrichment.Totally 18078 DEGs between antimony stress treatments and the control were identied, and 4192, 5850 and 8036 DEGs were found respectively.The pathways underlying the degradation of phosphopentose, limonene and pinene, the biosynthesis of avonoid and ubiquinone, and the biosynthesis of terpenoids play critical roles in this responding process. The expressions of 7 genes validated by qPCR were consistent with the RNA-Seq results, which conrms the reliability of RNA-Seq results.This study on transcriptome analysis of Nelumbo nucifera under antimony stress provides a genetic resource and a theoretical basis for research on plant restoration of antimony pollution. Transcriptome analysis found that the biosynthesis of limonene and pinene, avonoids and ubiquinone, and the biosynthesis of terpenoids play a key role in the response of Nelumbo to antimony stress. nucifera.


Introduction
Antimony (Sb) with toxicity and potential carcinogenicity is listed as a priority pollutant by many countries and international organizations. The lack of appropriate planning and management in antimony mining areas and the inappropriate use of antimony in previous years have led to the soil, water and air antimony concentrations far above state speci ed standards, which will seriously threaten the humanity and the whole ecosystem benign circulation. China has the largest antimony resource in the world and is faced by soil hardening and crop yield reduction due to antimony pollution, which should be solved urgently (Fei, et al. 2017;He 2012;Wang, et al. 2010). Plant restoration is a relatively economical and safe environmental restoration technology and has broad application prospects. However, due to the lack of antimony restoration plant materials, the existing studies about the effects of antimony on plants focus on the effects of antimony on early growth of plants (Baek, et al. 2014), differences in antimony accumulation among various plants (Tschan, et al. 2009), measurement of different forms of antimony in plant tissues (Tisarum, et al. 2015), and evaluation of ecological risks (Wilson, et al. 2010) and toxicity (Shtangeeva, et al. 2011;Zhong, et al. 2020). Nevertheless, the mechanism of antimony resistance by plants is unclear, which limits the technical applications of plants into restoration of antimony pollution.
Antimony existing in solutions can be easily absorbed by plants and will compete with some necessary metabolites, which thereby cause toxic hazards to plants. So far, the mechanism how plants respond to antimony stress is unknown. Recently, genomics, transcriptomics and proteomics have become increasingly important in research on the growth, development, phyletic evolution and biological toxicology of plants. Transcriptomics refers to the research on the speci c cells, tissues or organs under speci c growth and development stage or under certain physical conditions, and has been widely applied following the development of post-genomics . When biological stress occurs, organisms can adjust relevant gene expressions, and the regulation of genetic transcription will further adjust the complex regulatory mechanisms during development (Lata, et al. 2018;Mittler 2006). The transcriptome sequencing based on second-generation sequencing is capable of large-scale high-throughput sequencing of total RNA (including mRNA, miRNA, lncRNA, circRNA), and thus has been extensively applied into research on transcriptome-level biological/nonbiological stress of plants and animals. Analysis of transcriptome graphs can reveal various information, including gene expressions, alternative splicing, gene structures, new genes, differentially expressed genes (DEGs), and mononucleotide fragment polymorphyism, and can annotate the obtained genes and further interpret the molecular mechanisms of plant actions under unfavourable conditions. Bowl lotus is a perennial aquatic plant with multisectional rhizome, and has long and thick underground stems. Bowl lotus growing throughout China can be extensively planted in lakes and rivers and be used into water landscaping, scenery appreciation, and water puri cation. Bowl lotus seeds can germinate as long as the seed shells can absorb enough water and swell. The short period of germination is favorable for experimental observation. In this study targeting at Nelumbo nucifera seeds, the DEGs of Nelumbo nucifera under stress of different antimony concentrations were investigated by transcriptomics sequencing and real-time quantitative polymerase chain reaction (RT-qPCR). Then the relevant genes of bowl locus under antimony stress were screened out, and key regulatory genes were recognized.

Materials
High-quality Nelumbo nucifera seeds with basically the same saturation degree were selected. After opening, the seeds were cleaned with distilled water 3-4 times, soaked in 2% hypochlorous acid for 10 min, washed with distilled water 3-4 times and soaked in clear water for 24 hours. The soaked seeds were placed into glass vessels (8 seeds/vessel) and treated with different concentrations of antimony (seeds soaked with clean water serving as the control): low, medium and high concentrations: 100 mg/L (T100), 500 mg/L (T100) and 1000 mg/L (T1000) respectively. During the 7 days of treatments at 25°C, seed germination of each group was observed every day. Each treatment was biologically repeated 3 times. After that, the samples with basically the same growing conditions were selected and stored in liquid nitrogen and dry ice until used.

Total RNA extraction and transcriptome sequencing
After fast freezing in liquid nitrogen, the samples stored in dry ice were sent to Shanghai Meiji Biology Co., Ltd. for total RNA extraction and sequencing. Total RNA was extracted from each sample and then its concentration and purity were detected using a Nanosrop2000 device. RNA integrity was tested by agarose gel electrophoretic analysis, and RIN was monitored by an Agilent2100 instrument. Magnetic beads containing Oligo(dT) were used for A-T base matching with polyA, and mRNA was thereby isolated from the total RNA. Then a fragmentation buffer was added, and the mRNA was randomly broken into small fragments in size of 300 bp. Under the action of reverse transcriptase, random hexamers were added, and the single strand of cDNA was synthesized via reverse-transcription with mRNA as the template. After that, the second strand was synthesized, forming a stable double-strand structure. An End Repair Mix was added to complement the double-strand cDNA into a at end. Then the 3' end was added with -"A" for connecting the Y-shaped head (sequence of head: 5': AGATCGGAAGAGCACACGTC 3': AGATCGGAAGAGCGTCGTGT).
The sequence signals were converted, via CASAVA base calling on Illumina, into text signals, which were stored as the raw data in the format of fastq. The raw data contained the sequences of the sequencing heads, lowquality fragments, high-N (N: uncertain information) sequences, and over-short sequences, which will severely lower the quality of subsequent analysis. Hence, the original sequencing data were sent into quality control in advance, so that high-quality clean data were obtained, which ensured the accuracy of the subsequent analysis. Sequencing was conducted by using an Illumina TruseqTM RNA sample kit, with quanti cation by TBS380. The sequences were subjected to the machines in the mixing form. The cBot was sent to bridge-typed PCR ampli cation, forming clusters. Sequencing was performed on Illumina (PE library, reading length 2*150bp).
Cutting was conducted on SeqPrep and Sickle. The heads, repeated or redundant sequences, blur bases, and reads in length < 60bp (reads except 15-34nt on miRNAs) were removed from the raw data. The sequences were matched with the genome of Nelumbo nucifera on TopHat2.0. After quality control and cutting, the data were subjected to statistical analysis and quality assessment again, including statistics of distributions of (1) A/T/G/C concentrations, (2) base quality, and (3) base error rates. Based on the known reference genome, the mapped reads were assembled and spliced on Cu inks, and compared with some known transcripts. In this way, the transcripts without annotations were obtained, and the functions of the potential new transcripts were annotated.

Screening of DEGs in Nelumbo nucifera under antimony stress
Firstly, the gene expressions in the samples were assessed using TPM. Speci cally, rst the gene lengths and then the sequencing depths were uniformized, so that the total expression levels among different samples were consistent, which helped with the more visual comparison of expression levels among genes. The expression levels of genes and transcripts were quantitatively analyzed on RSEM, and DEGs among different processed samples were identi ed. Moreover, the DEGs were tested by signi cance analysis (P < 0.05). With a blank without antimony stress treatment as the control, the up-or down-regulation of genes in the low-, medium-and high-antimony groups was statistically analyzed.

Functions and annotations of DEGs
The DEGs as-obtained were subjected to gene ontology (GO) enrichment analysis on GOseq at the signi cant level of P ≤ 0.05. From the GO analysis, the annotations of the DEGs in the KEGG database can be determined, so as to understand the major metabolic pathways and signal transduction pathways involving these DEGs.

RT-qPCR of DEGs
The reliability of the transcriptome data was validated via RT-qPCR. With the 4-fold dilute of the reversetranscribed cDNA as the template, the DEGs with |log2(fold-change)|≥1 or ≤ 0.5 in the transcriptome sequences were selected and sent to RT-qPCR. According to the sequences as-obtained, primers were designed on Primer premier 6.0 (Biosoft International, Palo Alto, CA, USA), and the samples were validated by RT-qPCR. Each gene ampli cation was accompanied with the simultaneous ampli cation of the internal reference gene (β-Actin). The sequences of the genes and their primers were listed in Table 1. The transcriptomes from different groups were subjected to high-throughput sequencing on IlluminaHiseq, and the characteristics of the DEGs as-obtained were analyzed. Totally 73046Gb clean data were obtained, and the amount of clean data from each sample was up to 6.28bG. The error rate below 0.025Q20 was larger than 98%, and that below Q30 was above 94%. These results indicate our transcriptome data are of high quality and meet the basic requirements for transcriptome analysis (Table 2). The clean reads were matched with the reference genes on Hisat2, and the database of the reference genes was Nelumbo_nucifera. The maatching rate was above 92.96%. The proportion of multiple sequences was 1.39-1.54%, and the proportion of single sequences was above 91% (Table 3). The genes and transcripts as-obtained were compared with some major databases (NR, Swiss-Prot, Pfam, EggNOG, GO, KEGG), so as to fully identify the functions of the genes and transcripts and to statistically analyze the annotations based on these databases (  (Table 4). These data suggest the functions of many genes of Nelumbo nucifera have not been fully de ned.

Function annotations of DEGs
Together with the GO database, the DEGs from the 3 antimony-treated groups were classi ed. Macroscopically, three categories were identi ed: bioprocesses, cell components, and molecular functions. Then the top 20 GO terms ranked by the signi cance (the smallest p-values) of GO enrichment were displayed in Fig. 1. Since the same transcriptome may be mapped to different nodes, among all 21065 terms of bioprocesses, accounting for 40.6%, the largest numbers of unigene were involved in metabolic processes (4819, 22.88%) and cell processes (5187, 24.62%). There were 20061 terms of cell components, accounting for 38.67%, and the largest numbers of unigene were involved in cells (4243, 21.15%) and cellular components (4202, 20.95%). There were 10752 terms of molecular functions, accounting for 20.73%, and the largest numbers of unigene were involved in catalytic activity (4179, 38.87%) and binding proteins (4884, 45.42%) (Fig. 1).

Screening and annotation of DEGs in Nelumbo nucifera under antimony stress
The threshold of DEG selection was |log2FC|≥1 or P < 0.05, and the analytical software was edgeR. In T100, totally 4902 DEGs were found, including 1556 upregulated genes and 3346 downregulated genes. In T500, totally 5769 DEGs were identi ed, including 2324 upregulated genes and 3445 downregulated genes. In T1000, totally 8253 DEGs were recognized, including 3103 upregulated genes and 5150 downregulated genes (Fig. 2).

KEGG annotations and metabolic pathways of DEGs in Nelumbo nucifera under antimony stress
The DEGs of Nelumbo nucifera were matched with the KEGG database, and were annotated into six rst-level categories and 25 second-level categories, which involved 142 metabolic pathways. The pathways that each involved more than 10 DEGs were mainly from six aspects, including genetic information processing, metabolic processes, cellular processes, environmental information processing, biosystems, and human diseases (Fig. 4). Among these six aspects, the dominant one was metabolism, including carbohydrate metabolism, amino acid metabolism, amino acid folding/classi cation/degradation, lipid metabolism, nucleotide metabolism, and metabolism of other secondary substances. Only two metabolic pathways involved human diseases: endocrine metabolic diseases, and drug tolerance: antimicrobial. KEGG analysis of antimony stress showed the involvement of gene changes in human diseases, indicating antimony affected the endocrine metabolism and drug tolerance of humans to some extent.
Note Y-axis: name of KEGG metabolic pathway; X-axis: number of genes/transcripts annotated to this pathway. The seven categories of KEGG metabolic pathways include metabolism, genetic information processing, environmental information processing, cellular processes, organismal systems, human diseases, and drug development. (note: the number of categories differs among different species).
The pathway enrichment analysis showed the above 21 DEGs responding to antimony stress mainly involved carbohydrate metabolism (

RT-qPCR of DEGs in Nelumbo nucifera
To validate the accuracy of the transcription results, we selected 6 DEGs with signi cant changes, and designed primers on perimer8.0 for validation by qPCR. The validation genes as-selected were LOC104601801(BKI1) LOC104588398(GID1C). The qTR-PCR results of these 6 genes are highly consistent with their expression trends detected by transcription analysis.

Discussion
The intake of heavy metals, which are non-necessary elements for plant growth, will interfere with the normal growth of plants and threaten the health of humans and ecosystems through food chains. Heavy metal stress will affect a series of reactions in plants, including gene expressions under stress conditions, cell metabolisms, growth and development. Transcriptomics analysis deepens our knowledge on the responses of plants to heavy metal stress and helps to discuss the possibility of heavy metal detoxifcation by plants. Previous studies about the effects of Cd on rice indicate Cd can induce metabolism and ATP-related regulatory proteins in roots, and signi cant upregulation of proteins related with antioxidant Cd stress in seeds (Ahsan, et al. 2007;Chen, et al. 2018). For castors under Cd stress, the genes related with proline and soluble phenol are upregulated, and thus, these substances become are permeable and will be more absorbed by plants, which improve the heavy metal tolerance of plants to some extent (Liu, et al. 2011). When plants respond to drought, high salts, heavy metals and other types of environmental stress, the growth-related genes will be downregulated to enhance the stress resistance (Gong, et al. 2020). Antimony can inhibit the early growth of Chinese cabbages, rapes, wheat and cucumbers (Baek, et al. 2014). The upregulation of BRs will promote the synthesis of brassinosteroids and thereby relieve the toxicity of antimony stress to Arabidopsis thaliana (Wu, et al. 2019). The above studies indicate that plants will use different adaptation mechanisms in response to heavy metal stress. In the present study, the number of DEGs increases with the rise of antimony concentration, and is larger under high concentration treatment (T1000) than under low concentration treatment (T100), and the number of downregulated genes is signi cantly larger than that of upregulated genes (P ≤ 0.05) (Fig. 2). Hence, it is deduced that Nelumbo nucifera is not sensitive to antimony mass concentrations, and the majority of DEGs will be differentially expressed only under speci c mass concentrations. Bowl lotus responds to antimony stress mainly by downregulating DEGs. When the differential expression was magni ed by 1000 times, 21 genes were screened out (Table 5), indicating these genes are critical in the response to antimony stress by Nelumbo nucifera and can serve as key candidate genes in the response of Nelumbo nucifera to antimony stress. These functions will be validated in future research. KEGG metabolic pathway analysis shows that the DEGs are enriched in avonoid biosynthesis, pentose and gluconic salt interconversion, isoquinoline biosynthesis, phosphopentose pathway, and biosynthesis of ubiquinone and other terpenoid-quinones (Fig. 4). The top 20 terms of signi cant DGE enrichment include pectinesterase (PME53), tocopherol-methyl transferase (VTE4), 6-phosphogluconate (6PGL), and avonoid3hydroxylase (F3H). Secondary metabolites are a group of small-molecular organic compounds that are produced by secondary metabolism and not needed by the growth and development of plants. In response to environmental stress, the secondary metabolism of plants will be further induced, and more metabolites will be produced to improve the surviving and self-protecting abilities. This is a result from the interaction of plants with biological and nonbiological factors during long-term evolution (Choi, et al. 2014;Guerriero, et al. 2018;Mishra, et al. 2019). The secondary metabolites of plants can be divided into phenylpropanoids, terpenoids, avonoids, quinones, tannins, and sterides (Bouwmeester, et al. 2003). The environmental stress of nutrition de ciency will promote the production of secondary metabolites in plants (Bais 2010;Grove, et al. 2012). The concentrations of phenols produced by marigolds under drought stress are signi cantly larger than those under water su ciency (Tang, et al. 1994). Under high-concentration Pb treatment, the concentrations of key enzymes in secondary metabolites and secondary metabolism activities of dry grasses increase (Wang, et al. 2011). The concentrations of trans-anethole and oxygen-containing compounds signi cantly decrease, and the concentrations of limonene and monoterpenes considerably rise (Lima, et al. 2013). Bowl lotus in response to antimony stress will downregulate the genes related with secondary metabolism, so as to lower the biosynthesis of limonene, pinene and quinones.
Flavonoids are an important group of antibacterial and anti-stress compounds in plants, and are excellent antioxidants that can remove H 2 O 2 around cell membranes. The water extracts of avonoids from Ipomoea aquatica Forsk and Enydra uctuans Lour. can clear away free radicals and improve the activities of glutathione-S-transferase, glutathione reductase, catalase and superoxide dismutase, thereby enhancing the antioxidant capacity in vivo (Dua, et al. 2015). F3H is a key enzyme in the synthesis of avonoids, and the gene F3H controlling the synthesis of this enzyme is critical in the ower color formation of plants and in the stress resistance of plants (Mahajan and Yadav 2014;Song, et al. 2016). During the response of Nelumbo nucifera to antimony stress, F3H is downregulated in the biosynthesis of avonoids, and the expression levels are more than 1000 times different between T1000 and the control (|log2FC|≥10) (Fig. 5), and other relevant genes including FLS1 and FL are downregulated, indicating avonoids play an important role in the response of Nelumbo nucifera to antimony stress.
The phosphopentose pathway, a key pathway of sugar metabolism in plants, is closely related to various nonbiological stress, growth and development of plants. This pathway is involved in the early response of plants to non-biological stress (Cardi, et al. 2011;Pandey, et al. 2015). As a metabolic sensor under oxygen stress, it is pivotal in balancing oxido-reduction in cells Kruger, et al. 2011;Rokitta, et al. 2012). 6PGL is an enzyme in catalysis of the phosphopentose pathway, and can catalyze the hydrolysis of 6PGL. The activity of 6PGL is not necessary for the normal growth and propagation of yeast cells (Murthi, et al. 2010;Stanford, et al. 2004). The size of 6PGL mutant strains in Arabidopsis thaliana is about half that in wide-type strains, indicating 6PGL can affect the size of plants (Xiong, et al. 2009). When temperature sharply drops, the relative expression level of 6PGL is suddenly increased, which enhances the phosphopentose pathway. On the one hand, this pathway increases the production of some intermediates (e.g. RuBP, NADPH), providing raw materials and exclusive donors to the growth of plants and ensuring the production and utilization of ATP. On the other hand, accumulation of 6-phosphogluconate, the nal product of the phosphopentose pathway, will intensify cell osmotic pressure and enhance freezing resistane of plants (Savitch, et al. 2000;Stincone, et al. 2015). Wheat will relieve the harms of Al stress by increasing the production of enzymes related to the phosphopentose pathway (Zhang, et al. 2013). It is deduced that in the response of Nelumbo nucifera to antimony stress, the genes related with the phosphopentose pathway are downregulated, so as to relieve the harms of antimony stress to Nelumbo nucifera.

Conclusions
The transcriptomes of Nelumbo nucifera under different levels of antimony stress were analyzed by IIuminaHISEq. As the antimony concentration rose, the number of DEGs in Nelumbo nucifera increased. These DEGs are mainly enriched in carbohydrate metabolism, amino acid metabolism, amino acid folding/classi cation/degradation, lipid metabolism, nucleotide metabolism, and metabolism of other secondary substances. In the response to antimony stress, the degradation of phosphopentose, limonene, pinene, and the biosynthesis of avonoids, ubiquinone and other terpenes are very important. From the DEGs, 21 candidate genes that may be closely related to the response of Nelumbo nucifera to antimony stress were screened out, which will offer a gene resource and a theoretical basis for further functional validation and deep research on these DEGs.

Declarations
Compliance with ethical standards Con ict of interest The authors declare that the research was conducted in the absence of any commercial or nancial relationships that could be construed as a potential con ict of interest.   avonoid synthesis pathway in response to antimony stress Note: yellow: the reference transcript; red frames: the upregulated transcripts; red yellow: reference gene transcripts and new gene transcripts.