Genome-Wide Identification and Characterization of SAC Domain-Containing Protein Family in Cotton

Background: Phosphoinositides (PIs) are important regulators of a diverse range of cellular functions. The Suppressor of Actin (SAC) domain-containing proteins are a class of phosphoinositide phosphatases involved in the synthesis of PIs. Although the cellular functions of SAC domain-containing proteins have been characterized in yeast, information on SAC genes in cotton is largely undefined.
Results: In the present study, 12, 12 and 24 putative SAC genes were identified in G. raimondii, G. arboreum and G. hirsutum, respectively. Detailed gene information, including genomic organization, structural features, conserved domains and phylogenetic relationships, was systematically characterized. All SAC family members in cotton were divided into three clades (Group I, Group II and Group III) based on their sequence similarities and phylogenetic relationships. The SAC domains consist of seven highly conserved motifs that are believed to be important for phosphoinositide phosphatase activity from yeast to animals. GhSAC genes from Group II and Group III shared a similar, moderate expression pattern across tissues and were insensitive to different abiotic stresses, whereas members of Group I showed divergent expression profiles. Four genes from Group I (GhSAC2.1A/GhSAC2.1D and GhSAC4.2A/GhSAC4.2D) were predominantly expressed in anthers, pistils and petals. These results suggest functional divergence among the different groups and members of the SAC family in cotton.
Conclusions: Systematic analysis of the SAC gene family in cotton provides a solid foundation for further investigation of the biological functions of SAC genes.
GhSAC4.2, 5’-CAAATCAGCATTACGGGTCAT-3’ and 5’-ATTGTCAGATCCAAGGGAGC-3’; GhSAC7.1, 5’-CGACAAGGGTGAGAAAATGAAA-3’ and 5’-CAAGATTTGTGTGTGAGCTAATGG-3’; GhSAC9.2, 5’-TCTGATTCCTCTGCGTTGC-3’ and 5’-CCAACTTGGTTAGACAAGCCAT-3’. The qRT-PCR was performed with three biological replicates, each comprising four technical replicates. Relative gene expression levels were calculated using the 2^(-ΔCT) method.
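The 2^(-ΔCT) calculation mentioned above can be sketched as follows. This is a minimal illustration, assuming that ΔCT is the mean CT of the target gene minus the mean CT of a reference gene across technical replicates; the text does not name the reference gene, and the function name and CT values below are hypothetical, not measured data.

```python
def relative_expression(ct_target, ct_reference):
    """Return 2^(-dCT), where dCT = mean CT(target) - mean CT(reference).

    Both arguments are lists of CT values from technical replicates.
    """
    mean_target = sum(ct_target) / len(ct_target)
    mean_reference = sum(ct_reference) / len(ct_reference)
    delta_ct = mean_target - mean_reference
    return 2.0 ** (-delta_ct)


# Hypothetical CT values for four technical replicates (illustration only).
target_ct = [24.0, 24.0, 24.0, 24.0]
reference_ct = [22.0, 22.0, 22.0, 22.0]
print(relative_expression(target_ct, reference_ct))  # 0.25
```

A target that crosses the threshold two cycles later than the reference gives ΔCT = 2 and a relative level of 0.25. Averaging across the three biological replicates would be a separate step applied to these per-replicate values.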


Background
Phosphatidylinositol (PI) phosphates, which differ in the presence or absence of phosphate groups at the 3-, 4- and 5-hydroxyl positions of the inositol head group [1], are a major class of phospholipids present in trace amounts in eukaryotic cells. They are collectively referred to as phosphoinositides (PIs) and exist in seven forms: PI(4)P, PI(3)P, PI(5)P, PI(4,5)P2, PI(3,4)P2, PI(3,5)P2 and PI(3,4,5)P3. One of these, PI(4,5)P2, is the precursor of the second messengers inositol 1,4,5-trisphosphate and diacylglycerol, which are important in the activation of protein kinase C and the release of intracellular calcium [2]. Originally, PIs were thought to play a key role only in second messenger generation. However, a variety of studies suggest that PIs are important regulators of a diverse range of cellular functions, such as modulation of vesicle trafficking [3][4][5], cytoskeletal reorganization [6,7], maintenance of vacuole morphology, activation of proteins [8], regulation of lipid storage, cell survival and cell proliferation [9][10][11][12].
PIs are synthesized by kinases and phosphatases, which phosphorylate and dephosphorylate PI, respectively. Based on the position of the phosphate that they hydrolyze, phosphoinositide phosphatases and inositol polyphosphate phosphatases are traditionally classified into four groups: 1-, 3-, 4- and 5-phosphatases [8]. Among the phosphoinositide phosphatases, the 5-phosphatases form a fairly large family that is further classified into four types according to substrate specificity. Except for the type I 5-phosphatases, which use only water-soluble inositol polyphosphates as substrates, the other three types are able to hydrolyze phosphoinositides [8]. More recently, synaptojanins and inositide 5-phosphatases in which a SAC domain was identified appeared to represent a novel group of phosphoinositide phosphatases. The SAC domain was originally found in the yeast phosphoinositide phosphatase Sac1p, which was identified in screens for "suppressor of actin" mutations [13] and for suppressors of the defects caused by mutations of the Sec14 PI/phosphatidylcholine transfer protein [14]. Subsequently, the SAC domain was found in several proteins from yeast and animals. The SAC domain-containing proteins are divided into two classes based on the C-terminal amino acid sequences that follow the SAC domain [1]. The first class, whose members have all the domains associated with type II phosphoinositide 5-phosphatases in addition to an N-terminal SAC domain, comprises the mammalian synaptojanins and yeast Inp51p, Inp52p and Inp53p. The other class is represented by Sac1p and Fig4p, in which the SAC domain is linked to a C-terminal region without any recognizable domains. This class includes yeast Fig4p and Sac1p, the archetype of the Sac family of phosphatases, as well as a number of uncharacterized proteins such as human (Homo sapiens) hSac1, hSac2 and hSac3. The C-terminal regions of the proteins in this class differ from one another, and each has its own sequence specificity.
The association of phosphoinositide phosphatase activity with Sac domains has been established. Through detailed analysis of PI(4,5)P2 hydrolysis by the 5-phosphatase synaptojanin, Chung suggested that synaptojanin must also be able to dephosphorylate 4-phosphate groups [15]. Subsequently, characterization of mammalian synaptojanin and the yeast synaptojanin homologs Inp52p and Inp53p by Guo revealed that a second phosphatase activity resides in the N-terminal SAC domain, which was shown to hydrolyze phosphates from PI(3)P, PI(4)P and PI(3,5)P2 [16]. Notably, the Sac phosphatases do not hydrolyse either PI(3,4)P2 or PI(4,5)P2, which contain adjacent phosphate groups. These findings indicate that the SAC domain predominantly exhibits a lipid-specific phosphatidylinositol monophosphate phosphatase activity [1].
The Sac domain is approximately 400 amino acid residues in length and consists of seven highly conserved motifs that are believed to define the catalytic and regulatory regions of the phosphatase [1]. The sequence RXNCXDCLDRTN within the sixth motif is proposed to be the catalytic core of the Sac domain phosphatases [1]. The CX5R(T/S) motif, which is thought to cradle the phosphate moiety, is also present in a variety of metal-independent protein and inositide polyphosphate phosphatases [16]. In contrast to Sac1p, Inp52p and Inp53p, Inp51p contains an incomplete CX5R(T/S) motif and does not exhibit phosphatase activity. Furthermore, the first conserved Asp residue of the RXNCXDCLDRTN sequence is mutated in the yeast sac1-8 and sac1-22 mutant alleles [17], which is thought to cause their lack of phosphatase activity; this also indicates that the RXNCXDCLDRTN motif may well represent the catalytic core of the Sac phosphatases.
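The CX5R(T/S) pattern described above (a Cys, any five residues, an Arg, then Thr or Ser) maps directly onto a regular expression. The sketch below is only an illustration of how such a motif could be located in a protein sequence; the function name is hypothetical, and the input is the consensus string from the text, in which X stands for any residue.

```python
import re

# CX5R(T/S): Cys, any five residues, Arg, then Thr or Ser.
# The lookahead wrapper lets overlapping occurrences be reported as well.
CX5R_PATTERN = re.compile(r"(?=(C.{5}R[TS]))")


def find_catalytic_motifs(protein_seq):
    """Return (0-based start, matched text) for each CX5R(T/S) occurrence."""
    return [(m.start(), m.group(1)) for m in CX5R_PATTERN.finditer(protein_seq)]


# Consensus catalytic core from the text; X is a placeholder for any residue.
core = "RXNCXDCLDRTN"
print(find_catalytic_motifs(core))  # [(3, 'CXDCLDRT')]
```

Within the twelve-residue core, the motif starts at the first Cys and ends at the Thr, so the conserved Asp residues fall among the five variable positions of the pattern.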
The cellular functions of Sac domain-containing proteins have been characterized, in particular those of Sac1p. Sac1p is an integral membrane protein [18] and plays an important role in ATP transport, specifically in the endoplasmic reticulum [19], where Sac1p is primarily localized [20][21][22]. Mutational analysis has demonstrated that Sac1p functions primarily to hydrolyze the phosphate group from PI(4)P in vivo. Numerous studies of Sac1p mutations indicate that Sac1p is involved in vesicle formation and transport [23][24][25], Golgi function, vacuole morphology [19] and actin cytoskeleton organization [1]. Fig4p, the other yeast Sac domain-containing protein, shows 5-phosphatase activity [26] and is required for proper actin organization and cellular morphogenesis during mating [27]. The functions of the Inp 5-phosphatases (Inp51p, Inp52p and Inp53p) overlap with each other, although each retains some unique functions. Inp51p, like Inp52p, is clearly involved in endocytosis and regulation of the actin cytoskeleton under conditions of normal vegetative growth [1]. Except for Inp51p, which exhibits only PI(4,5)P2 5-phosphatase activity, the others are able to convert all of the PIs found in yeast into PI. As for Inp53p, Chang proposed that the protein may function in Golgi-to-vacuole trafficking [28]. Moreover, several SAC domain-containing proteins from animals have been demonstrated to exhibit phosphoinositide phosphatase activities in vitro, but their cellular functions remain unknown [29][30].
Except for PI(3,4,5)P3, all phosphoinositides have been identified in plant cells. Studies have suggested that phosphoinositides are involved in many important cellular activities, such as osmotic regulation [31], plant defense responses [32], vesicle trafficking [33,34], pollen tube growth [35], and responses to stress and hormonal treatments [36][37][38][39][40]. However, much less is known about phosphoinositide phosphatases in plants. Although SAC phosphatases are essential regulators of the PI signaling network, few studies have described them or their possible biochemical and cellular functions in plants. In Arabidopsis, truncation of AtSAC1 has been shown to cause defects in cell morphogenesis and cell wall synthesis [41]. Gene expression analysis demonstrated that AtSAC6 was predominantly expressed in flowers and that its expression was highly induced by salinity [42]. AtSAC7 has been shown to be involved in root hair growth [43]. Moreover, AtSAC2-AtSAC5, which have been characterized as a previously unknown subgroup of tonoplast-associated enzymes, were recently found to be involved in vacuolar morphology.
To characterize the molecular biology and evolution of the cotton SAC family and to understand its possible functions, it is necessary to identify its members and determine their expression patterns. In this report, we show that the G. hirsutum genome encodes 24 SAC domain-containing proteins, all of which belong to the class of Sac1p-like SAC proteins. We analyzed their gene structures, chromosomal locations, evolutionary relationships and expression patterns. The present analysis shows that the GhSAC proteins fall into three subgroups based on their sequence homology and phylogenetic relationships. This is the first study to undertake a genome-wide analysis of GhSACs. These results provide valuable information on SAC genes in G. hirsutum and supply a framework for further studies to better understand the potential functions of SAC genes in cotton plants.

Results
Identification of SAC domain-containing proteins
HMMER searches were performed against the T. cacao, V. vinifera, G. hirsutum, G. raimondii and G. arboreum protein databases with the SAC domain (PF02383) as a query. As a result, 6, 6, 29, 17 and 13 putative SAC genes, respectively, were initially identified. Meanwhile, all Arabidopsis SAC protein sequences were used as queries for TBLASTN. We checked all the sequences with the InterPro online tool to confirm the presence of the SAC domain.
Ultimately, 6, 6, 24, 12 and 12 SAC domain-containing proteins were identified from the above five genomes, respectively. All SAC genes in G. hirsutum are designated GhSAC and named according to the order of the closest orthologues in Arabidopsis [44]. The accession number, chromosome distribution, protein molecular weight and length of the GhSAC genes are listed in Table 1. Comparison among the three closely related species showed an obvious expansion of the SAC gene family in G. hirsutum.
Table 2 The cis-element analysis of GhSAC promoters
We constructed a phylogenetic tree from a multiple alignment of SAC protein sequences, comprising 6 TcSACs from T. cacao, 10 VviSACs from V. vinifera, 12 GaSACs from G. arboreum, and 9 AtSACs from Arabidopsis. The phylogenetic analysis revealed the evolutionary origin of these genes as well as more recent duplications. The SAC proteins clustered into three groups (Fig. 1), as previously suggested [42].
Genes from these species are found in all three groups, suggesting that the higher plant species have at least one gene in each of the three groups.
Our phylogenetic reconstruction showed that the SAC family in cotton diversified after the divergence of cotton and Arabidopsis from their common ancestor, because G. arboreum has clearly more group I and group II SAC genes than Arabidopsis. Most of the SAC proteins from the diploids had orthologs in the allotetraploid G. hirsutum, which derived from hybridization of A-genome and D-genome ancestors (Additional file 1). The short branches separating the paralogs suggest that the hybridization event occurred relatively recently [45].

Chromosome Localization And Synteny Analysis Of SAC Genes
To determine the chromosome distribution and gene duplication of the SAC genes, all the SAC genes in G. hirsutum were mapped to their approximate chromosome positions (Fig. 2). The 24 GhSAC genes were distributed unevenly among 17 chromosomes. Except for A1, A3, A8, A11, A12, D1, D8, D11 and D12, all chromosomes harbor at least one SAC gene. Twelve SAC genes were located in the A subgenome and twelve in the D subgenome.
To further infer the phylogenetic mechanisms of the SAC family, we constructed syntenic maps of T. cacao with G. raimondii and V. vinifera (Fig. 3). A total of 7 GrSAC and 4 TcSAC genes showed syntenic relationships with those in T. cacao and V. vinifera, respectively. TcSAC2 and TcSAC4 were each associated with more than one syntenic gene pair between G. raimondii and T. cacao, suggesting that these genes may have played an important role in the evolution of the SAC gene family. In addition, the VviSAC9/TcSAC9 gene pair identified between T. cacao and V. vinifera was not found between G. raimondii and T. cacao, which may indicate that this ortholog was lost after the divergence of G. raimondii and T. cacao from their common ancestor.

Gene structures and conserved domain of GhSACs
Gene structure analysis is important for studying genetic evolution. First, we mapped the domain structure using IBS software (version 1.0) (Fig. 4). Then, to understand the evolutionary relationships of the SAC proteins in G. hirsutum, we constructed an unrooted tree based on the alignment of full-length SAC protein sequences using the MJ method in MEGA X. The 24 SAC proteins in G. hirsutum were divided into three distinct groups (I to III). Group I contains the largest number of GhSACs (14), while group III contains only four. The genomic sequences of the GhSAC genes ranged from 4195 bp to about 17 kb. To obtain further gene structure information, we compared the coding sequence with the genomic sequence of each GhSAC gene (Fig. 5). The number of introns varied from 6 to 19 among the GhSAC genes, and the genes with the most introns were in group II. GhSAC genes within the same group exhibited similar structures. We used MEME to detect conserved motifs in the GhSAC family. Twenty conserved motifs were distributed among the GhSAC family members, with some differences between the groups (Fig. 5). All of the GhSAC proteins shared the same three motifs, M1, M2 and M3, which together compose the SAC domain that is characteristic of all GhSAC family members.
The SAC domains of yeast and animal SAC proteins are approximately 400 amino acids in length and consist of seven highly conserved motifs that appear to be important for phosphatase activity [1]. To examine in detail the motif organization of the SAC domains of the GhSAC proteins, we compared the SAC domain sequences of Sac1p and the GhSAC proteins and generated sequence logos of the seven conserved motifs with the WebLogo online tool (Fig. 6A). Logos were also generated for the characteristic transmembrane motifs that follow the SAC domains in the Group II GhSAC proteins, except GhSAC6.1A (Fig. 6B). Sequence analysis showed that the GhSAC proteins, except those in Group III, contain all seven conserved motifs found in Sac1p (Additional file 3).
The sixth conserved region contains a highly conserved CX5R(T/S) motif, which was identified in previous reports as the catalytic motif in many metal-independent protein and inositide polyphosphate phosphatases. The putative catalytic core sequence RXNCXDCLDRTN located in motif VI is completely conserved among the GhSAC proteins (except those in Group III). This result suggests that GhSAC proteins may have SAC domain functions similar to those of the yeast and animal proteins.
In addition, we found that SAC proteins in subgroup III seemed to lack motif VII; in its place is a putative WW domain. WW domains have been shown to be involved in protein-protein interactions by recognizing Pro-containing ligands [46], and they are considered to be the smallest protein domain involved in protein-protein interactions. The WW domain is a short conserved region found in a number of unrelated proteins, which folds as a stable, triple-stranded beta-sheet. This short domain of approximately 40 amino acids may be repeated up to four times in some proteins [47][48][49]. The name WW or WWP derives from the presence of two signature tryptophan residues that are spaced 20-23 amino acids apart and are present in most WW domains known to date, as well as that of a conserved Pro. It is frequently associated with

Cis-element analysis in the promoter regions of GhSAC genes
To identify putative cis-acting regulatory elements, the 2000 bp of sequence upstream of the start codon of each gene was isolated. We identified 44 different regulatory elements in the promoter regions of GhSACs, which fell into two main types: light-responsive elements and hormone-responsive elements (Table 2). Light-responsive elements, including Box 4, G-Box, GT1-motif, GATA-motif and MRE, were enriched in the upstream promoter regions of GhSAC genes. Box 4, part of a conserved DNA module involved in light responsiveness, was the most abundant light-responsive element in the promoters of GhSAC genes: all genes except GhSAC6.1A contained at least one Box 4 element. In addition, 19 members contained a G-Box element, 17 contained a GT1-motif element and 15 contained a GATA-motif element. We therefore hypothesized that light could induce the expression of GhSAC genes through these light-responsive cis-acting elements, thereby regulating the balance between reproductive and vegetative growth.
The other important type of cis-acting element in the upstream regions of GhSAC genes is the plant hormone-responsive element. In total, nine types of elements were found that respond to five kinds of plant hormones. These regulatory elements included ABA-responsive elements (ABREs), MeJA-responsive elements (TGACG-motifs and CGTCA-motifs), salicylic acid-responsive elements (TCA-elements) and auxin-responsive elements (TGA-elements). This indicates that GhSAC genes may respond to ABA, SA and JA.

Expression profile of GhSACs
To understand the expression patterns of these 24 GhSAC genes in G. hirsutum, we used publicly available transcriptome data to assess their expression in different tissues and organs. The analysis (Fig. 7) revealed that four GhSAC genes (GhSAC2.1A/GhSAC2.1D/GhSAC4.2A/GhSAC4.2D) were predominantly expressed in flowers, whereas the expression of the other genes did not differ significantly among tissues, and two genes (GhSAC2.2A/GhSAC2.2D) were not expressed in any tissue or organ. In addition, the expression of GhSAC genes was not significantly altered under different abiotic stress conditions, i.e. cold, heat, salt and drought (Additional file 6). We also performed RT-PCR to confirm the expression levels of four GhSACs in different tissues, including roots, stems, leaves, bracts, sepals, receptacles, petals, pistils and anthers. Because the CDSs of these GhSACs are highly similar between the A subgenome and the D subgenome, primers were designed to detect the transcripts of both the A- and D-subgenome copies. As shown in Fig. 8, GhSAC2.1 and GhSAC4.2 were predominantly expressed in stigmas and stamens, with little expression in other organs, while GhSAC7.1 and GhSAC9.2 were expressed in all organs examined. All these genes had relatively low expression in roots, stems and leaves. These results suggest that GhSAC genes have diverse expression patterns and that some genes may play dominant roles in particular organs.

Discussion
With the growth of genome research, comparative genomics methods are widely used to study gene families in many species. The first SAC domain-containing protein, the phosphoinositide phosphatase Sac1p, was identified in yeast (Saccharomyces cerevisiae). Although several other SAC domain-containing proteins from animals possess phosphoinositide phosphatase activities in vitro, their cellular functions remain unknown, and even less is known about these proteins in plants [29,30]. Nine SAC genes have been identified in Arabidopsis [42], five in yeast [1] and five in humans [30]; by contrast, the G. hirsutum genome has 24 members of the SAC gene family, clearly more than any of these. Zhong reported a genome-wide analysis of the SAC gene family in Arabidopsis [42], discussing the number, classification and structure of the genes and presenting a basic analysis of the conserved motifs in SAC proteins.
In this study, we identified 24 SAC genes in the G. hirsutum genome, of which 12 belong to the A subgenome and 12 to the D subgenome. Compared with the SAC families of other plants (9 genes in Arabidopsis, 6 in T. cacao, 6 in V. vinifera, 12 in the A genome and 12 in the D genome), the GhSAC family is the largest, with 24 genes. The striking expansion and diversification of the GhSAC family probably suggests that these SACs, like Sac1p in yeast, play crucial roles in physiological maintenance in G. hirsutum. We also noticed that the number of SAC genes in the AD genome equals the sum of those in the A genome and the D genome. This result may be associated with gene duplication during the evolution of the AD genome from its diploid ancestors.
Although genes within a family evolve through multiple mechanisms, a comprehensive phylogenetic and structural analysis can offer insight into the evolutionary origins of, and relationships among, different isoforms [50]. Based on previous analyses of sequence similarity and phylogenetic relationships [42], the AtSAC proteins have been divided into three subgroups. Our phylogenetic analysis of Arabidopsis, T. cacao, V. vinifera and cotton genes corroborated this classification and indicated that higher plant species have at least one gene in each of the three groups.
Existing research has demonstrated that the SAC domains of several proteins from yeast and humans exhibit different specificities toward different phosphoinositides. For example, Sac1p exhibits a broad-specificity phosphatase activity capable of hydrolysing phosphate from PI(3)P, PI(4)P and PI(3,5)P2 [16][29][51], whereas hSac2 possesses a 5-phosphatase activity toward PI(4,5)P2 and PI(3,4,5)P3 [30]. In plant cells, six forms of phosphoinositides have been detected [42]. Because GhSACs, except those in subgroup III, contain all seven conserved motifs that are believed to be important for the phosphatase activities of yeast and animal SAC proteins, we speculate that GhSACs may function as phosphoinositide phosphatases. Moreover, the fact that the G. hirsutum genome contains 24 SAC genes belonging to three subgroups suggests that different GhSACs might possess different substrate specificities and may therefore regulate the metabolism of different phosphoinositides in the phosphoinositide pool, which in turn influences diverse cellular processes [42]. Definite proof of such activity awaits the biochemical and functional characterization of the GhSAC proteins.
Gene expression patterns can provide important clues about gene function, and are believed to be associated with divergence in the promoter region [52]. Cis-acting regulatory elements in gene promoter regions play key roles in the developmental regulation of gene expression. A total of 20 different types of light-responsive element were identified in the promoter regions of the GhSAC gene family via cis-element analysis. Light-responsive elements were abundant in the promoters of GhSACs in each group, but their number varied greatly among the promoters of group I GhSACs, from a maximum of 19 to a minimum of only 2, which was not observed in the other groups. We therefore speculate that the GhSAC family genes are generally sensitive to light but respond to it differently. In addition, plant hormone-responsive elements were enriched in the upstream promoter regions of GhSAC genes in group I and group III. ABA-responsive elements and MeJA-responsive elements were the most abundant hormone-responsive cis-acting elements in the promoters of GhSAC genes. This indicates that GhSAC genes in group I and group III may be more sensitive to ABA and JA than genes in group II.
Gene expression analysis suggests that different GhSACs may play specific roles in particular organs or tissues. It is apparent that none of the GhSAC genes is expressed in leaves and that four genes (GhSAC2.1A/GhSAC2.1D/GhSAC4.2A/GhSAC4.2D) are predominantly expressed in flowers, suggesting that these proteins may function mainly in flowers. The other GhSAC genes showed overlapping expression profiles and were expressed in different organs and tissues, except leaves, without apparent differences. Further investigation of each GhSAC protein in distinct organs and tissues will help to clarify the roles of GhSAC proteins in plant growth and development. Previous reports on SACs in Arabidopsis show that AtSAC6 may function mainly in flowers. Intriguingly, however, these four genes belong to subgroup I, whereas AtSAC6 belongs to subgroup II, and the GhSACs in subgroup II did not exhibit differential expression among organs or tissues.

Conclusion
Through genome-wide analysis of SAC domain-containing genes in G. hirsutum, 24 GhSAC genes were identified. The GhSAC proteins were classified into three subgroups and showed clear orthologous relationships with SAC members of Arabidopsis, G. arboreum and G. raimondii. Our expression analysis shows that GhSAC2.1A, GhSAC2.1D, GhSAC4.2A and GhSAC4.2D are predominantly expressed in flowers, where these proteins may play their main roles. The genomic and bioinformatic analyses of GhSAC genes in this study provide a solid foundation for further investigation of the cellular functions of GhSAC genes.
Nine SAC proteins have previously been reported in Arabidopsis [42]. The Arabidopsis SAC domain-containing protein sequences were downloaded from TAIR (http://www.arabidopsis.org/) and used as queries for BLASTP searches against the T. cacao, V. vinifera, G. hirsutum, G. raimondii and G. arboreum genomes. All resulting sequences were then submitted to InterPro (http://www.ebi.ac.uk/interpro) to exclude sequences without a complete SAC domain.
Chromosomal location, synteny and phylogenetic analysis of SACs
All the GhSAC genes were mapped to the G. hirsutum chromosomes according to their approximate position information. MCScanX software (http://chibba.pgml.uga.edu/mcscan2/) was used for synteny analysis between the GhSAC genes and the GrSAC and GaSAC genes. Local BLAST+ software was used to perform the BLASTP analysis between G. hirsutum and G. raimondii and G. arboreum with an e-value below 1e-5. The positions of the SAC domain-containing genes and the BLAST output were imported into MCScanX and the Dual Synteny Plotter software to display the synteny relationships. Multiple sequence alignment of SAC domain-containing protein sequences from T. cacao, V. vinifera, G. hirsutum, G. raimondii, G. arboreum and Arabidopsis thaliana was performed using MEGA X with default parameters. A phylogenetic tree of the deduced amino acid sequences was constructed using the maximum likelihood (ML) method in MEGA X. Structural information on the SAC genes, including chromosomal location and gene length, was obtained from the Phytozome, Cotton Omics Database and CottonGen databases. The domain structures were drawn with IBS software (version 1.0), and sequence logos were created using the WebLogo online software (http://weblogo.threeplusone.com/). The exon/intron structure of each GhSAC gene was displayed with the Gene Structure Display Server (http://gsds.cbi.pku.edu.cn/index.php) by comparing the coding sequence with the genomic sequence. Conserved motifs were predicted using the MEME online program (http://meme-suite.org/) with the following parameters: number of unique motifs, 20; maximum and minimum search widths, 50 and 6, respectively.
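The e-value cutoff applied to the BLASTP results can be illustrated with a short filter. This is a sketch under stated assumptions: the text does not say which BLAST+ output format was used, so the standard 12-column tabular format (-outfmt 6, with the e-value in the eleventh column) is assumed here, and the function name and gene identifiers are hypothetical.

```python
def filter_blast_hits(tabular_lines, max_evalue=1e-5):
    """Keep (query, subject, e-value) for hits with e-value below max_evalue.

    Assumes BLAST+ tabular output (-outfmt 6): tab-separated fields in the
    order qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend,
    sstart, send, evalue, bitscore.
    """
    hits = []
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        evalue = float(fields[10])
        if evalue < max_evalue:
            hits.append((fields[0], fields[1], evalue))
    return hits


# Hypothetical hits (illustration only, not real alignment results).
example = [
    "GhSAC_demo1\tGrSAC_demo1\t98.5\t800\t12\t0\t1\t800\t1\t800\t0.0\t1500",
    "GhSAC_demo1\tGrSAC_demo2\t35.0\t120\t70\t3\t10\t130\t5\t124\t2e-3\t45.1",
]
print(filter_blast_hits(example))  # [('GhSAC_demo1', 'GrSAC_demo1', 0.0)]
```

The same cutoff can usually be applied at search time through the BLAST+ `-evalue` option; filtering afterwards, as here, simply makes the threshold explicit.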

Retrieval And Analysis Of Promoter Sequences
The promoter sequences (2 kb upstream of the start codon) of the GhSAC genes were retrieved from the G. hirsutum genome sequence. The GhSAC promoters were analyzed using the PlantCARE database (http://bioinformatics.psb.ugent.be/webtools/plantcare/html/) [53].
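Retrieving the region upstream of a start codon depends on the strand of the gene: for a minus-strand gene the promoter lies at higher chromosome coordinates and must be reverse-complemented. The sketch below illustrates this logic only; it is not the procedure used in this study, the function names are hypothetical, coordinates are assumed to be 1-based and inclusive, and the toy sequences stand in for a chromosome.

```python
def reverse_complement(seq):
    """Reverse-complement a DNA string containing only A, C, G, T or N."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def upstream_region(chrom_seq, codon_start, codon_end, strand, length=2000):
    """Return up to `length` bp upstream of a start codon, in gene orientation.

    codon_start and codon_end are the 1-based inclusive chromosome
    coordinates of the start codon; strand is "+" or "-".
    """
    if strand == "+":
        begin = max(0, codon_start - 1 - length)
        return chrom_seq[begin:codon_start - 1].upper()
    if strand == "-":
        end = min(len(chrom_seq), codon_end + length)
        return reverse_complement(chrom_seq[codon_end:end])
    raise ValueError("strand must be '+' or '-'")


# Toy chromosomes: ATG at positions 7-9 on the plus strand, and CAT
# (the reverse complement of ATG) at positions 4-6 for a minus-strand gene.
print(upstream_region("AACCGGATGTTT", 7, 9, "+", length=4))  # CCGG
print(upstream_region("AAACATGGTC", 4, 6, "-", length=4))    # GACC
```

When a gene sits closer than `length` to a chromosome end, the function returns the shorter sequence that is available rather than failing.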

Expression Profiles of GhSAC genes
The expression levels of GhSAC genes in different organs or tissues and under different stresses (cold, heat, salt and drought) were downloaded from the Cotton Omics Database (COD) (http://cotton.zju.edu.cn/).

Conflict of Interests
The authors declare no conflict of interest.