Evolutionary Analysis of the Steroidogenic Acute Regulatory Protein-Related Lipid Transfer Domain and Its Response to Salt Stress in Vitis vinifera

This study aimed to enhance the understanding of the steroidogenic acute regulatory protein-related lipid transfer (START) domain in Vitis vinifera. A total of 23 members of the VvSTARD gene family were found, which could be divided into five groups. The analyses of codon preference, selective pressure, and tandem duplication events of the VvSTARD, AtSTARD, and OsSTARD gene families indicated that tandem duplication events occurred in the grape, Arabidopsis, and rice genomes. Eight lipid transporter proteins were found in the tertiary structure of the STARD gene family in grapes. The analysis of the microarray expression profiles of the three species showed that STARD genes in the same subgroup had similar expression sites and similar responses to abiotic stress. In addition, quantitative real-time polymerase chain reaction (qRT-PCR) was used to analyze the expression of the STARD gene family in grape leaves in response to different hormones and abiotic stresses, and the results were consistent with those predicted by the cis-elements and the expression profiles. Furthermore, 35S:STARD5:EGFP was successfully constructed to verify the subcellular prediction results, and the results showed that STARD5 was located in the nucleus. Through the identification of salt tolerance of transgenic tomato, STARD5 was found to regulate the salt stress response of plants. Collectively, these data indicated that the VvSTARD gene family plays an important role in response to salt stress. Twenty-three members of the STARD gene family were identified in grapes. In addition, overexpression of STARD5 from Vitis vinifera in "Micro Tom" tomato increased tolerance to salt stress.
VvSTARD14, VvSTARD17, VvSTARD18, VvSTARD20, and VvSTARD23), 2 (VvSTARD16 and VvSTARD19), 1 (VvSTARD12), 4 (VvSTARD10, VvSTARD16, VvSTARD17, and VvSTARD22), 2 (VvSTARD17 and VvSTARD20), 1 (VvSTARD20), and 6 (VvSTARD9, VvSTARD10, VvSTARD11, VvSTARD13, VvSTARD18, and VvSTARD19) proteins were observed in the cytoplasm, plasma membrane, cytoskeleton, mitochondria, extracellular matrix, Golgi apparatus, and vacuole, respectively. These results indicated that the STARD transcription factor gene family was a relatively conserved gene family.


Introduction
In agricultural production, studying the molecular mechanisms and functions of stress-related genes is critically important for improving crop quality and yield. Salt stress is an important constraint on crop quality and yield, particularly in grapes. Although many studies have reported on the mechanism of plant salt tolerance (Cheong et al. 2003; Shi et al. 2003; Cao et al. 2007), many of the genes involved have not yet been identified and studied.
The steroidogenic acute regulatory protein-related lipid transfer (START) domain, which was first discovered in mammals, has a 210-amino-acid conserved sequence that forms an α/β helix-grip structure, thereby creating a hydrophobic cavity that binds to the ligand and small globular modules (Roderick et (Ingram et al. 2000; Nakamura et al. 2006). The HD-Zip IV gene OsHDG11 can improve the drought tolerance and increase the grain yield of transgenic rice plants (Yu et al. 2013). Promoter analysis shows that the HD-Zip III genes may be involved in responses to light, hormones, abiotic stressors, and stem development of the HD-Zip family, but this analysis fails to verify the function of such genes (Li et al. 2019).
Although the HD-Zip gene family has been studied (Li et al. 2017a), few studies have focused on the response of START domain-containing HD-Zip III and HD-Zip IV genes to abiotic stresses. In addition, genes that contain only the START domain have not been reported in grapes. Furthermore, previous research has reported that the START domain associates with the pleckstrin homology (PH) domain at the same site used for the PH domain membrane binding and confers the functional regulation of the ceramide transfer (CERT) protein (Prashek et al. 2017). The EDR2 gene was identified and may serve as an important entry point for understanding the function of plant PH and START domains and possible links amongst lipid signaling, the mitochondria, and the activation of programmed cell death (Nie et al. 2011). The EDR2 gene is associated with the regulation of plant defense responses in Arabidopsis (Tang et al. 2005; Nie et al. 2011). AtAPOSTART1 is an Arabidopsis PH-START domain protein involved in seed germination (Resentini et al. 2014). Nevertheless, studies on the role of PH-START domain genes in the abiotic stress response of grapes are not available.
HD-Zip III and HD-Zip IV proteins containing the START domain have been widely examined in plants, but studies on the role of such proteins in abiotic stress resistance are limited. Moreover, the function of PH-START or START domain proteins in plants, particularly grapes, under abiotic stress is still not understood. Therefore, this study focused on identifying the STARD gene family and verifying the tolerance of the members of this family to salt stress in grapes. The phylogenetic tree, intragenomic and extragenomic tandem repeat events, selective pressure, and codon preference are analyzed to predict the evolutionary relationship amongst grapes, Arabidopsis, and rice. Quantitative real-time polymerase chain reaction (qRT-PCR) is conducted on the 23 identified VvSTARD genes to verify their expression in grapes in response to different hormones and abiotic stresses. RNA is extracted from "Pinot Noir" grape leaves, and STARD5 is amplified to verify the tolerance of the family to salt stress. STARD5 is used for subcellular localization in Arabidopsis protoplasts and for the genetic transformation of "Micro Tom" tomato plants. These findings will lay a solid foundation for further investigations into the molecular mechanism of the STARD gene in grape salt stress resistance.
The START conserved domain was used as the query to perform the BLASTP analysis (E < 10⁻¹⁰). HMMER (https://www.ebi.ac.uk/Tools/hmmer/) and Pfam (http://pfam.xfam.org/) (Potter et al. 2018; El-Gebali et al. 2019) were used to confirm the sequence accuracy. Genes without the START domain were removed, and VvSTARD genes were identified (Fig. S1). A total of 23 STARD genes were obtained from the grape gene database and named in accordance with the conserved domains and the position of the genes on the chromosome (Table 1). The Arabidopsis and rice STARD genes were named in the same way (Table S1). The physicochemical properties of the VvSTARD proteins, such as molecular weight (MW), isoelectric point (pI), grand average of hydropathicity (GRAVY), aliphatic index, and instability index, were obtained from ExPASy (https://www.expasy.org/) (Wilkins et al. 1999).
Phylogenetic clustering, and gene structural and protein conserved motif analysis
The multiple sequence alignment of the STARD genes of Arabidopsis, rice, and grapes was conducted using ClustalX 2.0 (Conway Institute, University College Dublin, Dublin, UK) (Larkin et al. 2007). MEGA 7.0 (Pennsylvania State University, State College, PA, USA) was used to perform phylogenetic clustering (Kumar et al. 2016) with the neighbor-joining (NJ) method and the Poisson model. Gaps were treated by complete deletion, and branch support was assessed with 1000 bootstrap replicates with a random seed. GSDS 2.0 (http://gsds.cbi.pku.edu.cn/) was used to analyze gene structures, namely, exons and introns (Hu et al. 2015). MEME online software (http://meme-suite.org/) was used to predict the conserved motifs of the proteins (Bailey et al. 2009), and the number of motifs was set to 20.
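To illustrate the distance step that underlies an NJ tree built with the Poisson model and complete deletion, the following minimal Python sketch removes every gapped column from a toy alignment and then computes the Poisson-corrected distance d = −ln(1 − p), where p is the proportion of differing sites. The function names and the toy sequences are hypothetical, and this is only a sketch of the standard formula, not the MEGA implementation.

```python
import math


def complete_deletion(alignment):
    """Drop every alignment column that contains a gap in any sequence."""
    lengths = {len(seq) for seq in alignment}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")
    keep = [i for i in range(lengths.pop())
            if all(seq[i] != "-" for seq in alignment)]
    return ["".join(seq[i] for i in keep) for seq in alignment]


def poisson_distance(seq_a, seq_b):
    """Poisson-corrected amino acid distance: d = -ln(1 - p)."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    # p is the proportion of sites at which the two sequences differ
    p = sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)
    if p >= 1.0:
        raise ValueError("distance is undefined when every site differs")
    return -math.log(1.0 - p)


if __name__ == "__main__":
    # Hypothetical toy alignment, not taken from the STARD data set
    trimmed = complete_deletion(["MKT-LV", "MKTALV", "MRTAL-"])
    print(trimmed, poisson_distance(trimmed[0], trimmed[2]))
```

A full distance matrix would be obtained by applying poisson_distance to every pair of trimmed sequences before tree construction.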

Analysis of the STARD gene duplication and the Ka/Ks in grapes
For the synteny analysis, the MCScanX algorithm was used to detect the synteny or the collinearity (Wang et al. 2012), and the diagram was drawn via TBtools (Chen et al. 2018). The nonsynonymous/synonymous (Ka/Ks) values of duplicate gene pairs or triplicate gene groups (between any two genes in one triplicate gene group) were calculated through DnaSP 6.0, an application released by Universitat de Barcelona.

Codon usage bias analysis
Codon bias refers to the unequal use of synonymous codons for an amino acid (Hershberg et Wang et al. 2018). The coding sequences of the STARD genes were used to determine the codon adaptation index (CAI), codon bias index (CBI), frequency of optimal codons (Fop), relative synonymous codon usage (RSCU), GC content, and GC content at the third site of synonymous codons (GC3s) by using the software CodonW 1.4.2 (http://codonw.sourceforge.net) (Wang et al. 2018). The R language was used to analyze the correlation amongst T3s, C3s, A3s, G3s, GC, GC3s, L_sym, L_aa, GRAVY, and Aromo.

Subcellular localization and secondary and tertiary structure analyses
WoLF PSORT (https://wolfpsort.hgc.jp/) was used to predict the subcellular localization of the VvSTARD proteins. The NPS@: SOPMA secondary structure prediction (https://npsa-prabi.ibcp.fr/cgi-bin/npsa_automat.pl?page=npsa_sopma.html) was used for predicting the secondary structure. SWISS-MODEL (http://www.expasy.org/swissmod/) was used to predict the 3D structure of some atypical HDs, and 3D structure figures were prepared using PyMOL software (DeLano 2002, The PyMOL molecular graphics system. http://www.pymol.org).

cis-Element and expression analyses of STARD genes in grapes
The 2000 bp promoter sequence upstream of the coding region of each VvSTARD gene was obtained from the grape genome website, and the PlantCARE online site was used to predict and analyze the gene promoter elements (Lescot et al. 2002; Wang et al. 2016). cis-Element diagrams were constructed via GSDS 2.0 (http://gsds.cbi.pku.edu.cn/) (Hu et al. 2015). Expression data were retrieved from the GEO database (Affymetrix GeneChip 16K Vitis vinifera Genome Array) (Wang et al. 2018), and selected data on the "Cabernet Sauvignon" grape under different abiotic stresses (accession number: GSE31594) were downloaded from the GEO database (http://www.ncbi.nlm.nih.gov/geo/). The expression data of the STARD genes were extracted for grapes, and tissue expression data for grape, Arabidopsis, and rice were retrieved from the Bio-Analytic Resource for Plant Biology (BAR, https://bar.utoronto.ca/) databases. In addition, stress expression data for Arabidopsis and rice were retrieved from the BAR databases. Heat maps were drawn using TBtools (Chen et al. 2018).
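The promoter extraction step can be sketched as follows. This minimal Python example assumes 1-based inclusive CDS coordinates and a chromosome sequence held in memory; the function names, the coordinate convention, and the toy sequence are assumptions made for illustration and do not describe the tool actually used to retrieve the grape promoters.

```python
# Complement table for DNA, including N for unknown bases
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def upstream_region(chrom_seq, cds_start, cds_end, strand, length=2000):
    """Return up to `length` bp upstream of a CDS, read 5'->3' on the gene strand.

    cds_start and cds_end are 1-based inclusive coordinates on the forward
    strand, with cds_start <= cds_end (an assumed convention).
    """
    if strand == "+":
        # Upstream lies to the left of the CDS start on the forward strand
        begin = max(0, cds_start - 1 - length)
        return chrom_seq[begin:cds_start - 1]
    if strand == "-":
        # Upstream lies to the right of the CDS end; flip to the gene strand
        end = min(len(chrom_seq), cds_end + length)
        return reverse_complement(chrom_seq[cds_end:end])
    raise ValueError("strand must be '+' or '-'")


if __name__ == "__main__":
    toy_chromosome = "ACGTTTGACCATGGGA"  # hypothetical sequence
    print(upstream_region(toy_chromosome, 11, 13, "+", length=4))
```

The region is truncated at the chromosome ends when fewer than `length` bases are available, so genes close to a scaffold boundary yield shorter promoters.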
Plant materials, treatments, and RNA isolation
V. vinifera "Pinot Noir" test-tube seedlings were used for qRT-PCR and cultured in the Fruit Tree Physiology and Biotechnology Laboratory of Gansu Agricultural University. Single-shoot stem segments of the test-tube seedlings were placed on solid GS medium (modified B5 solid medium) and cultured under white LED light for 35 days. The grape seedlings were treated with 0.2 mmol l⁻¹ of abscisic acid (ABA), 150 μmol l⁻¹ of methyl jasmonate (MeJA), 50 mg l⁻¹ of salicylic acid (SA), 100 μmol l⁻¹ of indole acetic acid (IAA), 50 mg l⁻¹ of gibberellin 3 (GA3), 10% PEG6000, and 400 mmol l⁻¹ of NaCl at low temperature (4 °C) for 12 and in liquid N2, and stored at −80 °C for RNA extraction and gene expression.
Cotyledons of newly germinated "Micro Tom" tomato were used for transformation with the STARD5 gene, and 3-week-old seedlings were used for the salt tolerance assay.
For the salt stress assay, the transgenic tomato was watered every 3 h with 400 mmol l⁻¹ of NaCl, and the control was supplemented with the same volume of distilled water. Three biological replicates were used for each treatment, and fresh tomato leaf samples (0.1 g) were collected. The relative electrical conductivity and the proline and malondialdehyde contents of tomato leaves were determined using a commercial ELISA kit (Jiangsu Keming Biotechnology Institute, Suzhou, China) in accordance with the manufacturer's protocol.
The Spectrum Plant Total RNA Kit (Sigma, St. Louis, MO, USA) was used to extract the RNA. The M-MLV Reverse Transcriptase (RNase H−) kit (Takara Bio, Inc., Japan) was used for the synthesis of complementary DNA (cDNA). The purified total RNA (0.5-2 μg) was reverse transcribed into first-strand cDNA and used for qRT-PCR. Subsequently, TaKaRa SYBR Premix Ex Taq II (Takara Bio, Inc., Japan) was used for qRT-PCR (Light Cycler 96 Real-Time PCR System, Roche, Basel, Switzerland). The cycling parameters were 95 °C for 30 s, followed by 40 cycles of 95 °C for 5 s and 60 °C for 30 s. For melting curve analysis, a program consisting of 95 °C for 15 s followed by a constant increase from 60 °C to 95 °C was included after the PCR cycles. VvGAPDH (GenBank accession no. CB973647) and SlActin (GenBank accession no. NM_001330119) were used as internal reference genes. The primer sequences are presented in Table S2. The relative expression levels of the genes were calculated using the 2^(−ΔΔCT) method (Willems et al. 2008), and images were drawn using the Origin 9.0 software.
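The 2^(−ΔΔCT) calculation can be sketched in a few lines of Python. Here ΔCT is the CT of the target gene minus the CT of the reference gene within one sample, and ΔΔCT is the ΔCT of the treated sample minus the ΔCT of the control sample. The function name and the CT values in the example are hypothetical and are not measurements from this study.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene by the 2^(-ddCT) method."""
    # Normalize the target gene to the reference gene within each sample
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    # Compare the treated sample with the untreated control
    delta_delta = delta_treated - delta_control
    return 2.0 ** (-delta_delta)


if __name__ == "__main__":
    # Hypothetical CT values for a target gene and a reference gene
    print(relative_expression(22.0, 18.0, 25.0, 19.0))
```

The method assumes that the target and reference amplicons amplify with approximately equal, near-100% efficiency; in practice the mean CT of technical replicates would be passed in for each argument.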

Subcellular localization and identification of the heterologous expression of STARD5
Enhanced green fluorescent protein (EGFP) fusion vectors containing STARD5 fused to the N-terminus of EGFP and driven by the 35S promoter were constructed to investigate the subcellular localization of STARD5. The coding sequence of STARD5 was amplified and inserted into pBI221-EGFP by using the NovoRec® PCR One Step Cloning Kit (Novoprotein Scientific Inc., China). Constructs were transferred to Arabidopsis protoplasts, and the EGFP fluorescence was detected using confocal laser-scanning microscopy (Olympus FV1000 Viewer, Tokyo, Japan). Arabidopsis protoplasts were prepared in accordance with the method of Yoo et al. (2007).
"Micro Tom" tomato was used for the transformation of STARD5. The complete coding region of STARD5 was inserted downstream of the 35S promoter to construct 35S:STARD5:FLAG plasmids, which were introduced into the Agrobacterium strain GV3101. The Agrobacterium-mediated transformation of the "Micro Tom" leaves was performed as previously described (Ruf et al. 2001). The genomic DNA was extracted using the TransDirect Plant Tissue PCR Kit (Beijing Quantising Biotechnology Co., Ltd.), and positive plants were detected using gene-specific primers (35S-F: 5′-TGACGCACAATCCCACTATC-3′; STARD5-R: 5′-CGATGGTAGCGCTTCTTCTT-3′).

Statistical analysis
Data obtained from the qRT-PCR of three biological replicates were subjected to two-way ANOVA and Bonferroni's post-test for data comparison. Data analysis was conducted using IBM SPSS v.22 (IBM, Armonk, NY, USA). P < 0.05 indicated a significant difference, which was determined on the basis of the Duncan method. In graphs, significant differences were marked using different letters (a-f). Other data analysis methods are described in the corresponding figure and table captions.

Identification of the STARD genes in grapes
BLASTP was used to search for grape STARD proteins, with the Arabidopsis START domain homologous sequence as the query, and multiple sequence alignments were performed using DNAMAN to remove redundant sequences. A total of 23 STARD candidate genes were identified in the grape genome (12X) database. VvSTARD1-VvSTARD23, which were named on the basis of their conserved domains and chromosomal order (Table 1), were distributed across 12 chromosomes. The largest number of genes was found on the second chromosome, and only one gene was located on each of the 5th, 6th, 11th, 16th, and 17th chromosomes. One gene was distributed on the 9th chromosome, whereas five genes were distributed on the 4th, 10th, 12th, 13th, and 15th chromosomes. The coding sequences (CDSs) of the grape STARD genes, which encoded 237-886 amino acids, ranged from 714 bp (VvSTARD23) to 2658 bp (VvSTARD7). The MW of the VvSTARD proteins ranged from 26.77 kD (VvSTARD23) to 99.56 kD (VvSTARD7), showing large differences. The VvSTARD proteins had negative GRAVY values ranging from −0.466 to −0.077, indicating that they were hydrophilic. The predicted pI values of the VvSTARD proteins ranged from 5.60 (VvSTARD5) to 9.66 (VvSTARD22). Furthermore, 20 VvSTARD proteins (86.96%) had an instability index greater than 40, which predicts that these proteins are unstable.
Phylogenetic and structural analyses of the START domain proteins
STARD protein sequences were used to construct the phylogenetic tree in grapes, Arabidopsis, and rice (Fig. 1A). The phylogenetic distribution showed that the START domain proteins could be divided into five major subgroups (groups 1-5). Twenty members of the START domain protein family were included in group 1 (4, 8, and 8 members from grapes, rice, and Arabidopsis, respectively), which contained the structural START and HD domains. Eighteen members were included in group 2 (4, 4, and 10 members from grapes, rice, and Arabidopsis, respectively), which contained the structural START and HD domains. Eighteen members were included in group 3 (5, 8, and 5 members from grapes, rice, and Arabidopsis, respectively), which contained the structural START, HD, and MEKHLA domains. Thirteen members were included in group 4 (4, 2, and 7 members from grapes, rice, and Arabidopsis, respectively), which contained the structural START domain. Fourteen members were included in group 5 (6, 3, and 5 members from grapes, rice, and Arabidopsis, respectively), which contained the structural START, PH, and DUF1336 domains. The conserved sequences included START, HD, MEKHLA, PH domains, and DUF1336 sequences of the eight conserved domains (Fig. S1).
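As a quick consistency check on the subgroup membership reported above, the per-group counts can be tallied by group and by species. The short Python sketch below uses only the numbers stated in the text; the variable and function names are hypothetical.

```python
# Members per subgroup and species, as reported in the text
GROUP_MEMBERS = {
    1: {"grape": 4, "rice": 8, "Arabidopsis": 8},
    2: {"grape": 4, "rice": 4, "Arabidopsis": 10},
    3: {"grape": 5, "rice": 8, "Arabidopsis": 5},
    4: {"grape": 4, "rice": 2, "Arabidopsis": 7},
    5: {"grape": 6, "rice": 3, "Arabidopsis": 5},
}


def totals_by_group(table):
    """Sum the members of each subgroup across species."""
    return {group: sum(counts.values()) for group, counts in table.items()}


def totals_by_species(table):
    """Sum the members of each species across subgroups."""
    totals = {}
    for counts in table.values():
        for species, n in counts.items():
            totals[species] = totals.get(species, 0) + n
    return totals


if __name__ == "__main__":
    print(totals_by_group(GROUP_MEMBERS))
    print(totals_by_species(GROUP_MEMBERS))
```

The species totals agree with the 23 grape, 25 rice, and 35 Arabidopsis STARD genes analyzed in this study.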
Further analysis showed that members from the same subgroups had similar exon/intron structures and motifs. As shown in Fig. 1B, no gene with only a single exon was found in the entire VvSTARD gene family, and the number of exons ranged from 5 to 22. Moreover, 20 motifs of VvSTARD proteins (Figs. 1C and S2) were analyzed using the MEME online software to gain insights into the characteristic regions of the VvSTARD proteins. Six conserved motifs (motifs 1, 2, 3, 4, 5, and 13) were shared by groups 1, 2, and 3 of the VvSTARD protein family. Six motifs (motifs 8, 9, 11, 15, 17, and 18) were shared by groups 1 and 2. Four motifs (motifs 6, 7, 16, and 19) were shared by group 3, and three motifs (motifs 10, 12, and 14) were shared by group 5. However, no conserved motif of the VvSTARD protein family was observed in group 4. In addition, motif 16 was shared by groups 1 and 2. These results indicated that genes with very similar structures distributed in the same subgroups might have similar biological functions, whereas the genes distributed in different subgroups likely have different biological functions.
Codon preference analysis of VvSTARD, AtSTARD, and OsSTARD genes
Codons and related parameters in grapes, Arabidopsis, and rice were obtained and compared to further evaluate the evolutionary relationship of VvSTARD genes. The 23 VvSTARD, 35 AtSTARD, and 25 OsSTARD genes contained 15 989, 24 209, and 31 815 codons, respectively (including stop codons), of which 9916, 15 413, and 10 459, respectively, had RSCU > 1. Among the codons with RSCU > 1, those ending in A or U were preferred in the grape and Arabidopsis STARD gene families. A total of 2193, 4674, and 3049 codons ending in A, U, and G or C, respectively, were found in grapes, accounting for 22.12%, 47.14%, and 30.74%, respectively, of the total number of codons with RSCU > 1. In Arabidopsis, codons ending in A, U, and G or C accounted for 21.83%, 49.45%, and 28.72%, respectively, of the codons with RSCU > 1. However, in rice, codons ending in G and C accounted for 43.24% and 46.17%, respectively, of the codons with RSCU > 1, whereas codons ending in A or U accounted for only 10.59% (Fig. 2 and Table S4). Arabidopsis, respectively, none had an Nc value of less than 35. However, among the 25 rice STARD genes, six (OsSTARD5, OsSTARD6, OsSTARD7, OsSTARD8, OsSTARD10, and OsSTARD25) showed an Nc value of less than 35. The GC3 values in grapes ranged from 0.33 to 0.54, and the distribution was relatively concentrated. The GC3 values in Arabidopsis ranged from 0.29 to 0.49, and the distribution was relatively concentrated. The GC3 values in rice ranged between 0.37 and 0.94, and the distribution was relatively scattered. These findings showed that the codon usage preferences of the grape and Arabidopsis STARD gene families were strong and affected by selective pressure during evolution, whereas those of the rice STARD gene family were weak and affected by the mutation pressure during evolution.
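For reference, the RSCU statistic reported above is the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally, so that RSCU > 1 marks a codon used more often than expected. The minimal Python sketch below computes RSCU for a single coding sequence using the standard genetic code. The function name and the toy sequence are hypothetical, and CodonW, which was used in this study, may differ in details such as the handling of ambiguous codons.

```python
from collections import Counter, defaultdict

# Standard genetic code in the conventional TCAG ordering
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}


def rscu(cds):
    """Relative synonymous codon usage for one coding sequence.

    Stop codons and amino acids with a single codon (Met, Trp) are skipped,
    because synonymous usage is undefined for them.
    """
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore an incomplete trailing codon
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))

    families = defaultdict(list)
    for codon, amino in CODON_TABLE.items():
        families[amino].append(codon)

    values = {}
    for amino, family in families.items():
        if amino == "*" or len(family) == 1:
            continue
        total = sum(counts[codon] for codon in family)
        if total == 0:
            continue  # amino acid absent from this sequence
        for codon in family:
            # observed count / (total for the amino acid / number of synonyms)
            values[codon] = counts[codon] * len(family) / total
    return values


if __name__ == "__main__":
    print(rscu("GCTGCTGCCGCA"))  # hypothetical toy sequence of four Ala codons
```

For a gene family, codon counts would typically be pooled across all member sequences before the ratio is taken.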
Correlation analysis revealed that T3s had a negative correlation with C3s, G3s, GC3s, CBI, and Fop and that C3s had a positive correlation with CBI, Fop, GC, and GC3s in grapes, Arabidopsis, and rice (Fig. 2). These correlations were highly consistent in grapes and Arabidopsis but quite different from those in rice. For instance, T3s had a positive correlation with Nc in rice but a negative correlation with Nc in grapes and Arabidopsis. Nc had a negative correlation with CAI, CBI, and Fop in rice but a positive correlation with CAI, CBI, and Fop in grapes and Arabidopsis. Collectively, these results suggested that the genetic relationship between grapes and Arabidopsis was close.
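The sign of each correlation discussed above can be obtained from a Pearson coefficient computed over the per-gene codon indices. The correlations in this study were computed in R; the following Python sketch only illustrates the calculation, and the function name and the example values are hypothetical rather than taken from the supplementary tables.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of cross-products and squared deviations from the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical per-gene T3s and GC3s values
    t3s = [0.20, 0.25, 0.30, 0.35]
    gc3s = [0.50, 0.45, 0.40, 0.35]
    print(pearson_r(t3s, gc3s))
```

A negative coefficient, as for T3s against GC3s, means that genes richer in T at the third synonymous position tend to have lower GC3s.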

Chromosomal distribution and gene duplication analysis
As shown in Fig. 3A and Table S5-4, VvSTARD genes were unevenly distributed in four linkage groups (chr). The chr6/chr13 linkage group had two VvSTARD gene pairs. chr1, chr3, chr14, chr18, and chr19 had no syntenic VvSTARD genes. Gene duplication, through either segmental or tandem duplication, plays an important role in the expansion of a gene family during evolution (Holub 2001). In this study, tandem duplication genes, namely, VvSTARD14/VvSTARD15 and VvSTARD10/VvSTARD13, were discovered on chr6 and chr13, respectively. A pair of collinear genes (VvSTARD6/VvSTARD7) was observed on chr15 and chr16, and another pair (VvSTARD9/VvSTARD11) was found on chr4 and chr9. These results suggested that some VvSTARD genes might have been generated by gene duplication and that these duplication events were the primary driving force of VvSTARD evolution.
Three representative comparative syntenic maps of Arabidopsis, grapes, and rice were constructed to further infer the phylogenetic mechanisms of the VvSTARD family (Fig. 3B and Table S5-5). A total of 13, 14, and 9 STARD genes in grapes, Arabidopsis, and rice, respectively, showed a collinearity relationship. Amongst these genes, 15 were homologous pairs of the STARD genes in grapes and Arabidopsis, and 14 were homologous pairs of the STARD genes in grapes and rice. Some VvSTARD genes, such as VvSTARD7, were linked with three syntenic gene pairs, particularly between the grape and Arabidopsis STARD genes, and might play a critical role in the evolution of the STARD gene family. Some STARD collinear gene pairs found between grapes and Arabidopsis were located in highly conserved syntenic blocks. The phylogenetic relationship and codon preference analyses demonstrated that the evolutionary relationship between grapes and Arabidopsis might be close.
The modes of selection could be estimated using the ratio of the number of nonsynonymous substitutions per nonsynonymous site (Ka) to the number of synonymous substitutions per synonymous site (Ks). Ka/Ks > 1 indicated positive selection, Ka/Ks < 1 indicated purifying selection, and Ka/Ks = 1 indicated neutral evolution (Yang 2007). The Ka/Ks ratios of the STARD gene pairs of grapes, Arabidopsis, and rice were calculated to further understand the evolutionary relationship of the VvSTARD gene family (Fig. 4 and Tables S5-6, S5-7, and S5-8). A total of 202 homologous gene pairs were found in the grape STARD gene family (Fig. 4). A total of 79 pairs had Ka/Ks > 1, and 123 pairs had Ka/Ks < 1. A total of 382 homologous gene pairs were found in the Arabidopsis STARD gene family (Fig. 4). A total of 161 pairs had Ka/Ks > 1, and 221 pairs had Ka/Ks < 1. A total of 260 homologous gene pairs were found in the rice STARD gene family (Fig. 4). A total of 70 pairs had Ka/Ks > 1, one pair (OsSTARD7/OsSTARD1) had Ka/Ks = 1, and 189 pairs had Ka/Ks < 1. These results showed that the grape, Arabidopsis, and rice STARD gene families might be dominated by purifying selection during evolution.
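The classification rule stated above can be written as a small Python helper that assigns a selection mode to each gene pair and tallies the result. The function names, the tolerance used to treat a ratio as equal to 1, and the example Ka and Ks values are assumptions made for illustration; they are not values from Tables S5-6 to S5-8.

```python
from collections import Counter


def selection_mode(ka, ks, tol=1e-9):
    """Classify a gene pair by its Ka/Ks ratio."""
    if ks <= 0:
        raise ValueError("Ks must be positive to form the Ka/Ks ratio")
    ratio = ka / ks
    if abs(ratio - 1.0) <= tol:
        return "neutral"
    return "positive" if ratio > 1.0 else "purifying"


def tally_modes(pairs):
    """Count selection modes over an iterable of (Ka, Ks) tuples."""
    return Counter(selection_mode(ka, ks) for ka, ks in pairs)


if __name__ == "__main__":
    # Hypothetical (Ka, Ks) values for three gene pairs
    print(tally_modes([(0.2, 0.5), (0.6, 0.3), (0.4, 0.4)]))
```

Applied to the counts reported above, purifying selection accounts for 123 of 202 grape pairs, 221 of 382 Arabidopsis pairs, and 189 of 260 rice pairs, which is the basis for the conclusion that purifying selection dominates.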
The secondary and tertiary structure analyses showed that MLN64, PCTP, cholesterol-regulated START protein 4, and START protein 5 contained four α helices, of which two α helices (α2 and α3) formed an internal hydrophobic cavity that could hold a ligand molecule (Fig. S3). α4 was visible on the top of the hydrophobic channel, and the α helix at the C-terminus formed the lid. In addition, START protein 13 had two α helices (α1 and α2), and the C-terminal α2 helix served as the lid, thereby establishing an internal hydrophobic cavity. BFIT2, CERT, and START protein 3 contained six α helices. Further analysis found that START protein 5 contained only one 8-stranded antiparallel β-sheet, whereas MLN64, PCTP, BFIT2, CERT, START protein 3, cholesterol-regulated START protein 4, and START protein 13 contained a 9-stranded antiparallel β-sheet. The side view showed that the antiparallel β-strands β4, β5, and β6 at one end of the hydrophobic cavity formed a basket structure, whereas β1, β2, β3, β7, β8, and β9 formed the β-sheet on the other side. These results suggested that the VvSTARD proteins played a significant role in regulating plant lipid metabolism.
cis-Element and expression pattern analyses of VvSTARD genes
cis-Acting elements related to hormone and abiotic stress responses were predicted in the promoter regions of the VvSTARD genes. Nine types of hormone- and stress-related cis-acting regulatory elements were present in the promoters of the STARD genes in grapes (Fig. 5A and Table S8-1). Three stress-related cis-acting elements, including TC-rich repeats (defense and stress), MBS (drought), and low-temperature-responsive elements, were detected. Six hormone-related cis-acting elements, including the TGA element/AuxRR core (auxin), O2 site (zein metabolism), TCA element (salicylic acid), abscisic acid (ABA)-responsive element, GARE-motif/P-box/TATC-box (gibberellin), and CGTCA/TGACG motif (MeJA-responsive element), were identified. All VvSTARD genes contained cis-acting elements associated with abiotic stress or hormonal responses. Amongst the VvSTARD genes, 14 contained the ABA-responsive element, and 14 contained the drought-responsive element. In addition, the VvSTARD genes contained 14 auxin-, 10 zein metabolism-, 9 GA3-, 11 SA-, and 13 MeJA-responsive elements. These results suggested that the VvSTARD genes could be involved in hormone and abiotic stress responses to different environmental factors. The expression patterns and functions of the STARD gene family in plants were not clear. Therefore, the STARD gene expression data for organs/tissues and abiotic stress in grapes, rice, and Arabidopsis were downloaded from the BAR database. Tissue expression analysis indicated that the expression levels of the VvSTARD genes in different tissues at different developmental stages of grapes were uneven.
Analysis of the tissue expression of the VvSTARD gene family (Fig. 5B and Table S8-2) demonstrated that the tissue expression of the VvSTARD genes in the same group was similar, but the tissue expression sites differed because of evolutionary differences. VvSTARD4, VvSTARD5, VvSTARD6, and VvSTARD7 were members of group 1, which contained the HD-START domain. Interestingly, VvSTARD4, VvSTARD5, and VvSTARD6 were expressed in the leaves, seedlings, stems, flowers, buds, fruits, skin, seeds, stamens, petals, pericarp, and carpels. However, VvSTARD7 was only expressed in the leaves and seed-post fruits. VvSTARD1 and VvSTARD8, which were classified into group 2 and contained the HD-START domain, were expressed in the leaves, buds, flowers, pollen, and seeds. VvSTARD9, VvSTARD10, VvSTARD11, VvSTARD12, and VvSTARD13 belonged to group 3 and contained the HD-START-MEKHLA domain. VvSTARD10 and VvSTARD11 were not expressed in the pollen, seeds, flesh, rachis, pericarp, and other tissues and organs. VvSTARD9 and VvSTARD12 were detected in the tendrils, leaves, seedlings, stems, roots, flowers, buds, fruits, and carpels. Nevertheless, VvSTARD13 was expressed at extremely low levels or not expressed in many tissues. VvSTARD20, VvSTARD21, VvSTARD22, and VvSTARD23, which were classified into group 4 and contained the START domain only, were expressed at different developmental stages of each organ and tissue. VvSTARD14, VvSTARD15, VvSTARD16, VvSTARD17, VvSTARD18, and VvSTARD19 belonged to group 5. VvSTARD14, VvSTARD15, and VvSTARD18 were expressed in all tissues except seeds, petals, seedlings, and winter buds. VvSTARD16 was expressed at different developmental stages of each organ and tissue, and VvSTARD17 was downregulated or not expressed in many organs. VvSTARD19 was upregulated in the pollen, mid-ripening flesh, ripening flesh, flesh, pericarp, and skin. VvSTARD23 was also upregulated in the tendrils, young leaves, seedlings, stalks, flowers, carpels, stamens, petals, pollen, and the seed, flesh, skin, and pericarp at veraison.
The results of the analysis of the grape abiotic stress expression data (Fig. 5C and Table S8-3) showed that six genes (VvSTARD1, VvSTARD2, VvSTARD3, VvSTARD5, VvSTARD6, and VvSTARD8) belonging to groups 1 and 2 and five genes (VvSTARD9, VvSTARD10, VvSTARD11, VvSTARD12, and VvSTARD13) belonging to group 3 were related to salt stress. The expression profiles indicated that most VvSTARD genes were highly expressed at different times of NaCl, PEG, and low-temperature (5 °C) treatments. Genes belonging to groups 5 (VvSTARD15, VvSTARD16, VvSTARD19, and VvSTARD23), 4 (VvSTARD20 and VvSTARD22), and 3 (VvSTARD9, VvSTARD10, VvSTARD11, and VvSTARD13) were related to drought stress. VvSTARD genes related to low-temperature stress were distributed in different groups, and two genes were found in groups 1 and 2 (VvSTARD6 and VvSTARD8).
The expression patterns of the AtSTARD gene family in various tissues and organs demonstrated that the expression of genes in different subfamilies had similarities (Fig. S4A and Table S8-4). Most STARD genes in group 1, such as AtSTARD15, AtSTARD10, AtSTARD1, AtSTARD6, and AtSTARD9, were expressed in Arabidopsis seeds. Two STARD genes belonging to group 2 (AtSTARD5 and AtSTARD19) were also expressed in Arabidopsis seeds. Most STARD genes in group 3, such as AtSTARD17, AtSTARD18, AtSTARD19, AtSTARD20, and AtSTARD21, were not expressed in the Arabidopsis pollen but were normally expressed in other tissues and organs. Two STARD genes in group 5 (AtSTARD24 and AtSTARD25) were expressed in all Arabidopsis organs and tissues. AtSTARD22, which belonged to group 5, was expressed in all tissues and organs except seeds. AtSTARD26, which also belonged to group 5, was expressed only in the roots and stamens. Most of the STARD genes in group 4, such as AtSTARD28 and AtSTARD30, were not expressed in the Arabidopsis pollen, seed, shoot, and root but were normally expressed in other tissues. AtSTARD27 and AtSTARD30 were not expressed in the shoot, and AtSTARD27 was not expressed in the root. Only AtSTARD31 was expressed in various tissues and organs.
The results of abiotic stress expression analysis demonstrated that the AtSTARD genes clustered in the same group had similar resistance and different expression patterns (Fig. S4B and Table S8-5). In group 4, one gene (AtSTARD28) was highly expressed in the shoot and root under control, cold, salt, drought, wound, and heat stresses. In group 3, three genes (AtSTARD18, AtSTARD19, and AtSTARD21) were expressed at higher levels in the root than in the shoot under the control, cold, salt, drought, wound, and heat stresses. In addition, under the same conditions, two further genes showed a higher expression level in the root than in the shoot, one belonging to group 5 (AtSTARD25) and the other to group 4 (AtSTARD31). Conversely, under the control, cold, salt, drought, wound, and heat stresses, some genes showed a higher expression level in the shoot than in the root, and these genes were distributed in groups 1 (AtSTARD10 and AtSTARD12) and 4 (AtSTARD27, AtSTARD29, and AtSTARD30).
The expression patterns of the OsSTARD gene family in various tissues and organs showed that the expression of genes in different subfamilies had similarities (Fig. S4C and Table S8-6). Most of the STARD genes in groups 1 and 2, such as OsSTARD5, OsSTARD9, OsSTARD10, OsSTARD1, OsSTARD11, and OsSTARD6, were expressed in rice seeds, shoot apical meristem (SAM) and inflorescence. Some OsSTARD genes (OsSTARD15 and OsSTARD13) were placed in group 3 and expressed in SAM, inflorescence and seedling root. Furthermore, OsSTARD14 and OsSTARD12 were expressed in SAM and inflorescence. Group 4 only contained one gene, that is, OsSTARD21, which was expressed in mature leaves, inflorescence P2, and seeds S2-S5. Group 5 contained three OsSTARD genes, namely, OsSTARD18, OsSTARD19, and OsSTARD20. OsSTARD19 was highly expressed in inflorescence P6 and seed S5. OsSTARD20 was highly expressed in SAM and young inflorescence. OsSTARD18 was highly expressed in mature and young leaves.
The analysis of rice abiotic stress expression data demonstrated that 17 genes were expressed in the normal growing shoot and root and evenly distributed in five subgroups (Fig. S4D and Table S8-7). Nine genes belonged to groups 1 (OsSTARD5, OsSTARD10, OsSTARD4, and OsSTARD2), 3 (OsSTARD16, OsSTARD13, and OsSTARD14), and 5 (OsSTARD19 and OsSTARD18), and such genes were highly expressed in the root and shoot under salt stress and evenly distributed amongst four subgroups. Groups 2, 1, 3, and 5 with 1 (OsSTARD24), 1 (OsSTARD7), 1 (OsSTARD2), 2 (OsSTARD12 and OsSTARD15), and 1 (OsSTARD20) genes were expressed in the root and shoot under cold stress and evenly distributed in six subgroups.
qRT-PCR of the VvSTARD gene family
qRT-PCR was utilized to verify the predictions from the cis-acting elements and the expression profile data and to further characterize the physiological responses of the VvSTARD gene family. The results showed that most members of the VvSTARD gene family were expressed in grape leaves in response to hormones and abiotic stresses. The expression responses to the different hormones and abiotic stresses were more evident at 24 h than at 12 h (Fig. 6 and Table S8-8). A considerable degree of agreement was found with the predicted results. As in the chip expression profile, the VvSTARD gene family was expressed in grape leaves (Fig. 6), responded to the exogenous hormone treatments and presented a high expression level. The expression levels of MeJA, SA, IAA, and GA3 were the same as those of VvSTARD1-VvSTARD4, VvSTARD14-VvSTARD15, VvSTARD7-VvSTARD10, VvSTARD16-VvSTARD21, VvSTARD10, VvSTARD13, and VvSTARD23. After 24 h of 400 mmol l−1 NaCl treatment, 17 genes (VvSTARD1-VvSTARD15, VvSTARD17, and VvSTARD19) were upregulated, with expression levels hundreds or even tens of thousands of times higher than those of the control. The genes of this family could respond strongly to high-salt stress conditions. For instance, the expression levels of VvSTARD5 and VvSTARD8 were 880- and 675-fold higher, respectively, than those of the control after 24 h of salt stress treatment (400 mmol l−1).
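The fold changes reported above are expressed relative to an untreated control set to 1, with GAPDH as the internal reference gene. The text does not state which quantification formula was used; the following minimal sketch assumes the widely used 2^-ΔΔCt method. The function name and the Ct values are hypothetical and serve only to illustrate the arithmetic.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene versus the control by 2^-ddCt.

    Assumption: 2^-ddCt quantification with a single reference gene
    (for example GAPDH). The source does not confirm this formula.
    """
    # Normalize the target gene to the reference gene within each sample.
    delta_ct_treated = ct_target_treated - ct_ref_treated
    delta_ct_control = ct_target_control - ct_ref_control
    # Compare the treated sample with the control sample.
    delta_delta_ct = delta_ct_treated - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values, not taken from the study.
fold_change = relative_expression(
    ct_target_treated=20.0, ct_ref_treated=18.0,
    ct_target_control=25.0, ct_ref_control=18.0,
)
print(f"fold change vs control: {fold_change:.1f}")
```

Under this formula the control compared with itself always yields 1, which matches the normalization described for Fig. 6.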

Subcellular localization and identification of the heterologous expression of STARD5
A fusion protein of STARD5 and EGFP was introduced into Arabidopsis protoplasts to determine the subcellular localization of STARD5. Confocal microscopy revealed that the 35S:STARD5:EGFP fluorescence signal was localized to the nucleus (Figs. 7A-7B).
In high-salt environments, the most significant (p<0.01) physiological responses of plants were the inhibition of leaf growth and the reduction of organic matter accumulation. The cell membrane is an important part of plant cells and suffers salt damage in a high-salt environment. Relative conductivity reflects the severity of the cell membrane injury and the membrane permeability. Thus, the relative conductivity is often used to identify the salt tolerance of plants. In addition, the changes in the malondialdehyde (MDA) and proline contents are the main physiological indices used to assess plants under salt stress. The relative electrical conductivity and the MDA and proline contents of WT and transgenic tomatoes after salt stress were measured. Results showed that the relative electrical conductivity of transgenic tomatoes was significantly (p<0.01) lower than that of WT tomato, and the contents of MDA and proline in transgenic tomatoes were significantly (p<0.01) lower and significantly (p<0.01) higher, respectively, than those in WT tomato. These results showed that the heterologous overexpression of STARD5 could significantly enhance the salt tolerance of tomato plants.
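The text does not describe how relative conductivity was calculated. As a minimal sketch, the following assumes a common protocol in which the conductivity of the leaf bathing solution is read before and after boiling, and relative conductivity is the ratio of the two readings expressed as a percentage. The function name and the readings are hypothetical.

```python
def relative_conductivity(initial_ec, total_ec):
    """Relative electrical conductivity (%) of a leaf sample.

    Assumption: initial_ec is read after soaking the tissue and
    total_ec after boiling it, which releases all electrolytes.
    The source does not state the protocol that was used.
    """
    if total_ec <= 0:
        raise ValueError("total_ec must be positive")
    return 100.0 * initial_ec / total_ec


# Hypothetical readings in microsiemens per centimetre, not from the study.
wild_type = relative_conductivity(initial_ec=60.0, total_ec=120.0)
transgenic = relative_conductivity(initial_ec=30.0, total_ec=120.0)
print(f"WT: {wild_type:.1f}%  transgenic: {transgenic:.1f}%")
```

A lower percentage means fewer electrolytes leaked before boiling, which is read as less membrane damage.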
The relative electrical conductivity and malondialdehyde and proline contents in transgenic tomato leaves showed that the tomato plants with the heterologous overexpression of STARD5 displayed evident resistance to salt stress compared with the WT tomato after 24 h salt stress (Figs. 7F-7H and Table S8-9). These results suggested that the STARD gene might exercise certain functions in the nucleus to regulate the changes in plant hormones and improve plant salt tolerance. Previous studies have identified 35 and 25 STARD members in Arabidopsis and rice, respectively (Schrick et al. 2004). In the current study, 23 VvSTARD genes are identified in the grape genome database, fewer than in the AtSTARD and the OsSTARD families. The number of STARD genes does not correlate with the genome size of the plant species, which may partly result from tandem and segmental duplication events in grapes. On the basis of previous studies, the members of the Arabidopsis, rice and grape STARD gene families, namely AtSTARD1-AtSTARD35, OsSTARD1-OsSTARD25, and For example, the expression of VvSTARD15 is 20-fold higher than that of the control when the plant is exposed to low-temperature stress, whereas VvSTARD14 is not tolerant to low-temperature stress. Collinearity analysis of the VvSTARD gene family reveals four pairs of tandem repeat genes distributed in a common subfamily, probably because certain fragments of the gene have been copied, exchanged, inverted, and changed during evolution and other events (Shen et al. 2014; Li et al. 2017a). In addition, the collinearity analysis of grapes and Arabidopsis shows that 14 pairs of tandem repeat genes are distributed in the same subfamily, and only one pair of genes (VvSTARD12/AtSTARD33) does not belong to the same group: VvSTARD12 belongs to group 3, whereas AtSTARD33 belongs to group 2. Collinearity analysis of grapes and rice has revealed nine pairs of tandem repeat genes distributed in the same subgroup (Fig. 3B and Table S6). The Ka/Ks analysis suggests that the evolution of the grape, Arabidopsis and rice STARD gene families is primarily under purifying selection (Yang, 2007; Wang et al. 2018).
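The Ka/Ks conclusion above rests on the standard reading of the ratio: values below 1 point to purifying selection, values near 1 to neutral evolution, and values above 1 to positive selection. The sketch below shows that interpretation only; it does not estimate Ka or Ks from sequences. The function name, the tolerance parameter, and the gene-pair values are hypothetical.

```python
def classify_selection(ka, ks, tolerance=0.0):
    """Return (Ka/Ks ratio, selection label) for one gene pair.

    Assumption: Ka and Ks were already estimated by a dedicated tool.
    The tolerance parameter is a hypothetical band around 1 that is
    treated as neutral.
    """
    if ks <= 0:
        raise ValueError("Ks must be positive to form the ratio")
    ratio = ka / ks
    if ratio < 1.0 - tolerance:
        label = "purifying"
    elif ratio > 1.0 + tolerance:
        label = "positive"
    else:
        label = "neutral"
    return ratio, label


# Hypothetical Ka and Ks values, not taken from Table S6.
pairs = {"pair_A": (0.25, 0.5), "pair_B": (0.75, 0.5), "pair_C": (0.5, 0.5)}
for name, (ka, ks) in pairs.items():
    ratio, label = classify_selection(ka, ks)
    print(f"{name}: Ka/Ks = {ratio:.2f} ({label})")
```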

Discussion
Previous studies show that the proteins containing the START domain include the HD-Zip III, HD-Zip IV, PH-START and START subfamilies. However, studies on these proteins under abiotic stress are relatively few. Here, the members of the STARD gene family are analyzed through evolutionary and tertiary structure analyses. The analyses of transgenic Arabidopsis plants carrying the gene-specific promoter fused to the bacterial β-glucuronidase reporter gene have revealed that some of the promoters have high activities in the epidermal layer of SAM and developing shoot organs, whereas others are temporarily active during the development of the reproductive organ (Nakamura et al. 2006; Khosla et al. 2014). However, the main functions of STARD genes in plants remain unclear. The HD-Zip genes of subfamilies III and IV encode an additional conserved domain called the START domain (Ponting et al. 1999), which has a putative function in sterol binding (Schrick et al. 2004).
In this study, the members of groups 1-3 belong to the HD-Zip genes of subfamilies III and IV (Li et al. 2017b), according to the accession numbers of grapes, such as group 1 members HDZ8 (GSVIVT01035612001), HDZ19 (GSVIVT01012643001), HDZ20 (GSVIVT01030605001) and HDZ26 (GSVIVT01027508001); group 2 members HDZ6 (GSVIVT01013073001), HDZ10 (GSVIVT01035238001), HDZ16 (GSVIVT01017073001), and HDZ31 (GSVIVT01029396001); and group 3 members HDZ11 (GSVIVT01025193001), HDZ15 (GSVIVT010170701001), HDZ18 (GSVIVT01021625001), HDZ21 (GSVIVT01016272001) and HDZ29 (GSVIVT01010600001). This result suggests that the HD-Zip IV has a potential role in the defense environment, and HD-Zip IV is influenced by ethylene (Li et al. 2017b). The members of HD-Zip I and HD-Zip II are reported to be related to salt stress in Eucalyptus (Zhang et al. 2020). EgHD-Zip27 from the HD-Zip II subfamily and EgHD-Zip37 from the HD-Zip I subfamily play an essential role in coping with salt stress (Zhang et al. 2020), but the responses of HD-Zip III and HD-Zip IV members to salt stress are not mentioned.
In the present study, VvSTARD5 (HDZ20) from the HD-Zip IV subfamily plays significant roles in salt stress. In addition, a previous study has described the functional characterization of the PH-START protein AtAPO1 (A. thaliana APOSTART1), indicating that AtAPO1 is involved in the control of seed germination (Resentini et al. 2014), whereas plants withstand drought and low-temperature conditions. However, in the present study, the expression of the PH-START proteins VvSTARD14 and VvSTARD15 is upregulated when exposed to salt and cold conditions, and HD-START proteins can also exhibit high expression levels under high-salt stress conditions. For instance, the HD-Zip IV subfamily member VvSTARD5 has a high expression level under salt stress (Chen et al. 2014). Moreover, members with only one START domain have low or even no expression under high-salt stress conditions (Fig. 6). The relative electrolyte leakage serves as an indicator for the damage caused by salt stress (Cao et al. 2007), and the proline and the malondialdehyde contents can change under salt stress in plants (Fedina et al. 2002). Therefore, the relative electrolyte leakage and the proline and malondialdehyde contents are determined from transgenic tomato leaves, indicating that salinity causes little cell membrane damage in the leaves of STARD5 plants and corroborating STARD5's role in the positive regulation of salinity responses. The data from the present study strongly indicate the important functions of VvSTARD genes in response to salt stress.

Conclusion
In this study, 23 STARD genes are identified in grapes. These genes are divided into five subgroups and distributed broadly on 12 chromosomes of the grape genome. Dramatic differences in the function of this family of genes are predicted amongst different species through phylogenetic analysis, tandem repeat gene analysis and the expression data prediction with Arabidopsis and rice STARD genes. The qRT-PCR of the grape STARD gene family indicates that most genes show a high expression level in response to 24 h salt stress. The results of STARD5 subcellular localization are verified, and the relative expression of STARD5 is detected. These findings provide insight into the potential function of VvSTARD genes. Therefore, comprehensive analysis is important to screen STARD genes for further functional identification and the genetic improvement of agronomic traits of grapes.

Phylogenetic relationships, gene structure and architecture of conserved protein motifs in VvSTARD proteins. A Phylogenetic analysis of STARD proteins in Arabidopsis (At), rice (Os), and grapes (Vv). represents grapes; represents rice; represents Arabidopsis. Genes in the same group are displayed with the same background color. B Exon-intron structure of VvSTARD genes. Blue boxes indicate untranslated 5′- and 3′-regions; pink boxes indicate exons; green lines indicate introns. C Motif composition of VvSTARD proteins. The motifs, numbered 1-20, are displayed in different colored boxes.

Figure 2
Synonymous codon preference and correlation analysis of VvSTARD, AtSTARD and OsSTARD genes. Correlation analysis using the Pearson method. Blue represents positive correlation; red represents negative correlation, and white represents no correlation. Larger circles and darker colors indicate stronger correlations; smaller circles and lighter colors indicate weaker correlations.

Ka/Ks analysis of STARD genes among grapes, Arabidopsis, and rice.

BerrySkin-PHWIII, berry skin post-harvest withering III (3rd month); Stem-G, green stem; Stem-W, woody stem; Tendril-Y, young tendril (pool of tendrils from shoot of 7 leaves); Tendril-WD, well developed tendril (pool of tendrils from shoot of 12 leaves); Tendril-FS, mature tendril (pool of tendrils at fruit set).

Figure 6
Expression levels of VvSTARD in grape leaves after 12 h and 24 h under different treatments: 0.2 mmol l−1 ABA, 150 μmol l−1 MeJA, 50 mg l−1 SA, 100 μmol l−1 IAA, 50 mg l−1 gibberellin 3 (GA3), 10% PEG6000, 400 mmol l−1 NaCl, 4 °C low temperature, and control. The red axis on the left represents 12 h treatment, and the blue axis on the right represents 24 h treatment. Specimens are analyzed through real-time PCR. GAPDH (CB973647) is used as an internal reference gene. Gene expression is normalized to the control unstressed expression level, which is assigned a value of 1. Values represent the average of three independent experiments ± SD. Standard errors are shown as bars above the columns. a, b, c, d, e, and f denote a significant difference at the level of p < 0.05.
