Genome-wide analysis and expression profiles of the StR2R3-MYB transcription factor superfamily in Solanum tuberosum


Background: MYB transcription factors comprise one of the largest families in the plant kingdom and play a variety of roles in plant developmental processes and defence responses. However, the family has not been systematically studied in potato (Solanum tuberosum), the most important non-cereal crop worldwide.
Results: In the present study, a total of 108 StR2R3-MYB transcription factors were identified and phylogenetically classified into 28 subfamilies, as supported by highly conserved gene structures and motifs. Collinearity analysis showed that segmental duplication events played a crucial role in the expansion of the StR2R3-MYB gene family. Synteny analysis indicated that 37 and 13 StR2R3-MYB genes were orthologous to Arabidopsis and wheat genes, respectively, and that these gene pairs have evolved under strong purifying selection. RNA-seq data from different tissues and abiotic stresses revealed tissue-preferential and abiotic stress-responsive StR2R3-MYB genes. We further analyzed StR2R3-MYB genes that might be involved in anthocyanin biosynthesis and the drought stress response, using RNA-seq data from pigmented tetraploid potato cultivars and from drought-sensitive and drought-tolerant tetraploid potato cultivars under drought stress, respectively. Moreover, EAR motifs were found in 21 StR2R3-MYB proteins, and 446 pairs of proteins were predicted to interact with these 21 EAR motif-containing StR2R3-MYB proteins in an interaction network constructed at medium confidence (0.4). Additionally, Gene Ontology (GO) analysis of the 21 EAR motif-containing StR2R3-MYB proteins was performed to further investigate their functions.
Conclusions: In this work, we systematically identified StR2R3-MYB genes by analyzing the potato genome sequence with a set of bioinformatics approaches. Genome-wide comparative analysis of StR2R3-MYB genes and their expression analysis identified members of this superfamily that may be involved in tissue-specific development, anthocyanin biosynthesis and abiotic stress responses.

Background
Transcription factors are important regulators of gene expression. Through chromatin organization, DNA methylation, dimerization and sequence-specific DNA binding, they activate or suppress target genes, and thereby control many aspects of plant responses to abiotic and biotic stresses and modulate developmental and metabolic processes [1,2]. Transcription factors can be classified into many different families on the basis of the DNA-binding domain that recognizes the regulatory regions of downstream target genes [3].
The MYB transcription factors (TFs) comprise one of the largest TF families in the plant kingdom and are distinguished by the highly conserved MYB domain [4,5]. The MYB domain consists of 1-4 imperfect tandem repeats at the N-terminus, and MYB family members are divided into four subfamilies depending on the number of repeats: 1R-MYB, R2R3-MYB, 3R (R1R2R3)-MYB and 4R-MYB, with one, two, three and four MYB repeats, respectively [5,6]. The 1R-MYBs are also referred to as MYB-related proteins [7]. Generally, the R2R3-MYB members are the predominant form found in higher plants [2]. Each MYB repeat is approximately 50-53 amino acid residues in length and contains three regularly spaced tryptophan (or aliphatic) residues that together form a helix-turn-helix (HTH) fold [8,9].
The second and third α-helices of each repeat interact with the major groove of DNA at the specific recognition site C/TAACG/TG during transcription [10,11]. In contrast, the C-terminal modulator region, which is responsible for the regulatory activity of the MYB proteins, is highly divergent [12].
The MYB TF family is present in all eukaryotes. The first gene described as encoding a MYB domain-containing protein was the oncogene v-MYB, identified in avian myeloblastosis virus (AMV) [13]. Subsequently, the human c-MYB proto-oncoprotein and two vertebrate MYB TFs, A-MYB and B-MYB, were found and shown to take part in the regulation of cell differentiation, proliferation and apoptosis [14,15]. In plants, Zea mays C1, a homolog of and cold temperatures, and response to drought stress [39-41], but also have potential health benefits such as protection against some cancers and neuronal and cardiovascular diseases in humans [42].
Potato is very sensitive to drought, with yield reduction beginning under a moderate deficit of soil moisture [43,44]. Genetic and molecular biological approaches have been used to overcome potato yield loss under drought stress [36,45-48]; however, little is still known about the mechanisms of drought stress tolerance in potato plants.
Although R2R3-MYB TFs play important roles in a wide range of plant biological processes and have been studied in numerous plant species, there are only very limited reports on the functional characterization of R2R3-MYB TFs in potato [32,33,36,46,49,50]. Thus, it is meaningful to uncover the function, evolution and expression profiles of R2R3-MYB TFs in potato based on the published sequence data [51].
In the current study, we identified an expanded StR2R3-MYB family with 108 members and comprehensively analyzed their phylogenetic relationships, sequence features, gene duplications, chromosome distribution and conserved motifs. In addition, we performed a comprehensive expression analysis of StR2R3-MYB genes in different tissues and under abiotic stresses in doubled monoploid (DM) potato, as well as in white and pigmented potato cultivars and in drought-sensitive and drought-tolerant cultivars of tetraploid potato, using RNA-seq data to identify StR2R3-MYB genes closely associated with spatial distribution, anthocyanin biosynthesis and multiple stresses. Moreover, 21 StR2R3-MYB genes containing an EAR motif were obtained, and an interaction network between these EAR motif-containing R2R3-MYBs and other potato genes was constructed. Additionally, Gene Ontology (GO) analysis of the 21 EAR motif-containing StR2R3-MYB proteins was performed to investigate their functions. The findings should inform the characterization of the StR2R3-MYB superfamily and provide valuable information for further functional elucidation of these genes in potato.

Sequence analysis and structural characterization of StR2R3-MYB genes
The exon-intron organization of the StR2R3-MYB genes, including intron distribution patterns, phases and intron-exon boundaries, was graphically displayed with the Gene Structure Display Server (GSDS 2.0, http://gsds.cbi.pku.edu.cn/) [56] using the CDS and genome sequences of the StR2R3-MYB genes. The conserved motifs of StR2R3-MYB TFs were predicted with the MEME Suite web server (http://meme-suite.org/tools/meme) [57], with the following parameters: the maximum number of motifs was set to 20 and the optimum motif width was set from six to 100 amino acids.
Analyses of chromosome distribution, gene duplication and synteny for StR2R3-MYB genes
The chromosome distribution information of the StR2R3-MYB genes was obtained from the potato genome database downloaded from PGSC. MapChart software [58] was used for graphical presentation of the chromosomal locations of the StR2R3-MYB genes. Tandem and segmental duplications between potato genes, as well as the synteny blocks of orthologous R2R3-MYB genes between potato and Arabidopsis and between potato and wheat, were obtained with the Multiple Collinearity Scan toolkit (MCScanX) (http://chibba.pgml.uga.edu/mcscan2/) [59] and visualized using Circos v0.69 [60]. To further estimate duplication events of StMYB genes, the synonymous (Ks) and nonsynonymous (Ka) substitution rates were calculated using KaKs_Calculator 2.0 (https://sourceforge.net/projects/kakscalculator2/) [61].

Phylogenetic analysis and classification of potato StR2R3-MYB proteins
The full-length amino acid sequences of StR2R3-MYB and AtR2R3-MYB proteins, derived from the PGSC and Ensembl Plants (http://plants.ensembl.org/index.html) databases, were used for phylogenetic analysis. Multiple sequence alignments of these R2R3-MYB proteins were performed using Clustal X with default parameters. An unrooted neighbor-joining (NJ) phylogenetic tree was constructed using MEGA 7.0 [62] with the following parameters: Poisson model, pairwise deletion and 1000 bootstrap replications. The potato R2R3-MYB proteins were classified into different groups according to the topology of the phylogenetic tree.

Plant materials and treatments
A purple potato cultivar 'Heimeiren' (HM, purple skin and purple flesh), a white potato cultivar 'Xindaping' (XD, white skin and white flesh) and a red potato cultivar 'Lingtianhongmei' (LT, red skin and red flesh) were grown in a greenhouse at Gansu Agricultural University, China. HM and XD are local cultivars in Gansu province; LT was cultivated by the Potato Research Center of Inner Mongolia Academy of Agricultural and Animal Husbandry Sciences and Inner Mongolia Lingtian Biotechnology Co., Ltd., China. Six fresh tubers (diameter 4-5 cm) were harvested and cleaned with sterilized water. Skin tissue was carefully separated from cortex tissue using a scalpel to minimize flesh contamination, and flesh tissue was isolated at a distance of at least 5 mm from the skin to eliminate skin contamination. The skin and flesh samples were then immediately frozen in liquid nitrogen and stored in a -80 °C freezer for later use.
For drought stress, a drought-sensitive cultivar 'Atlantic' (A) and a drought-tolerant cultivar 'Qingshu No.9' (Q) were field-grown under a rainproof shed at Dingxi Academy of  cDNA synthesis were carried out using oligo dT according to the manufacturer's instructions (SuperScript III, Invitrogen, USA). qRT-PCR was conducted on a CFX96 Touch™ Real-Time PCR Detection System (Bio-Rad, CA, USA) using SYBR® Premix Ex Taq™ II (Takara Bio, Inc., Japan). qPCR conditions were as follows: 30 s at 95 °C, followed by 40 cycles of 5 s at 95 °C and 30 s at 60 °C, followed by melting curve detection from 65 to 95 °C. The qPCR efficiency of each gene was obtained by analysing the standard curve of a cDNA serial dilution. StEF-1α (AB061263) [63] was used for template normalization. Relative abundance was calculated with the comparative Ct (2^-ΔΔCt) method. The primers are listed in Additional files 1: Table S1.
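The comparative Ct calculation mentioned above can be sketched as follows. This is a minimal illustration, not the authors' script: the function name and the Ct values in the usage note are hypothetical, and the formula assumes an amplification efficiency close to 100% for both the target gene and the reference gene (StEF-1α in this study).

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene by the comparative Ct (2^-ddCt) method.

    Assumes roughly 100% PCR efficiency for target and reference genes.
    """
    # Normalize the target gene to the reference gene in each condition.
    delta_ct_sample = ct_target_sample - ct_ref_sample
    delta_ct_control = ct_target_control - ct_ref_control
    # Compare the treated sample with the control (calibrator).
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2.0 ** (-delta_delta_ct)
```

For example, with hypothetical Ct values of 22 (target) and 18 (reference) in the sample and 24 and 18 in the control, ΔΔCt is -2 and the relative abundance is 4-fold.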

RNA-Seq data analysis
Total RNA from the aforementioned samples, with three biological replicates, was used for RNA-seq library construction. Next-generation Illumina sequencing was performed by Biomarker Technologies Corporation (Beijing, China). High-quality clean reads were obtained by trimming the raw reads and filtering out contaminants, adapters, bases with Phred scores below 20 and uncertain bases (N). The cleaned data were aligned to the PGSC_DM_v3.4 gene models downloaded from PGSC using Bowtie2 (v2.2.9). The mapped clean reads were counted (by fragment) and subjected to differential expression analysis using the edgeR package (http://www.r-project.org/). Genes with the absolute value of log2FC (fold change) not less than 1 and a false discovery rate (FDR) <0.05 were considered as The Illumina RNA-seq data were also downloaded from the PGSC to study the expression patterns of MYB genes in various tissues and stress treatments. TBtools software [64] was used to generate the heatmap.
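The thresholds stated above (|log2FC| not less than 1 and FDR < 0.05) can be applied to a table of edgeR-style results as in the sketch below. The sentence defining the selected class is truncated in the text, so treating the selected genes as differentially expressed is an assumption here; the function names and the gene labels in the usage note are hypothetical.

```python
def passes_thresholds(log2_fc, fdr, min_abs_log2_fc=1.0, max_fdr=0.05):
    """True if a gene meets both cutoffs: |log2FC| >= 1 and FDR < 0.05."""
    return abs(log2_fc) >= min_abs_log2_fc and fdr < max_fdr


def select_genes(results):
    """Return the gene IDs that pass the cutoffs.

    `results` maps a gene ID to a (log2FC, FDR) tuple, for example values
    exported from a differential expression table.
    """
    return [gene for gene, (log2_fc, fdr) in results.items()
            if passes_thresholds(log2_fc, fdr)]
```

With a hypothetical input such as `{"geneA": (1.5, 0.01), "geneB": (0.8, 0.001), "geneC": (-2.0, 0.2)}`, only `geneA` is kept: `geneB` fails the fold-change cutoff and `geneC` fails the FDR cutoff.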

Identification of EAR motif-containing StR2R3-MYB proteins
The StR2R3-MYB proteins were manually searched to identify candidates containing the motifs DLNxxP or LxLxL [65]. Protein interaction networks were constructed using the online STRING v11.0 software (Search Tool for the Retrieval of Interacting Genes/Proteins, http://string-db.org/) [66] with a combined score cutoff of 400 (medium confidence). The interaction network was visualized with Cytoscape v3.7.1 [67]. GO enrichment analysis was conducted using the topGO package in R.
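The motif search was performed manually in this study; a scripted equivalent might look like the sketch below, which is an illustration rather than the method actually used. It assumes that "x" stands for any single residue, uses lookahead regular expressions so that overlapping LxLxL sites are all reported, and returns 1-based positions. The function name and the test sequences in the usage note are hypothetical.

```python
import re

# "x" in the consensus is treated as any single residue (an assumption).
# The lookahead form reports overlapping matches as well.
EAR_PATTERNS = {
    "LxLxL": re.compile(r"(?=(L.L.L))"),
    "DLNxxP": re.compile(r"(?=(DLN..P))"),
}


def find_ear_motifs(protein_seq):
    """Return (motif_type, 1-based start, matched residues) for each site."""
    seq = protein_seq.upper()
    hits = []
    for motif_type, pattern in EAR_PATTERNS.items():
        for match in pattern.finditer(seq):
            hits.append((motif_type, match.start() + 1, match.group(1)))
    # Order the sites by their position in the protein.
    return sorted(hits, key=lambda hit: hit[1])
```

For the hypothetical peptide `MKLDLNLPAA`, the function reports one LxLxL site (`LDLNL`, starting at residue 3). Because the LxLxL pattern is short and permissive, hits from such a scan would still need manual inspection.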

Availability of supporting data
The raw transcriptome data used in this study were submitted to the Sequence Read Archive (SRA) at NCBI under Project IDs PRJNA528685 and PRJNA529980, and the expression data are also available from the Potato Genome Sequencing Consortium (PGSC, http://solanaceae.plantbiology.msu.edu/pgsc_download.shtml). The accession numbers and the website listed above are publicly available. The databases used in this study are publicly accessible and no special permissions were required.

Structure of StR2R3-MYB genes and conserved motifs
Since analysis of gene structural diversity can help in understanding gene function and evolution, the structural diversity of the StR2R3-MYB genes was also investigated (Fig. 2B). Three members in A1 contained more exons (9-11) than members of other subfamilies, whereas four StR2R3-MYB genes (PG0007994, PG0024983, PG0003316 and PG0034577, all belonging to SG22) have only one exon. Furthermore, exon/intron structures were highly conserved within the same subfamily of StR2R3-MYB genes, suggesting that these conserved features play crucial roles in group-specific functions (Fig. 2B).
We used the online program MEME to search for conserved motifs shared by these StR2R3-MYB proteins to further study the diversification of the StR2R3-MYB genes in potato. In total, 20 conserved motifs were identified and designated motifs 1 to 20 (Table S4).
Typical MYB proteins contain three regularly spaced tryptophan (Trp., W) residues, which play significant roles in the interaction with specific DNA sequences [10]. For 106 of the 108 potato R2R3-MYB proteins, the R2 repeat contained three Trp residues (Fig. 3), whereas two R2R3-MYB TFs (PG0026758 and PG0002828) contained only the last two Trp residues, with the first Trp replaced by phenylalanine (Phe., F) (Fig. 3 and Additional files 4: Table S4).
In the R2 repeat, insertion of a glycine (Gly., G) residue between Gly-24 and asparagine (Asn., N)-25 was observed in five R2R3-MYB genes. At the same position, insertion of an aspartic acid (Asp., D) residue was observed in PG0002828 and PG0026758, and insertions of an Asn residue and a threonine (Thr., T) residue were observed in PG0018750 and PG0035400 (Additional files 4: Table S4). In addition, insertion of a leucine (Leu., L) residue between the second and third helices of the R2 repeat was observed in 90 StR2R3-MYB proteins; this insertion was an important step in the origin of plant-specific R2R3-MYB proteins [68]. A proline (Pro., P) residue was inserted at the same position in PG0015087, and a 22-residue insertion (SFLFLLDLYSQSEFRARLIVWL) was observed between the second and third helices of the R2 repeat in PG0001325.
In the R3 repeat, the first Trp residue was generally replaced by Phe or isoleucine (Ile., I) (Fig. 3). The second Trp residue was conserved in all members, and the third Trp residue was conserved in most members, except for five and four R2R3-MYB members in which it was replaced by Phe and tyrosine (Tyr., Y), respectively. Only one insertion, a valine (Val., V) residue between Gly-4 and Lys-5, was observed, in PG0025720 (Additional files 4: Table S4).
In addition to the highly conserved Trp residues, glutamic acid (Glu., E)-10 and Asp-11 in the R2 repeat, and Glu-10, Gly-22, arginine (Arg., R)-35, Asn-38, lysine (Lys., K)-41 and Asn-42 in the R3 repeat, were completely conserved. Leu-50 in the linker region was conserved in 107 StR2R3-MYB TFs; the exception was PG0034577, in which Leu was replaced by Arg. These results showed that the R3 repeat was more conserved than the R2 repeat. Furthermore, the third helix of each repeat in the MYB domain was the most conserved, and the first part of the HTH domain in each repeat was less conserved among the 108 StR2R3-MYB proteins.
Chromosomal location and gene duplication of potato StR2R3-MYB genes
Chromosomal location analysis revealed that potato R2R3-MYB genes were distributed across all 12 chromosomes, but the distribution appeared to be uneven (Fig. 4). A total of 105 R2R3-MYB genes were mapped on the 12 chromosomes, whereas three StR2R3-MYBs (Fig. 4). The pattern is similar to the chromosomal location of SlMYB genes [18].
Gene duplication has long been recognized as the origin of multigene families and has proved to be a prominent feature of plant genome evolution. To investigate gene duplication events in potato, tandem and segmental duplications were identified using BLASTP and MCScanX. Among the StR2R3-MYB genes, 6 pairs of tandemly duplicated genes were identified, in one of which PG0017525 was tandemly duplicated with a MYB-related gene, PG0017526 (Fig. 4 and Additional files 5: Table S5). Meanwhile, 29 (26.9%) segmental duplication pairs were found between StR2R3-MYB genes (Fig. 5 and Additional files 5: Table S5), with two exceptions: PG0027157 and PG0027575 were duplicated with the MYB-related genes PG0004610 and PG0008340, respectively. Most segmentally duplicated MYBs are located in collinear regions on chr2, 3, 5 and 6 (Fig. 5), and most of these segmentally duplicated genes belonged to S1, 2, 9, 11, 18 and 20 (Additional files 5: Table S5), suggesting that the rapid expansion of R2R3-MYB genes in these function-specific groups in potato might be strongly linked to adaptation strategies in response to challenging environments.
To further explore the potential evolutionary processes of the StR2R3-MYB gene family, two comparative synteny maps were constructed between potato and Arabidopsis thaliana (a dicotyledon) and between potato and Triticum aestivum (wheat, a monocotyledon) (Fig. 6A and 6B). The results identified 37 orthologs between potato and Arabidopsis and 13 orthologs between potato and wheat (Additional files 6: Table S6).
The Ka/Ks ratio is an effective index of the selection pressure acting after duplication: Ka/Ks < 1 indicates purifying selection, Ka/Ks = 1 indicates neutral selection, and Ka/Ks > 1 indicates positive selection [69]. Thus, the Ka, Ks and Ka/Ks of each gene pair were calculated. All the segmentally and tandemly duplicated MYB gene pairs had Ka/Ks values of less than 1, except one pair (PG0005918 and PG0022689) with Ka/Ks = 1.01, implying that most of these genes had evolved under purifying selection (Additional files 5: Table S5 and Additional files 6: Table S6). The average Ka/Ks value of tandemly duplicated genes (0.31) was slightly lower than that of segmentally duplicated genes (0.36) (Fig. 6E), and the Ka/Ks values of orthologous gene pairs between potato and Arabidopsis and between potato and wheat were 0.3315 and 0.3722, respectively (Additional files 6: Table S6).
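The interpretation rule for Ka/Ks can be written as a small helper. This is a sketch for illustration only: the function name is hypothetical, the Ka and Ks values in the usage note are made up to reproduce ratios quoted in the text, and the tolerance used to call a ratio "neutral" is an assumption (in practice, a ratio near 1 would be judged with a statistical test rather than a fixed tolerance).

```python
def classify_selection(ka, ks, tol=1e-9):
    """Return (Ka/Ks ratio, selection type) for one gene pair.

    Ka/Ks < 1: purifying selection; = 1: neutral; > 1: positive selection.
    """
    if ks <= 0:
        raise ValueError("Ks must be positive to compute Ka/Ks")
    ratio = ka / ks
    if abs(ratio - 1.0) <= tol:
        label = "neutral"
    elif ratio < 1.0:
        label = "purifying"
    else:
        label = "positive"
    return ratio, label
```

For example, illustrative values Ka = 0.31 and Ks = 1.0 give a ratio of 0.31 (purifying selection), whereas Ka = 1.01 and Ks = 1.0 give 1.01, which this rule labels as positive selection, matching the single exceptional pair reported above.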

Expression profiles of StR2R3-MYB genes in different tissues
To understand the tissue-specific expression patterns of the StR2R3-MYB genes, transcript abundance in 13 different tissues (leaves, roots, shoots, callus, tubers, sepals, stamens, stolons, flowers, petioles, petals, carpels and fruit) of DM potato was analyzed using transcriptome data downloaded from the PGSC. The results showed that 7.4% of StR2R3-MYB genes (8/108) were highly expressed in all tissues (Additional files 7: AtMYB15 (S2), AtMYB38 (S14), AtMYB36 (S14), AtMYB20 and AtMYB37 (S14), were only highly expressed in white skin. PG0013965 showed higher expression in pigmented skin than in white skin, although its FPKM was less than 5, whereas the expression of PG0013966 was higher in white skin, with an FPKM = 5, and PG0019217 showed no expression (Fig. 7A and Additional files 9: Table S8).
In tuber flesh, 38 StR2R3-MYB genes were not expressed and the expression of 39 StR2R3-MYBs was lower than 1. PG0013965 was highly expressed in purple and red flesh, while PG0013966 and PG0019217 showed lower expression in tuber flesh. In addition to PG0013965, six genes, which are homologous to AtMYB48, AtMYB3 (S4), AtMYB36 (S14) and AtMYB79, showed higher expression in pigmented flesh, while two genes, homologous to AtMYB27 and AtMYB15 (S2), showed higher expression in white flesh (Fig. 7B and Additional files 9: Table S8).
Furthermore, an interaction network of the StR2R3-MYB proteins with an FPKM value greater than 1 in skin and flesh was built using STRING. The results showed that eight StR2R3-MYB TFs can directly interact with StAN1 with a combined score > 400: two MYBs (PG0007325 and PG0018113) were homologous to AtMYB67, two MYBs (PG0024822 and PG0018427) belonged to S21, two MYBs (PG0003316 and PG0024983) belonged to S22, one MYB (PG0027190) belonged to S23 and one MYB (PG0011243) was homologous to AtMYB72. Interestingly, the two S22 MYBs (PG0003316 and PG0024983) also interacted with StMYBA1 (Fig. 7C), suggesting that these two MYBs might play important roles in anthocyanin biosynthesis. Of these MYB TFs, PG0007325 showed higher expression in pigmented skin; PG0018113 showed higher expression in purple flesh and pigmented skin; PG0024822 and PG0018427 in S21 were more highly expressed in white skin; PG0003316 and PG0024983 in S22 were highly expressed in all tissues; PG0027190 was up-regulated in pigmented flesh; and PG0011243 was more highly expressed in pigmented skin.
To further investigate which StR2R3-MYB genes might respond to drought stress, one tetraploid drought-sensitive cultivar, "Atlantic" (A), and one tetraploid drought-tolerant cultivar, "Qingshu No.9" (Q), were subjected to drought stress. The RNA-seq data showed that 42 StR2R3-MYB genes were up- or down-regulated in response to drought stress, of which 15 genes were highly expressed in Q with an FPKM > 5 and |log2(FC)| > 1 at the flower-falling stage (Fig. 8 and Additional files 12: Table S10). The expression of these genes differed little between A and Q at the early flowering stage; they were then up-regulated in Q at the full-blooming stage and highly up-regulated in Q at the flower-falling stage under drought stress, while four genes were down-regulated in Q from the early flowering stage until the flower-falling stage. These genes might be involved in the drought stress response and are worth further investigation.
We also analyzed the expression patterns of 12 selected StR2R3-MYB genes from different subfamilies by quantitative real-time PCR (qPCR) to confirm the reliability of the RNA-seq data; these genes showed relatively high expression in the RNA-seq data related to anthocyanin biosynthesis and drought stress. The expression patterns determined by qPCR and FPKM were consistent. As shown in Figure 9, PG0013965 (StAN1) was up-regulated in pigmented tissues, with lower expression in pigmented skin than in pigmented flesh, while it was not expressed in A and Q under drought stress. PG0017223 was highly expressed exclusively in white skin, whereas PG0015536 and PG0030548 were up-regulated only in pigmented flesh. PG0009033 was more highly expressed in Q during the three stages, and PG1026177, PG2026177, PG0024983 and PG0013405 were highly expressed in Q at the flower-falling stage, whereas two genes, PG0015536 and PG0030548, showed the opposite expression pattern, with higher expression in A at the flower-falling stage. Although the relative expression of the selected genes varied between the RNA-seq dataset and the qPCR analysis, a high correlation (R² = 0.8394), described by the simple linear regression equation y = 0.8762x + 0.0332, suggests good consistency between the two methods.

Identification of EAR motif-containing StR2R3-MYB proteins in potato
The ethylene-responsive element binding factor-associated amphiphilic repression (EAR) motif, defined by the consensus sequence patterns LxLxL or DLNxxP, is the most dominant transcriptional repression motif identified in plants. In our study, 20 members of the StR2R3-MYB family were found to contain at least one LxLxL-type EAR motif, and one StR2R3-MYB protein contained DLNxxP (Table 1). These StR2R3-MYBs belonged to S2, S4, S9, S18, S20 and S22, and two StR2R3-MYBs were homologous to AtMYB41 and AtMYB48; PG0020071 (AtMYB15-like) has two LxLxL motifs. The 22 EAR motif sites were mostly found in the C-terminal region (10 out of 22) and the N-terminal region (9 out of 22), and at lower frequency in the middle region (3 out of 22) (Table 1).
The core EAR motif sites, comprising nine amino acids, were analyzed with the MEME website. Among the 21 LxLxLx motifs, positions 1, 2, 4 and 9 were most frequently occupied by Gln, Leu, Ile and His residues, respectively; Glu and Ser were more abundant at position 6, and His and Pro at position 8 (Fig. 10A). Subsequently, the interactions of EAR motif-containing StR2R3-MYB TFs with other potato proteins were examined using STRING with a combined score > 400 (Fig. 10B and Additional files 13: Table S11). The results showed that the 21 EAR motif-containing StR2R3-MYB proteins were involved in at least nine interaction possibilities, and that StR2R3-MYB proteins belonging to the same subfamily appeared to have similar functions by regulating common target genes (Fig. 10).
Furthermore, GO assignments were used to predict the functions of the EAR motif-containing StR2R3-MYB proteins by classifying them into three independent ontologies: biological process (BP), molecular function (MF) and cellular component (CC). As shown in Additional files 14: Table S12, for biological process (11 out of 21, 52.4%), the functions of these proteins included biological regulation (8), cellular process (8), metabolic process (8), response to stimulus (11) and developmental process (5), among others; for molecular function (21 out of 21, 100%), nucleic acid binding (21) was the most represented GO term, followed by calmodulin binding (1); for cellular component (6 out of 21, 28.6%), the major categories were cell (6) and cell part (6).

Discussion
The MYB family is one of the largest transcription factor families, and its members are involved in various plant physiological and biochemical processes. R2R3-MYBs are the predominant form found in higher plants; they play important roles in primary and secondary metabolism, developmental processes and responses to biotic and abiotic stresses [5]. In the present work, we performed a genome-wide investigation of the  Fig. 1 and Fig. 2). For example, A17 consists of PG0009033, PG0004371 and PG0007304, which clustered with AtMYB11, AtMYB12 and AtMYB111 (S7), which are implicated in the control of flavonoid accumulation [70-72], and A16 comprises four members clustered with AtMYB123/TT2 (S5), which controls the biosynthesis of proanthocyanidins (PAs) in the seed coat of Arabidopsis [73].

Duplication contributed to the StR2R3-MYB gene expansion
Gene duplication contributes significantly to the expansion of MYB genes in the plant kingdom, leading to the diversification and evolution of genes. Our results showed that 6 (5.6%) and 29 (26.9%) StR2R3-MYB genes were identified as tandem and segmental duplicates, respectively, indicating that segmental duplication was a major cause of the expansion of StR2R3-MYB genes.
Most StR2R3-MYB genes were tandemly and segmentally duplicated within one cluster to expand their subfamilies, and duplication events mostly occurred in genes belonging to S1, S2, S18 and S20 (Additional files 5: Table S5) The MYB TFs play essential roles in the regulation of gene expression to cope with environmental stresses [78,79]. In total, 67 StR2R3-MYBs were differentially expressed under salt, mannitol and heat stresses, indicating that they are major factors involved in crosstalk among different signal transduction pathways in response to abiotic stresses. However, several StR2R3-MYB genes appeared to respond to only one stress stimulus, suggesting that different signaling pathways are involved in the response to each abiotic stress treatment. In addition, some genes showed opposite expression profiles under different stresses, implying that the signal transduction pathways responding to abiotic stresses are complicated. We further analyzed StR2R3-MYB TFs that might be involved in the drought stress response based on our RNA-seq data and compared them with the mannitol treatment data downloaded from PGSC. The results showed that 15 genes were highly expressed in the drought-tolerant cultivar Q with an FPKM > 5 and |log2(FC)| > 1 at the flower-falling stage (Fig. 8 and Additional files 12: Table S10). Of these, 10 genes showed the same expression pattern in the two RNA-seq datasets, while two genes (PG0015536 and PG0022689) showed opposite expression patterns; two genes (PG2026177 and PG1026177) in A22 and one gene (PG0005848) in A16 (S5) were highly expressed in Q but showed lower expression in the mannitol-treated RNA-seq dataset. Among these 15 genes, only two (PG0030548 and PG0031317), both belonging to S4, were highly expressed in A, while the other genes were all highly expressed in Q.
AtMYB44/AtMYBR1 (S22) has been reported to regulate ABA-mediated stomatal closure in response to abiotic stresses, and three other members of this subgroup (AtMYB70, AtMYB73 and AtMYB77) are likely to be associated with stress responses [80,81]. In the present work, one S22 member, PG0024983, was up-regulated under salt and mannitol stresses and was also highly expressed in Q at the flower-falling stage; the other S22 member, PG0003316, was highly expressed under mannitol treatment and was also up-regulated in Q, suggesting that these two StR2R3-MYB genes are likely to be associated with the drought response. AtMYB96 mediates the abscisic acid (ABA) signal network that confers abiotic stress tolerance [82]. In our dataset, PG0033043 and PG0019535, both homologous to AtMYB96, were up-regulated in DM potato under salt and mannitol stresses, and PG0033043 was up-regulated in Q as well. These results indicate that the identified StR2R3-MYB TFs respond to drought stress and are worth further investigation.
Our genome-wide analysis and expression profiles of StR2R3-MYB TFs in response to various stresses, especially drought stress, provide a foundation for their functional characterization in relation to stress tolerance.
The interaction network of EAR motif-containing StR2R3-MYB proteins
A transcription factor can act as an activator or a repressor in transcriptional regulation [83]. The EAR motif is the most dominant transcriptional repression motif identified in plants. These R2R3-MYB repressors include AtMYB3/4/6 in Arabidopsis [84], MdMYB16/17/111 in apple [85], FaMYB1 in strawberry [86] and PhMYB4 and PhMYB17 in petunia [87]. Our previous work also suggested that StMYB44-1 (PG0003316) exerts its strong repressive ability through a complete EAR motif (LxLxLx) in the C-terminal region. The sequence of this motif is less conserved in StMYB44-2, with a substitution of the final L by P (LxLxPx), which could account for its weaker repression [88]. In this work, 20 members of the StR2R3-MYB family were found to contain at least one LxLxL-type EAR motif and one StR2R3-MYB contained DLNxxP, and a protein interaction network of EAR motif-containing StR2R3-MYB TFs with other potato proteins showed that the 21 EAR motif-containing StR2R3-MYB proteins were involved in at least nine interaction possibilities. The investigation of EAR motif-containing proteins will provide new insight into the links within this complicated gene regulation system.

The expression profiles of StR2R3-MYB genes with an FPKM > 1 in white and

Protein sequence logo of 21 EAR (LxLxL) motifs and protein interaction network predicted for EAR motif-containing StR2R3-MYB TFs with interacting potato proteins (combined score > 400, medium confidence). The edge color is shown from yellow to purple in accordance with the combined score. PG0020071 in S2 is highlighted in dark grey; PG0006176, PG0013215 and PG0030548 in S4 are highlighted in green; PG0000027 and PG0021654 in S9 are highlighted in orange; PG0028949 and PG0013897 in S18 are highlighted in yellow;
