Genome-wide identification and expression profiling of the PIN auxin transporter gene family in L. chinense

Background: The genus Liriodendron is ancient and contains only two species, L. chinense and L. tulipifera. These two Liriodendron sister species, with a typical intercontinental discontinuous distribution in east Asia (L. chinense) and eastern North America (L. tulipifera), have great scientific value for paleobotany systematics. L. chinense is now recognized as an endangered species partially due to its low natural settling rate. In order to improve our understanding of how this species develops and grows and contribute to protecting this valuable relict species from extinction, it is necessary to explore the mechanisms underlying organ morphogenesis and embryonic development, in which auxin plays an important role. The auxin efflux carrier PIN-FORMED (PIN) proteins are required for the polar transport of auxin between cells through their asymmetric distribution on the plasma membrane, thus mediating the differential distribution of auxin in plants and, finally, affecting plant growth and developmental processes.


Background
The phytohormone auxin, indole-3-acetic acid (IAA), is a positional signal molecule that is synthesized within the plant via several different pathways and plays an essential role in regulating various plant growth and developmental processes: embryonic apical-basal polarity establishment, apical and axillary meristem formation, fruit ripening and root architecture construction, as well as plant tropisms such as phototropism, gravitropism and hydrotropism. Polar auxin transport is mainly performed by plasma membrane auxin transporters, which establish the asymmetric distribution of auxin in plants [1][2][3][4][5][6][7][8][9]. The influx and efflux carriers that move auxin between plant cells are mainly encoded by three gene families: the auxin permease 1 (AUX1)/LAX influx carriers, the P-glycoprotein (MDR/PGP/ABCB) efflux/conditional transporters and the PIN-FORMED (PIN) efflux carriers. PIN genes form the most prominent group of auxin carriers, directly controlling the polar distribution of auxin while interacting in a coordinated manner with the other gene families [10][11][12][13].
The asymmetric distribution of PIN proteins in cells is particularly compatible with the chemiosmotic hypothesis [14]. Based on this hypothesis, PIN1 was proposed as an auxin efflux carrier in the A. thaliana shoot [15]. Further research identified a total of eight Arabidopsis PIN genes, which could be divided into two groups [1]. Five PIN proteins (PIN1-4, PIN7) with long loops in the hydrophilic domain of the protein are located at the plasma membrane in an asymmetrical distribution and play a pivotal role in cell-to-cell auxin transport [16,17]. The remaining three PIN proteins (PIN5/6/8), with short loops, are localized at the endoplasmic reticulum membrane. These three PINs are proposed to maintain intracellular auxin homeostasis by working together with the PIN-LIKE auxin efflux carriers [18][19][20][21]. Structurally, all PINs are membrane proteins. Existing studies have shown that conserved transmembrane domains at the N and C terminus, as well as a central hydrophilic loop region of variable length, influence the polar localization patterns and activity of different PINs [22,23]. With the release of genome data for different species, the PIN gene family has so far been identified in a variety of multicellular plants, such as rice, maize, cotton and soybean [24][25][26][27]. Transcriptome data analysis and qRT-PCR showed that there might be functional interactions and redundancy between PIN genes, which also play a role in abiotic stress response and interaction with other plant hormones [27,28].
The ancient relict plant Liriodendron belongs to the Magnoliaceae family and occupies a critical evolutionary position. Most of the Liriodendron genus went extinct during the Pleistocene, with only two extant relict species remaining to this day: L. chinense and L. tulipifera. The completion of the L. chinense genome sequence, which revealed that magnoliids arose before the divergence of eudicots and monocots [29], has provided a basis for further genetic research on L. chinense. Despite the extreme importance of PINs, the origins, characteristics and functions of these proteins in L. chinense are still largely unknown. In this work, we identify the 11 members of the PIN gene family in L. chinense and analyze several of their properties, including physical and chemical properties, evolutionary relationships and gene location. In addition, through multiple sequence alignment of PIN protein sequences taken from 17 land plants, we identified conserved functional sites and found that LcPINs evolved when dicotyledonous plants diverged. By combining quantitative tests with transcriptome data analysis, we could show that LcPINs are expressed to different degrees during Liriodendron somatic embryogenesis and organ development.
Our study provides a systematic account of the evolutionary relationships and structural conservation of LcPIN genes, laying a foundation for further functional research.

Results
Genome-wide identification of PIN proteins in L. chinense
To identify L. chinense PIN genes, we used HMMER with the Pfam number of the PIN domain (PF03547) to search for PIN protein sequences in the L. chinense protein database. A local BLASTP search was also run with each of the eight AtPIN proteins as queries. Then, the conserved domain of each candidate gene was predicted using the SMART database. We identified 11 LcPIN genes, with two pairs of protein sequences (Lchi22082/Lchi33830 and Lchi23130/23125) having a similarity of 100%. Basic gene information, such as gene number, gene location, isoelectric point (pI) and molecular weight (MW) for the L. chinense PIN proteins, is listed in Table 1. The identified LcPIN-encoded proteins range from 233 (Lchi23130/23125) to 662 (Lchi17800) amino acids in length, with pIs varying from 6.37 (Lchi15751) to 9.49 (Lchi05137) and MWs varying from 25.04 kD (Lchi23130/23125) to 71.7 kD (Lchi17800). The 11 LcPIN genes are distributed over 6 chromosomes. Chromosomes 5 and 11 each contain three LcPIN genes, while chromosomes 2, 3, 6, 7 and 17 each contain a single LcPIN gene (Fig 2A).
In the phylogenetic tree (Fig 1), the PIN1 subgroup has extensively expanded in our selected species (including L. chinense, yet excluding A. thaliana), suggesting that it may play an important role in the growth and development of each of these different plant species. In Arabidopsis thaliana, the PIN3/4/7 family underwent extensive differentiation in comparison to other plant species, indicating that AtPIN3/4/7 may have undergone functional specification in this species. The PIN6 group could not be identified in monocotyledons, consistent with previous studies [25]. The PIN3 and PIN10 subfamilies are exclusive to dicots and monocots respectively and may have evolved independently in these lineages, based on previous studies [10,25].
Since previous studies found that Magnoliaceae emerged before the divergence of monocotyledons and dicotyledons, this suggests that in monocotyledons the PIN3 and PIN6 families were lost, while the PIN10 family evolved independently. The PIN gene subfamilies in L. chinense most closely resemble those of A. trichopoda and A. thaliana.

LcPIN protein gene structure and transmembrane topology
To better understand L. chinense PIN gene structure diversity and transmembrane topology, we constructed a phylogenetic tree using the PIN gene sequences. Gene structure patterns are highly conserved in LcPIN genes, with each gene containing 3-5 introns (Fig 2). The difference in gene size between the largest gene, LcPIN1a, and the smallest gene, LcPIN6a, is mainly due to intron length. Therefore, it is possible that the diversification of exons/introns played an important role in the evolution of this gene family, but the exact mechanism is unclear.
The predicted transmembrane topology showed that the number of hydrophobic loops in LcPIN proteins varies considerably. Excluding LcPIN2 and LcPIN6b, LcPIN proteins have a typical conserved structure with two highly conserved hydrophobic loops at the N and C terminus and a central hydrophilic loop between them (Fig S2).

LcPIN genes have highly conserved motifs and evolutionary relationships within different species
We then performed motif analysis using MEME/MAST, which showed highly conserved sequences (referred to as "Motifs", numbered from 1 upwards, starting at the N-terminus) present at both the protein N- and C-termini (Fig S2). Motifs 1-8, 12 and 16 were found or partially found in conserved sequence regions including the two transmembrane regions. Comparisons of motif distributions revealed that the intermediate hydrophilic region of PIN proteins was variable across different subgroups. Two different types of PIN proteins can furthermore be distinguished during PIN gene evolution. The first group of sequences contains a short hydrophilic loop in between two conserved transmembrane regions and can be considered a "Short PIN". This group of PIN proteins is represented in Arabidopsis by PIN5 and PIN8 [30]. Our results show that PIN6 and PIN9 belong to the "Short PINs" as well. A second group of PIN proteins contains a longer hydrophilic loop between the two transmembrane regions and can be considered a "Long PIN". This type is represented in Arabidopsis by PIN1-PIN4 and PIN7 [30]. PIN1, PIN2, PIN3/4/7, PIN10 and some PIN genes from bryophytes and gymnosperms belong to the "Long PINs" in our evolutionary tree. This suggests that the "Long PINs" could have differentiated independently in angiosperms, gymnosperms and bryophytes.
Among the LcPINs, the "Short PIN" LcPIN5 contains only motifs 1-8, 12 and 16, which are mainly distributed in the hydrophobic regions at both ends of the protein. LcPIN8 lacks motifs 1, 3, 7 and 12 at the N-terminus and has motif 11 in the middle hydrophilic loop. In the "Long PINs", the number of motifs ranges from 17 to 20, and the motifs are highly conserved in both the hydrophilic and hydrophobic regions. PIN6 is divided into two clades according to the number of motifs in the middle hydrophilic region. The hydrophilic-loop-specific motifs 9 and 10 appear in LcPIN6b, whose number of motifs is close to that of the "Long PINs", whereas LcPIN6a is close to the "Short PINs" because its motifs are distributed only in the hydrophobic regions. On the whole, the increase in the number of PIN6 motifs resembles a transition from "Short PIN" to "Long PIN". Taken together with PIN6 in Amborella trichopoda, we speculate that the motif 9/10 sequences could mark the appearance of a long PIN. Compared to the "Short PINs", the "Long PINs" possess a complex hydrophilic loop, including almost all sequence motifs save for those found in the conserved region. However, additions or deletions of individual motifs between and within subgroups could be found in the middle hydrophilic loop.
We constructed an evolutionary tree by multiple sequence alignment (Fig S2) and found a close evolutionary relationship between all LcPINs and PIN proteins from A. trichopoda and C. kanehirae, indicating that L. chinense may share a more recent evolutionary relationship with these two species. A separate analysis of each subgroup reveals an interesting phenomenon: the short PINs diverged before the long PINs, and the Gramineae (grasses) PIN1/8 subgroups evolved into a separate branch before LcPIN1/8 emerged (Fig S2). This result suggests that the long PINs may have acquired more specific functions as differentiation deepened, and that the PIN1/8 subgroups may differ in function between monocotyledons and dicotyledons. However, this specific conclusion still needs further analysis.

Functional site analysis within L. chinense PIN conserved motifs
Through previous experimental verification and data analysis, several functional elements and sites that control PIN protein polarity, trafficking and activity have been identified [3,31,32]. Our multiple sequence alignments show that these elements reside for a large part within the highly conserved LcPIN sequence motifs (Fig S2). For example, motif 1 and motif 5 contain two cysteine residues (C39 and C521) and occur in the transmembrane domain of all LcPIN proteins, excluding only LcPIN8, which contains only motif 5 (all functional sites are labeled with LcPIN1a as a reference) (Fig 3 and Additional file 3). These sites have been implicated in regulating the endocytosis and distribution on the PM of "Long PINs" [33], while for "Short PINs" the functionality of these sites has not yet been verified.
The hydrophilic loop (HL) domain contains identifiable motifs as well, namely motifs 4, 7, 9 and 15 (Additional file 2). Previous studies have shown that these motifs in the PIN protein HL contain sites involved in regulating PIN protein membrane abundance, as well as in maintaining their polar localization in the cell [34][35][36][37]. For example, the NPXXY element (motif 4) near the C-terminus plays an important role in AtPIN1 localization. Motif 15 contains a conserved phenylalanine residue (F165) in all "Long PIN" protein sequences of L. chinense (Fig 3). This residue has previously been found to interact with μA- and μD-adaptins in vitro and is possibly involved in PIN1 trafficking and polar localization in A. thaliana [37].
Coordinated PIN-mediated auxin transport requires activation and polarity control via phosphorylation by protein kinases [38]. So far, the phosphorylation of PIN is known to be controlled by the following three protein families: AGC kinases, MITOGEN-ACTIVATED PROTEIN (MAP) KINASES (MPKs), and Ca2+/calmodulin-dependent protein kinase-related kinases (CRKs) [31]. We found a small number of phosphorylation sites associated with these kinases, such as S1~S3 (motif 9) and T1~T3 (motif 17, which not all PINs have), that are highly conserved (Additional file 3). Our multiple sequence alignment showed that these significant sites are highly conserved in the motifs found in the "Long PIN" HL domain and in all PIN transmembrane domains. We could only detect sequence divergence in individual angiosperm sequences. These results suggest that the conserved sites in multiple motifs may have a fundamental function in different species.

Organ-specific expression profile of the LcPIN family genes
We used fluorescence quantitative PCR to analyze the expression of LcPIN gene family members in different organs (stamen, pistil, petal, bark, bud, root, stem, leaf) of L. chinense. Since the LcPIN5a-1/5a-2 and LcPIN8a-1/8a-2 sequences are identical, it was difficult to design specific primers to distinguish them. Therefore, the expression level reported for each of these genes is the combined expression of the two gene copies. We found that the LcPIN gene family members are expressed in almost all L. chinense tissues (Fig 4). However, the relative expression levels of LcPIN8a-1/8a-2 and LcPIN1b are comparatively low, and their expression is absent from some tissues. LcPIN3 and LcPIN6a are highly expressed in all tissues, and LcPIN1c is expressed in multiple tissues except the petal and leaf. The expression patterns of LcPIN1a and LcPIN5 show high similarity, with higher expression in the bud and leaf than in other tissues. LcPIN2 and LcPIN6b showed more specific expression patterns in a more limited number of tissues, with LcPIN2 mainly expressed in the root and stamen and LcPIN6b mainly in buds. Interestingly, we found that LcPIN3 and LcPIN6a show relatively high expression in stamens and petals. The upper part of the L. chinense petal is green, while the color of its middle and lower parts is paler than in L. tulipifera. To further observe the changes of PIN gene expression in stamens and petals, we established dynamic expression patterns across different developmental periods of petals and stamens. In the stamens, LcPIN6a, LcPIN3, LcPIN1a and LcPIN1c were expressed (Additional file 4). LcPIN3 had a higher level of expression than LcPIN1a/1c (Fig 5A). Its expression declined from LC-1 to LC-3 and increased at LC-4. Compared with LcPIN3, the expression of LcPIN1a and LcPIN1c was more stable (Fig 5A & Additional file 5).
LcPIN3 and LcPIN6a displayed similar expression profiles in the same part of the petals, and LcPIN1a had low expression except in the upper part of the petal at the LC-4 stage of petal development (Fig 5A & Additional file 5). In the lower middle part of the petal at the LC-4 stage, the LcPIN3 and LcPIN6a expression levels rose sharply (Fig 5C & 5D). Before this stage, their expression level was relatively stable in the middle of the petal. In summary, LcPIN3 may influence the development of stamens, and LcPIN3 and LcPIN6a may play a role in petal development.

PIN gene dynamic expression patterns during somatic embryogenesis in Liriodendron × sinoamericanum 154102
As an auxin efflux protein, PIN plays an important role during the growth and development of many plant species [39]. Embryogenic callus was induced from immature embryos of the artificial hybrid Liriodendron × sinoamericanum (obtained from the Chinese and North American Liriodendron species). Based on previously generated RNA-Seq data obtained from successive developmental stages during somatic embryogenesis, we constructed a heat map showing the expression of all 11 LcPINs at each stage (Globular embryo, Heart-shaped embryo, Torpedo embryo, Immature cotyledon embryo, Cotyledon embryo and Plantlet). Barely detectable or no expression was observed for LcPIN5a-1/5a-2, LcPIN8a-1/8a-2 and LcPIN6a/6b (Fig 6), which indicates that they might not be expressed during embryogenesis or are only activated under special conditions. The expression patterns of the other LcPIN (Long PIN) genes fell into two groups. LcPIN2 and LcPIN1b expression was barely detectable at the globular embryo stage, after which expression levels rose until the immature cotyledon embryo stage. The LcPIN1a, LcPIN1c and LcPIN3 expression levels remained relatively constant throughout this process except at the plantlet stage. These findings suggest that the long LcPIN genes could play a key role at different stages of somatic embryogenesis, while Short PINs are barely expressed and less likely to be involved.

Basic information and evolutionary relationship of LcPIN gene family
In this study, we demonstrated the presence of 11 PIN genes in L. chinense, located on 6 separate chromosomes (Fig 2A). We carried out phylogenetic analyses, comparing LcPIN sequences to PINs of other plant species. The LcPIN1/5/6/8 subgroups all possess more than one sequence, with LcPIN5 and LcPIN8 each having a duplicate gene with up to 100% sequence identity. A previous study found that the L. chinense genome may have undergone whole genome duplication around 116 million years ago, an event that could underlie the presence of duplicated PIN genes [29]. Liriodendron belongs to the Magnoliaceae family, which is thought to have diverged at the base of the angiosperm lineage [40]. Evolutionary analysis shows that the LcPINs and AtrPINs likely had a common ancestor (Fig 1). In order to better understand the diversity and similarity of the LcPIN genes, we analyzed their intron-exon structure (Fig 2B & Additional file 1), finding little variation overall. The main reason for gene length variation is the variation in intron length, which was also found for other species [25].

The structure domain and membership division of LcPIN protein
PIN genes have been studied in detail in A. thaliana: the encoded proteins are mainly composed of two conserved hydrophobic loops at the N- and C-terminus and a variable hydrophilic loop in the middle [31]. By analyzing the transmembrane domains and conserved motifs in LcPIN proteins, we found that most LcPINs contain 5-9 transmembrane helices, with the exception of LcPIN6b. LcPIN2 and LcPIN6b only have transmembrane helices at the N-terminus, not at the C-terminus. However, the same conserved motif exists at the C-terminus of these proteins as it does in other species (Additional file 2). Through multi-species protein sequence comparison, we found that all LcPIN proteins have the same motifs (except for LcPIN8a/b, which has only one motif at the N-terminus), and that these motifs jointly constitute a highly conserved domain. The sequence length of the hydrophilic loop determines whether a PIN gene is a "Long PIN" or a "Short PIN". However, the hydrophilic loops were found to contain four instances of highly conserved sequence (HC1-HC4) [41]. A more accurate classification for PIN genes was made based on the degree of HC1-HC4 sequence conservation, namely "Canonical PIN", "Noncanonical PIN" and, for PIN6, "Semicanonical PIN" [31,41]. This revised classification is directly related to the presence or absence of certain motifs in the proteins. In our analysis of the PIN gene family in 17 species, most PIN genes had conserved motifs at the N- and C-termini of their sequences. The differences are mainly due to sequence diversity within the conserved transmembrane structure. The hydrophobic regions at both ends of the protein sequence include motifs 1, 2, 3, 4, 5, 6, 7, 8, 12 and 16, while the middle of the protein sequence contains the hydrophilic loop.
In the "Canonical PINs", the middle hydrophilic loop contains many conserved motifs; the distinctions between different subgroups are therefore mainly concentrated in the hydrophilic loop. PIN6 clusters into two branches, which resemble the Canonical PINs and the Noncanonical PINs respectively, and can be regarded as a transition between these two kinds of PIN. This phenomenon is well represented in the LcPIN6 genes.

LcPINs have some conserved functional sites
As an auxin efflux protein, PIN plays an important role in transporting auxin between cells. PIN proteins contain a number of functional sites located in conserved regions of the protein. However, non-conserved regions may also contain functional sites, which are often species-specific and related to species differences and sequence evolution [4, 9, 11, 16, 18, 30-32, 34, 35]. Here, we limit ourselves to the discussion of conserved functional sites concerning auxin transport, protein polarity localization and protein activity (Additional file 2, Additional file 3 and Fig. 3). The conserved C39/C521/F165 sites in LcPINs may participate in the interaction with several μ-adaptins to control trafficking and polarity localization on the plasma membrane in Long PINs. In AtPIN1, mutation of F165 resulted in accumulation of the carrier in round, condensed structures [37].
The three predicted TPR*S (motif 9) sites are phosphorylation sites targeted by different kinases that activate PIN proteins, but in LcPIN1c there were only two TPR*S regions (lacking T3-S3), indicating some diversity among LcPIN proteins. The di-acidic motif is composed of the tyrosine motif NPNXY followed by an SSL sequence. Alignment of the region spanning the tyrosine motif of all LcPIN proteins revealed a conserved sequence NPN(S/T)-YSSL (where S is found in the LcPIN5/6a sequence and T is present in other LcPINs) in the HL of LcPINs, with the last three amino acids (SSL) missing in the short-looped LcPIN5/6/8. Mutagenesis of the di-acidic motif in AtPIN1 resulted in significant accumulation of the protein at the endoplasmic reticulum [21], supporting its role in trafficking from the endoplasmic reticulum. This suggests that the motif may have a similar function in L. chinense.

The expression pattern of LcPIN genes was preliminarily revealed
Alongside studies of PIN protein structure, corresponding functional studies have also been carried out. As an intercellular auxin efflux protein, PIN plays an essential role in the growth and development of plants [4,9,12,20,21,26,40]. There are multiple members in most subgroups of the LcPIN gene family, and each subgroup has its own pattern of expression. LcPIN1/6 subgroup expression was found across eight different tissues, but each member of these subgroups was expressed in a different subset of these tissues. For example, LcPIN1a was expressed in all eight tissues, but LcPIN1b showed almost no expression in the flower. Similarly, LcPIN3 is expressed in all these same tissues, but its expression level in stamens is several times higher than in other tissues. AtPIN2 has been observed in the region where root hairs are formed [42], whereas LcPIN2 was highly expressed in both the root and the stamen. This suggests that LcPIN2 could have evolved additional areas of expression. We found that LcPIN3 shows high and dynamic expression levels in petals and stamens. In Arabidopsis thaliana, PIN6 expression in the nectary influences nectar enrichment [47]. Since LcPIN6a is expressed in the lower middle part of petals, it could exert a similar function, given the presence of nectary tissue in this region [40].
Somatic embryogenesis is a special phenomenon that can occur in plants and shows many parallels to zygotic embryo development. Therefore, somatic embryogenesis can be used as a model to further understand the process of zygotic embryogenesis. Using transcriptome sequencing, we preliminarily revealed the expression pattern of the LcPIN gene family during somatic embryogenesis of Liriodendron × sinoamericanum. This process mainly involves the "Canonical PINs". During embryo development in Arabidopsis thaliana, auxin is exported mainly by AtPIN1/3/4/7 [43][44][45], which is similar to what we observed in Liriodendron × sinoamericanum. However, the expression of LcPIN members varied at different stages, suggesting that more than one LcPIN gene is involved in this process. PIN3, PIN4 and PIN7 are closely related evolutionarily. Given the evolutionary position of the Magnoliaceae, LcPIN3 could perform the functions of PIN3/4/7 during somatic embryogenesis. Interestingly, LcPIN2 is also expressed in this process, indicating that LcPIN2 may have some special functions.

Conclusion
In this study, we analyzed the LcPIN gene family in terms of gene structure, protein properties, transmembrane domains and evolutionary relationships. The gene expression pattern was preliminarily explored in different tissues and during somatic embryogenesis. The specific expression of LcPIN3 and LcPIN6a in flowers was especially notable and provides a basis for the verification of gene function and for Liriodendron breeding.

Methods
Identification of PIN genes in the L. chinense genome
L. chinense protein sequences were downloaded from (https://hardwoodgenomics.org/genomes/page=2). Arabidopsis PIN protein sequences were acquired from TAIR1.0. We used HMMER3.0 software and the Pfam number of PIN (PF03547) to search for LcPIN protein sequences. The corresponding sequences were also identified with the BLAST program using the AtPIN1-8 protein sequences as queries. Next, the LcPIN sequences were further authenticated based on their conserved domains using SMART (http://smart.emblheidelberg.de). Biochemical properties, such as the molecular weight (kDa) and isoelectric point (pI) of each protein, were determined using the Compute pI/Mw tool on the ExPASy (https://web.expasy.org/protparam/) website (Table 1). The locations of the LcPIN genes on each chromosome were determined based on the L. chinense genome sequence.
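The candidate-screening step described above can be sketched in Python. This is a minimal illustration, not the authors' pipeline. It assumes the HMMER hits are available as `hmmsearch --tblout` text and the BLASTP hits as 12-column tabular output (`-outfmt 6`). The E-value cutoff of 1e-5 and the choice to keep only proteins found by both searches are assumptions, since the text gives neither. The function names are hypothetical.

```python
def parse_hmmer_tblout(text, max_evalue=1e-5):
    """Collect target IDs from hmmsearch --tblout text.

    Column 1 is the target name and column 5 the full-sequence E-value;
    lines starting with '#' are comments.
    """
    hits = set()
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        if float(fields[4]) <= max_evalue:
            hits.add(fields[0])
    return hits


def parse_blast_tabular(text, max_evalue=1e-5):
    """Collect subject IDs from BLAST tabular (outfmt 6) text.

    Column 2 is the subject ID and column 11 the E-value.
    """
    hits = set()
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if float(fields[10]) <= max_evalue:
            hits.add(fields[1])
    return hits


def candidate_pins(hmmer_text, blast_text, max_evalue=1e-5):
    """Return sorted IDs supported by both searches (an assumed rule)."""
    hmmer_hits = parse_hmmer_tblout(hmmer_text, max_evalue)
    blast_hits = parse_blast_tabular(blast_text, max_evalue)
    return sorted(hmmer_hits & blast_hits)
```

The IDs returned by `candidate_pins` would then go to a domain check such as SMART, which this sketch does not cover.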

Multiple sequence alignment and phylogenetic analysis of L. chinense PIN gene family
The PIN protein sequences of six species (Amborella trichopoda, Arabidopsis thaliana, Zea mays, Oryza sativa, Vitis vinifera and Sorghum bicolor) were downloaded from Phytozome (Additional file 4). The phylogenetic relationships between LcPINs and the PIN proteins from these six species were determined using the neighbor-joining algorithm set to default parameters with 1000 bootstrap replicates, using MEGA 7.0 software (Fig 1). The LcPINs were named according to their respective clades.
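Neighbor-joining works from a pairwise distance matrix, and each bootstrap replicate rebuilds that matrix from alignment columns resampled with replacement. The Python sketch below shows only these two ingredients. It uses the simple p-distance (the proportion of differing residues), which is an assumption chosen for brevity and not necessarily the substitution model MEGA applies by default. The function names are hypothetical.

```python
import random


def p_distance(seq_a, seq_b):
    """Proportion of differing residues over columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue  # skip gapped columns (pairwise deletion)
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differing / compared


def distance_matrix(alignment):
    """Pairwise p-distances for a dict of {name: aligned sequence}."""
    names = list(alignment)
    return {(x, y): p_distance(alignment[x], alignment[y])
            for x in names for y in names}


def bootstrap_columns(alignment, rng):
    """Resample alignment columns with replacement (one bootstrap replicate)."""
    length = len(next(iter(alignment.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in columns)
            for name, seq in alignment.items()}
```

A bootstrap analysis would call `bootstrap_columns` once per replicate, rebuild the distance matrix and the tree each time, and report how often each clade recurs.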

Gene structure, chromosomal localization and transmembrane topology analysis of LcPINs
The chromosomal locations of the LcPIN genes were determined based on the L. chinense genome (Fig 2A). Their exon/intron structure was determined with TBtools using mRNA and the L. chinense genome. The genomic and coding sequences of PIN genes, together with their exon/intron structures, were extracted from the general feature format (GFF3) file of L. chinense sequences (Fig 2B). Protein transmembrane topology was predicted using the TMHMM server 2.0 (http://www.cbs.dtu.dk/services/TMHMM) (Additional file 1).
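Exon/intron structure can be read directly from GFF3 annotation, as the following Python sketch shows. It is an illustration only and is not how TBtools works internally. It assumes standard nine-column GFF3 lines in which `exon` features carry a `Parent` attribute naming their transcript. The function name is hypothetical.

```python
from collections import defaultdict


def exon_intron_structure(gff3_text):
    """Summarize exon count, intron count and intron lengths per transcript."""
    exons = defaultdict(list)
    for line in gff3_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 9 or cols[2] != "exon":
            continue
        start, end = int(cols[3]), int(cols[4])  # 1-based, inclusive
        attrs = dict(item.split("=", 1)
                     for item in cols[8].split(";") if "=" in item)
        for parent in attrs.get("Parent", "").split(","):
            if parent:
                exons[parent].append((start, end))

    structure = {}
    for transcript, spans in exons.items():
        spans.sort()
        # An intron is the gap between two consecutive exons.
        introns = [next_start - prev_end - 1
                   for (_, prev_end), (next_start, _) in zip(spans, spans[1:])]
        structure[transcript] = {
            "exons": len(spans),
            "introns": len(introns),
            "intron_lengths": introns,
        }
    return structure
```

Comparing the summed `intron_lengths` between genes shows how much of a gene-size difference is due to introns rather than exons.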

Multiple sequence alignment and identification of conserved motifs in the PIN gene family of 17 plant species
We retrieved the A. thaliana PIN protein sequences from the Arabidopsis Information Resource database (www.arabidopsis.org). A BLASTP search was performed using the AtPIN sequences as queries to retrieve PIN sequences from the Phytozome database. Sequences were retrieved from the plant species M. polymorpha, P. patens, S. moellendorffii, A. trichopoda, C. kanehirae, Z. mays, S. bicolor, O. sativa, A. thaliana, G. max, G. raimondii, V. vinifera, M. esculenta, C. sinensis, P. trichocarpa and S. lycopersicum (Additional file 4). Sequence alignment was performed using MEGA 7.0. Sequence relationships were inferred using the NJ method. Conserved motifs were identified using the online Multiple Expectation Maximization for Motif Elicitation (MEME) program. The number of repetitions was set to any number, with an optimal width of 6-100 residues and a maximum of 20 motifs.

L. chinense PIN gene family expression analysis and dynamic expression patterns during somatic embryogenesis in Liriodendron × sinoamericanum genotype 154102
Samples of eight organs (stamen, pistil, petal, bark, bud, root, stem, leaf) were taken from the Baima test site at Nanjing Forestry University. Total RNA extraction was performed using a chloroform-free plant total RNA extraction kit from BioTeke Corporation. qRT-PCR analysis was conducted following standard procedure and using three biological replicates [46]. All primers for qRT-PCR were designed with Primer5.0 and are listed in Table 3.
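Relative expression from qRT-PCR is commonly calculated with the 2^-ΔΔCt method, and the Python sketch below illustrates that calculation. Its use here is an assumption, because the text cites only a standard procedure [46] and does not name a reference gene. The Ct values in the usage note are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def relative_expression(ct_target, ct_reference,
                        ct_target_calibrator, ct_reference_calibrator):
    """Fold change by the 2^-ddCt method.

    dCt normalizes the target gene to a reference gene within one sample;
    ddCt then compares that sample to a calibrator sample.
    """
    delta_sample = ct_target - ct_reference
    delta_calibrator = ct_target_calibrator - ct_reference_calibrator
    return 2 ** -(delta_sample - delta_calibrator)


def mean_and_se(values):
    """Mean and standard error of the mean across biological replicates."""
    return mean(values), stdev(values) / sqrt(len(values))
```

For example, a target gene with Ct 24 in a sample and Ct 26 in the calibrator, against a reference gene at Ct 20 in both, gives `relative_expression(24.0, 20.0, 26.0, 20.0) == 4.0`, a four-fold higher level than the calibrator.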
For expression analysis during somatic embryogenesis of Liriodendron × sinoamericanum, materials at different developmental stages (Globular embryo, Heart-shaped embryo, Torpedo embryo, Immature cotyledon embryo, Cotyledon embryo and Plantlet) were collected and transcriptome sequencing was performed. The RNA-seq data have not yet been released. Expression profiles were constructed from FPKM data.
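FPKM (fragments per kilobase of transcript per million mapped fragments) normalizes a raw fragment count by gene length and sequencing depth. The Python sketch below shows that formula together with a log2 transform often applied before drawing a heat map. The pseudocount of 1 and the transform itself are assumptions, since the text does not describe how the heat-map values were scaled.

```python
from math import log2


def fpkm(fragments, gene_length_bp, total_mapped_fragments):
    """FPKM = fragments * 1e9 / (gene length in bp * total mapped fragments)."""
    return fragments * 1e9 / (gene_length_bp * total_mapped_fragments)


def heatmap_rows(fpkm_table, pseudocount=1.0):
    """Log2-transform a {gene: [FPKM per stage]} table for plotting.

    The pseudocount keeps genes with zero expression defined after the log.
    """
    return {gene: [log2(value + pseudocount) for value in values]
            for gene, values in fpkm_table.items()}
```

With 1,000 fragments on a 2,000 bp gene in a library of 10 million mapped fragments, `fpkm(1000, 2000, 10_000_000)` evaluates to 50.0.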

3. Availability of data and materials
The data sets supporting the results of this article are included within the article and its additional files.

4. Competing interests
The authors declare that they have no competing interests.

5. Funding

The relative expression of PIN genes in different L. chinense tissues. qRT-PCR experiments were performed on three biological replicates. Error bars represent the mean ± SE from three independent experiments. The relative expression level was determined using the expression of LcPIN1a in stamens as a control. Note that the y-axis has been rescaled across individual graphs for visual representation.