Decoding the molecular mechanism of parthenocarpy in Musa spp. through protein-protein interaction network

Banana, an important staple and a delicious fruit for consumers worldwide, is highly sterile owing to natural parthenocarpy. Identification of the genetic factors responsible for parthenocarpy would help conventional breeders to improve the seeded accessions. We constructed a protein-protein interaction (PPI) network by mining differentially expressed genes and genes used in transgenic studies related to parthenocarpy. Based on topological and pathway enrichment analysis of the proteins in the PPI network, 12 candidate genes were shortlisted. By exploring the PPIs of the candidate genes in the putative network, we postulated a putative pathway that brings insights into the significance of the cytokinin-mediated CLV-WUSCHEL signaling pathway, in addition to the gibberellin-mediated auxin signaling pathway, in parthenocarpy. Further validation of the candidate genes in seeded and seedless accessions of Musa spp. using qRT-PCR put forward AGL8, MADS16, IAA (GH3.8), RGA1, EXPA1, GID1C, HK2 and BAM1 as possible target genes in natural parthenocarpy. In contrast, the expression profiles of ACLB-2 and ZEP are anticipated to highlight the difference between artificially induced and natural parthenocarpy. Our analysis is the first attempt to identify candidate genes and to hypothesize a putative mechanism that bridges the gaps in understanding natural parthenocarpy through a protein-protein interaction network.
replications NRCB of flowering the whole inflorescence was bagged before opening of the first The ovary samples of each were collected from the female floret of C4, CVR and PL on the day of flower opening at 8.00 am and designated as un-pollinated (UnP) sample. For pollen grains (Male), Matti (AA) a local landrace collected from Thriunelveli, Tamil maintained at ICAR-NRCB with accession number The pollen grains were collected during anthesis at 7.00 am from the accession Matti (Male), dusted over the stigma of the female florets (C4, CVR and PL) and the whole inflorescence was The ovary samples were collected from the female florets at 24 hrs and 48 hrs after pollination and designated as P24 and P48 samples. The collected ovary samples were and


Introduction
The term parthenocarpy refers to the development of the ovary into a seedless fruit without the union of female and male gametes. It has been reviewed in a large number of horticultural crops such as grape, tomato, mandarins, banana, opuntia, pepino, eggplant, cucumber and capsicum 1 , and it has been stated that parthenocarpy can be achieved through over expression of endogenous hormones in the ovary 2 and can be genetically controlled 3,4 . Studies of the inheritance pattern in various crops have reported that parthenocarpy is governed by a single dominant gene in eggplant 5,6 , a single recessive gene in Capsicum annuum 7 , more than two recessive genes in tomato 8,9 , a single dominant gene in pepino 10 , a single incompatible dominant gene in cucumber 11,12 , two major additive, dominant-epistatic genes in cucumber 13 , and three independent but complementary dominant genes in banana 14,9 . Phytohormones such as auxin and gibberellin (GA) play predominant roles in parthenocarpic fruit development in species such as tomato 15 , Arabidopsis 16,17 and apple 18 . Parthenocarpy is also commercially exploited in horticultural crops 19 through exogenous use of irradiated pollen or of natural or synthetic hormones such as auxins, gibberellins, IAA etc., during ovary development 20,21,22 . In spite of so many reports, the molecular mechanism involved in natural parthenocarpic fruit development is still unclear, and candidate genes for the trait have not been identified to date. To understand this mechanism, comparative transcriptome analyses between parthenocarpic and non-parthenocarpic (seeded) accessions have been carried out in many horticultural crops such as eggplant 23 , citrus 24 , litchi 25 , oil palm 26 etc.
Many researchers have tried to identify the parthenocarpic mechanism by studying the expression profile of parthenocarpic fruit induced either by exogenous application of hormones or through mutation or genetic transformation 27,28 .
Among the horticultural crops, banana is an economically important crop, but its seediness hinders its improvement through conventional breeding. Unlike in other crops, the ploidy status and the intra- and inter-specific hybrid nature of commercial cultivars/varieties have led to chromosomal imbalance during gamete formation, which plays a determinant role in seedless fruit formation. Only limited studies are available on the genetics of parthenocarpy in banana and plantains. It has been stated that parthenocarpy in banana is governed by three independent complementary genes, in which the absence of even one dominant gene results in seediness 29 . Similarly, based on the segregation pattern, it has also been substantiated that parthenocarpy is governed by three genes 30 . Further, it has been postulated that, of the ancestral genomes (A and B) of present-day commercial cultivars, the "A" genome coming from Musa acuminata (AA) contributes to the female sterility resulting in vegetative parthenocarpy 31,32,30 . However, the loci or genetic factors responsible for parthenocarpy have not yet been identified because of inherent features such as male and/or female sterility, the heterozygous nature of the parents, unreduced gamete formation etc.
The lack of data on seeded and seedless accessions of Musa spp. has hampered understanding of the genetic mechanisms and factors involved in parthenocarpy. In such a scenario, the "omics" information on the parthenocarpic trait of various species, which is abundantly deposited in public databases, could be exploited through computational approaches. Several in-silico methods such as sequence similarity, evolutionary relationship, detection of SNPs, high throughput gene expression analysis and protein-protein interactions (PPI) could be applied to identify the genetic factors responsible for parthenocarpy in Musa spp. Of these, computational prediction of PPIs from gene expression profiles has been widely implemented for the prediction of candidate genes that regulate complex traits 33 . Hence, in this study we focused on a "proteogenomics" approach, mining the differentially expressed genes (DEGs) of seeded and artificially induced parthenocarpic fruits of various crops (tomato, eggplant, capsicum, grapes, citrus, apple etc.) for the identification of candidate genes responsible for parthenocarpy in Musa spp. Genetic factors involved in parthenocarpic fruit formation in various species, together with their respective homologous genes in Musa spp., were taken for the construction of a PPI network for the trait parthenocarpy, since there is evidence that PPIs are conserved across orthologous species 34 . The shortlisted genes were validated in seeded and seedless accessions of banana to identify the candidate genes for natural parthenocarpy in banana.

Results
Construction of PA-PPI network
The network has a characteristic average path length of 5 and comprised 40% shortest paths. This outlined the overall navigability of the network: biological information could be transferred from a selected protein to others by crossing only a few nodes 36 . The clustering coefficient of this scale-free network is 0.283, which indicates that the internal structure of the network is highly interactive and forms clusters.
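The two global statistics quoted above can be computed for any undirected interaction graph. The following is a minimal Python sketch, assuming the network is held as an adjacency dictionary; the six proteins and their edges form a toy graph chosen only for illustration and are not the PA-PPI network itself.

```python
from collections import deque
from itertools import combinations


def bfs_distances(adj, source):
    """Hop distance from source to every reachable node (breadth-first search)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adj[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def average_path_length(adj):
    """Mean shortest-path length over all ordered pairs of connected nodes."""
    total, pairs = 0, 0
    for source in adj:
        for target, d in bfs_distances(adj, source).items():
            if target != source:
                total += d
                pairs += 1
    return total / pairs if pairs else 0.0


def local_clustering(adj, node):
    """Fraction of a node's neighbour pairs that are themselves connected."""
    neighbours = list(adj[node])
    k = len(neighbours)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(neighbours, 2) if b in adj[a])
    return 2 * links / (k * (k - 1))


def average_clustering(adj):
    """Network clustering coefficient as the mean of the local coefficients."""
    return sum(local_clustering(adj, n) for n in adj) / len(adj)


# Toy edges (hypothetical, for illustration only).
toy_edges = [
    ("HK2", "BAM1"), ("HK2", "EXPA1"), ("BAM1", "CLV1"),
    ("BAM1", "CLV3"), ("CLV1", "CLV3"), ("EXPA1", "GH3.8"),
]
adj = {}
for a, b in toy_edges:
    adj.setdefault(a, set()).add(b)
    adj.setdefault(b, set()).add(a)

apl = average_path_length(adj)
cc = average_clustering(adj)
print(f"average path length = {apl:.3f}, clustering coefficient = {cc:.3f}")
```

A short average path length combined with a non-trivial clustering coefficient is the pattern the text interprets as a navigable, clustered network.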
Topological Analysis Of PA-PPI
The biological significance of the proteins in this scale-free network was determined by analyzing centrality measures (topological properties) such as degree, betweenness centrality and closeness centrality. Based on the topological properties of the PA-PPI network, the top ten proteins with the highest degree, betweenness centrality and closeness centrality scores are listed in Table 1. The average degree of proteins in the PA-PPI network was 3.128, and proteins with high degree (> 10 interacting partners) such as LFY, ZEP, HK2, EXPA1 and SL1 are referred to as degree-based hubs. Proteins with higher betweenness centrality scores such as NIA1, ZEP, FL, NCED1 and MOCOS could act as useful indicators for detecting bottleneck proteins in the PA-PPI network. LFY, FIE2, GAF1, NFYB9 and ZEP, with high closeness centrality, have a shorter path length to all other proteins in the network and would thereby have a greater influence in the network. In the overall topological analysis, genes such as LFY, ZEP, EXPA1, SL1, GH3.8, BAM1 and HK2 were repeatedly present in all three centrality measures and were therefore shortlisted as potential candidate genes.

Cluster Analysis
Highly interconnected regions or sub-networks in PA-PPI were identified using the MCODE plug-in, since clusters in a network are often protein complexes that are involved in the same pathway and belong to the same protein family. In total, eight clusters were obtained, of which only 3 clusters with a score of > 3 were subjected to functional enrichment analysis using ShinyGO (Fig. 1 and [Supplementary Fig. S2A]). Cluster 1 comprises genes belonging to biological processes such as "Response to hormone-mediated pathway, Response to hormonal/chemical stimulus, Regulation of multicellular organismal development and Regulation of gene expression" [Supplementary Fig. S2B]. Cluster 2 encompasses genes belonging to embryo and seed sac development, mitotic cell cycle, gametophyte development, reproductive processes etc. Interestingly, genes belonging to cluster 3 are involved in carbohydrate metabolic process, oxido-reduction coenzyme metabolic process, energy reserve metabolic process etc. [Supplementary Fig. S2C]. The rankings of genes based on each centrality measure and on the MCODE cluster analysis are given in Table 1, and the genes belonging to each cluster identified by the MCODE plug-in are given in [Supplementary Table S2].
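The shortlisting rule described above (proteins that recur among the top-ranked nodes for degree, betweenness and closeness) can be sketched in plain Python. The study itself relied on Cytoscape plug-ins; the code below is only an illustrative re-implementation in which Brandes' algorithm stands in for the betweenness calculation. The toy edges and the cutoff of three proteins per measure are assumptions made for the example, not the paper's data or settings.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distance from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adj[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def degree_scores(adj):
    """Degree: number of direct interaction partners."""
    return {node: len(adj[node]) for node in adj}


def closeness_scores(adj):
    """Closeness: reachable nodes divided by the sum of distances to them."""
    scores = {}
    for node in adj:
        dist = bfs_distances(adj, node)
        total = sum(dist.values())
        scores[node] = (len(dist) - 1) / total if total else 0.0
    return scores


def betweenness_scores(adj):
    """Unnormalised betweenness for an undirected graph (Brandes' algorithm)."""
    bc = dict.fromkeys(adj, 0.0)
    for s in adj:
        stack, preds = [], {n: [] for n in adj}
        sigma = dict.fromkeys(adj, 0)
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:  # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(adj, 0.0)
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Every unordered pair was visited from both endpoints.
    return {node: score / 2 for node, score in bc.items()}


def top_k(scores, k):
    """Names of the k best-scoring nodes; ties are broken alphabetically."""
    return set(sorted(scores, key=lambda n: (-scores[n], n))[:k])


# Toy edges (hypothetical, for illustration only).
toy_edges = [
    ("LFY", "AGL8"), ("LFY", "MADS16"), ("LFY", "MADS29"), ("LFY", "ZEP"),
    ("ZEP", "NCED1"), ("ZEP", "NIA1"), ("ZEP", "HK2"),
    ("HK2", "BAM1"), ("HK2", "EXPA1"), ("BAM1", "CLV1"),
]
adj = {}
for a, b in toy_edges:
    adj.setdefault(a, set()).add(b)
    adj.setdefault(b, set()).add(a)

betweenness = betweenness_scores(adj)
rankings = [degree_scores(adj), betweenness, closeness_scores(adj)]
shortlist = sorted(set.intersection(*(top_k(r, 3) for r in rankings)))
print(shortlist)
```

Taking the intersection of the per-measure rankings is a conservative choice: a protein is kept only if it is a hub, a bottleneck and centrally placed at the same time.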

Validation Of Candidate Genes For Parthenocarpy
Comprehensive analysis of the network highlighted various hubs and essential genes that could be directly or indirectly associated with parthenocarpy. Based on the following observations, nodes from the PPI network were taken as candidate genes.

Discussion
Identification of the candidate gene/s responsible for parthenocarpy in Musa spp. is a prerequisite for the commercial exploitation of accessions with the B genome that exhibit resistance against biotic or abiotic stresses 41 . It was reported that parthenocarpy in banana is governed by three independent complementary genes, where seediness could result from the loss of even one dominant gene 29,30 . To date, however, only limited research has been attempted in banana to understand the genetic basis of natural parthenocarpy. Further, the candidate genes documented in various studies and crops relate to artificially induced parthenocarpy and not to natural parthenocarpy as in banana. Thus, data mining based on protein-protein interaction networks was employed for the identification of candidate genes in banana 42,43,44,45,46 . Since most biological activities in the cell result from molecular interactions among bio-molecules, the study of PPIs is essential to unravel the molecular basis of any complex trait 47 . Differentially expressed genes and genes related to parthenocarpy in various crops were mined, and their respective orthologous genes in Musa spp. were retrieved for the construction of the parthenocarpy-associated PPI network, following a study showing that orthologous sequences are expected to have the same functions 48 [Supplementary Fig. S1].
Functional enrichment of the proteins in the overall network revealed that the majority of the genes are primarily involved in floral whorl development (26.32%), meristem maintenance (15.79%), regulation of reproductive process (10.53%), transcriptional regulation, gene regulation (10.52%) and oligosaccharide biosynthetic process (10.53%). This in turn supported the relevance of the orthologous genes shortlisted for the construction of the PA-PPI network in the current study, since many studies have reported the significance of genes involved in floral development, ovule integument, reproductive process etc., in seedless fruit formation 49,50 . Considering the functional and GO analyses of the PA-PPI network together, the majority of the genes that frame the PA-PPI network are involved in "regulation of cellular macromolecule biosynthetic process" and "transcriptional regulatory activity" [Supplementary Fig. S3]. In particular, transcription factors that are widely involved in the flower-fruit transition stage, such as AGL8, MADS16, LFY and MADS29, were highlighted by the centrality measures. Similarly, KEGG pathway analysis indicated that the hormone signal transduction, carotenoid biosynthesis, fatty acid metabolism, carbohydrate metabolism and lysine degradation pathways have a strong association in the network of natural parthenocarpy (Fig. 4). In our previous review, the role of hormone-mediated transcriptional regulation in parthenocarpy was emphasized 61 , while the current study highlighted the involvement of proteins in the carbohydrate, fatty acid and lysine degradation pathways; however, their interrelation in inducing parthenocarpy remains elusive [Supplementary Table S3]. It has been speculated that some of the distinctive features of parthenocarpic fruits, such as nutritional value, pulp content, fruit size and peel thinness, might be due to the cellular metabolism that occurs during parthenocarpic fruit formation.
While considering the role of the lysine degradation pathway, it was found that glycine and carnitine are its end products, which prompted us to look for information on the difference in free amino acid (FAA) content between parthenocarpic and seeded varieties of Musa spp. Variation in the level of FAA between parthenocarpic and seeded types has been analyzed in tomato and capsicum, but the possible role of FAA content in parthenocarpic fruit formation is yet to be proved. On the other hand, the genes MEA, FIE, CLF and SWN, which encode a group of polycomb (PcG) proteins, were also highlighted under the lysine degradation pathway, and their interrelation in inducing parthenocarpy remains elusive.
PcG proteins act as histone-modifying enzymes and are reported to regulate embryo and endosperm proliferation and anteroposterior organization during seed development 51 . Functional mutations of FIE 52 , FIS 53 and MEA 54 were also observed in parthenocarpic fruit development 55 . In addition, interaction of the MADS box TFs AP2, AGL15 and AGL2/EMF with the PcG proteins MSI1, FIE, SWN and VEL1 through VRN5 (Vernalization 5) and TPL (transducin family protein/WD-40 repeat family protein) was found in the putative network. Down regulation of VRN5 and TPL, as well as of MADS TFs and PcG proteins, has already been reported in parthenocarpic fruit development 56 , suggesting a promising role for them. Despite these reports, the direct role of polycomb (PcG) proteins in the lysine degradation pathway during seed development remains unclear, and understanding the integration of these genes and pathways in parthenocarpy is the key challenge. Another unpublished work at ICAR-NRCB reported the failure of certain female-fertile accessions to set seeds under a particular set of environmental conditions, but the reason behind this behavior remains undiscovered. Thus, it is speculated that PcG proteins might be involved in an epigenetic mechanism that regulates seed formation under specific environmental conditions.
Candidate genes mined through network topological analysis were validated by expression analysis, and their interacting partners were explored using the Bisogenet app in Cytoscape in order to fathom their mechanism in parthenocarpy. Some studies reported that the parthenocarpy trait is governed by the "A" genome 29,31 , while another study reported the dynamic nature of AA genome accessions, which exhibit both seeded (C4) and seedless (PL) traits 57,58 . Among the seedless AA diploid accessions, some are amenable (cv. Matti, cv. Rose) and a few are recalcitrant (PL) to seed set upon artificial pollination 59 . Thus, in the current study, the expression patterns of the candidate genes were compared among three 'AA' accessions, namely i) Calcutta 4 (C4), with profuse seed set; ii) cv. Rose (CVR), rarely setting seeds upon pollination; and iii) Pisang Lilin (PL), seldom setting seeds.
Ovary samples from the test accessions were collected at three time points, un-pollinated (UnP), 24 hrs after pollination (P24) and 48 hrs after pollination (P48), and subjected to real-time expression analysis. The expression study revealed that, irrespective of the cultivar, the majority of the variation between the seeded and seedless accessions was observed at 24 hrs after pollination (P24). This confirmed that ovary sampling around 24 hrs after pollination is optimal for seed set studies in Musa spp., which correlates well with our earlier finding that male gametes reach the ovule within 24 hrs after artificial pollination 60 . Genes belonging to the MADS family, namely MADS16, MADS29, AGL8 and LFY, were repeatedly found in all three centrality measures as well as in the cluster analysis and were thus ranked as key genes. The expression levels of the two MADS box transcription factors AGL8 and MADS16 were up regulated in C4 and CVR and down regulated in seedless PL at P24. This confirms the negative regulation of MADS box transcription factors upon pollination in naturally parthenocarpic accessions like PL. This is in line with our previous review, which proposed that, upon fertilization, MADS box transcription factors in parthenocarpic accessions could act as key regulators of the transition from the state of "ovary arrest" to fertilization-triggered fruit set, thereby mediating seedless fruit formation 61 . Next to the MADS box transcription factors, proteins [...] reported in several studies related to parthenocarpy 18 . It has also been shown that externally applied GA competes with DELLA for its interaction with GID1C, leading to subsequent degradation of DELLA and eventually resulting in seedless fruit formation 70 .
Further, it is interesting to note that pollination reduced the expression level of GID1C; the reduction was most drastic in CVR, followed by C4 and PL. A study of various GID1C and DELLA mutants in Arabidopsis 69 proved that the interaction of GID1C with DELLA is essential for seed development. Taken together, the expression profiles of the four genes AGL8, MADS16, DELLA and GID1C and their interacting partners in the network strongly support the proposed hypothesis that MADS box TFs, together with DELLA, could act as a focal point bridging the auxin and GA hormones during seed set. Down regulation of IAA (GH3.8) was observed in the ovaries of both CVR and PL at P24, consistent with its expression in hormone-induced parthenocarpy. Down-regulated expression of AUX/IAA and of homologs of GH3.2 was also reported in parthenocarpic eggplant compared with seeded eggplant 33 . IAA (GH3.2) is an auxin-amino acid conjugating enzyme that converts auxin into an inactive form, and its homolog, IAA-amido synthetase (GH3.8), was reported to down regulate auxin signaling by preventing the accumulation of free IAA 71,72 . A recent study on DELLA and its interaction with ARF7/IAA9 showed that GH3.2 is involved in the regulation of auxin homeostasis through GA during fruit development 18 . This supports our earlier review, which proposed that increased GA together with decreased expression of auxin-responsive genes like GH3.2 results in seedless fruit formation 61,73,74 . GA in coordination with IAA was reported to increase the expression of EXPA1 and thereby promote cell division and cell expansion 74,75 .
Expansin (EXPA1) is a growth-promoting gene involved in cell division and cell growth and is subsequently required for fruit development. GA-induced parthenocarpy showed increased expression of expansin in the ovaries of pear fruit, suggesting that genes involved in cell expansion and cell division are activated upon hormonal signaling for fruit development 27 . Similarly, EXPA1 is up regulated in PL, whereas it is down regulated in both C4 and CVR at P24. This implies that, upon pollination, EXPA1 could positively mediate natural parthenocarpy in Musa spp. Histidine Kinase CKI1 (HK2) is a cytokinin receptor reported to be involved in the cytokinin signaling pathway, flower development, seed germination, meristem development and several stress-related mechanisms 76 . An earlier report based on comparative genomic approaches identified an orthologous gene of HK2, involved in gametophyte development, in the genome of Musa 77 . The association of HK2 with parthenocarpy has not been clearly established; however, the constructed PA-PPI network proposed that HK2 interacts with proteins associated with seed development (BAM1) and cell expansion (SCL, EXPA1). Another candidate gene, BARELY ANY MERISTEM1 (BAM1), received much attention while probing the interactions of HK2 in the network. The importance of BAM1 and BAM2 for male and female gametophyte functionality in Arabidopsis further emphasized its importance in seedless fruit formation 58 . The expression study of histidine kinase (HK2) and BAM1 showed that both were down regulated in the seedless cultivars (PL and CVR) and that the level of expression was much lower in PL than in CVR. A similar expression profile of these two genes in other seedless fruits such as tomato, eggplant and capsicum provides further evidence of their role in parthenocarpy 78 .
A genome-wide association study for seedlessness in Musa sp. highlighted that an orthologous gene of Histidine Kinase (CK1 or HK) is strongly related to female sterility and speculated that SNPs found in this gene might be responsible for parthenocarpy 77 . Our study suggests that, apart from SNPs, the expression of the gene also plays a major role in parthenocarpy. Probing the interacting partners of HK2 in the network showed that proteins involved in pollen tube elongation (PI4KB1) 79 , brassinosteroid (BR) integrated seed development (NAC081) 80 and seed storage (AT2S3), and a microtubule-associated protein involved in cell expansion (DRP1A) 81 , interact with HK2, which in turn provides clear evidence of its role in seed development. Extending the search for experimental interactions to another candidate gene, BAM1, showed NAP5, CLV3, CLV1 and CPI1 as interacting partners. Among these, CLV3 and CLV1 have gained considerable attention, since significant down regulation of BAM1/2 and WUSCHEL (WUS) was observed in the sexual sterility (Slses) mutant seedless tomato, which exhibits both male and female sterility 56 , and CLV acts in the CLV-WUSCHEL signaling pathway, which plays a multifunctional role in plant development 82 . WUSCHEL is a transcription factor that mediates the expression of CLV3 and AG during floral development, particularly in the ovule, and induces integument formation 83,84,85 . Considering everything, it was suggested that the interaction of BAM1 with CLV1 mediates the CLV-WUSCHEL signaling pathway, in which CLV activates WUSCHEL, resulting in the transcriptional activation of the MADS box transcription factor AG and leading to seed formation. Further, the expression of MADS29 increased drastically upon pollination irrespective of the cultivar, although its expression was higher in the seeded accession C4; this might be due to the importance of MADS29 in fruit set upon pollination.
Increased expression of ACSB2 was evidenced in artificially induced seedless tomato 74 ; however, ACSB2 was down regulated in PL and up regulated in C4 and CVR upon pollination at P24. LFY, another candidate gene reported to be involved in floral meristem initiation 86 , was observed to be down regulated in all three cultivars upon pollination (both P24 and P48). This suggests that LFY might play a role in floral initiation rather than in fruit and seed set. In addition, the Ct value of the gene ZEP was undetermined due to its [...] in all three accessions irrespective of the conditions. These results from banana call into question the significance of LFY, ACSB2 and ZEP in natural parthenocarpy. Overall, our results emphasize that the genes AGL8, MADS16, RGA1, GID1C, IAA (GH3.8), EXPA1, HK2 and BAM1 could play a significant role in both induced and natural parthenocarpy. Based on the illustration of possible interactions associated with the candidate genes in the PA-PPI network together with their expression study (Fig. 3), we designed a putative pathway that is speculated to be the underlying parthenocarpy mechanism in Musa (Fig. 5).
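The up and down regulation discussed above is derived from qRT-PCR Ct values, but this section does not state how relative expression was calculated. As a hedged illustration only, the sketch below assumes the widely used 2^-ΔΔCt calculation against a reference gene and an un-pollinated calibrator sample; the function name, the choice of calibrator and all Ct values are hypothetical and are not taken from the study.

```python
def fold_change(ct_target, ct_reference, ct_target_cal, ct_reference_cal):
    """Relative expression by the 2^-ddCt calculation (assumed method).

    ct_target / ct_reference: Ct of the gene of interest and of the reference
    gene in the test sample (e.g. P24).
    ct_target_cal / ct_reference_cal: the same two Ct values in the calibrator
    sample (e.g. UnP).
    Returns None when any Ct is undetermined (no amplification detected).
    """
    values = (ct_target, ct_reference, ct_target_cal, ct_reference_cal)
    if any(v is None for v in values):
        return None
    delta_ct_sample = ct_target - ct_reference
    delta_ct_calibrator = ct_target_cal - ct_reference_cal
    return 2 ** -(delta_ct_sample - delta_ct_calibrator)


# Hypothetical Ct values, for illustration only.
up_regulated = fold_change(24.0, 18.0, 26.0, 18.0)    # target Ct drops by 2 cycles
down_regulated = fold_change(28.0, 18.0, 26.0, 18.0)  # target Ct rises by 2 cycles
undetermined = fold_change(None, 18.0, 26.0, 18.0)    # no Ct reported for the target

print(up_regulated, down_regulated, undetermined)
```

Under this calculation a fold change above 1 corresponds to up regulation relative to the calibrator and a value below 1 to down regulation, while an undetermined Ct yields no estimate at all.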

Materials And Methods
The approach used in this study for prioritizing key genes in parthenocarpy is summarized in Fig. 6 and described in the following sections.

Mining Of Genes Associated With The Trait Parthenocarpy
Genes associated with parthenocarpy in other crops were mined from databases such as Uniprot and KEGG and from sources such as Pubmed and Pubmed Central. This was achieved through manual text mining using the query words "parthenocarpy", "seedlessness", "parthenocarpy and genes", "parthenocarpy and transcription factors" and "parthenocarpy and Musa". In addition, highly enriched differentially expressed genes (DEGs) between seeded fruits and parthenocarpic fruits artificially induced through phyto-hormone or chemical spray, mutation or genetic transformation in various crops such as tomato 87 , eggplant 33 , apple 28 , pear 28 etc., were extracted from their respective transcriptome profiles [Supplementary Material 1].

Retrieval of orthologous sequences in Musa spp
The sequences corresponding to the mined parthenocarpic genes were downloaded in FASTA format either from Uniprot (https://www.uniprot.org/) or from the respective crop-specific genome or transcriptome databases, using the unique reference gene IDs cited in the literature. These sequences were then submitted to a BLAST search in the Banana Genome Hub (http://banana-genome-hub.southgreen.fr/) 88 in order to retrieve the corresponding orthologous sequences in Musa species with ≥ 70% sequence identity [Supplementary Material 1].
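The ≥ 70% identity filter described above can be illustrated with a small Python sketch. It assumes the BLAST hits have been exported in the standard 12-column tabular layout (query, subject, percent identity, alignment length, mismatches, gap opens, query start and end, subject start and end, e-value, bit score); whether the Banana Genome Hub search was exported in this form is an assumption, and the sequence identifiers and scores below are hypothetical. Keeping the highest bit score per query is likewise a choice made for this example.

```python
def best_hits(blast_tabular, min_identity=70.0):
    """Return the best-scoring hit per query among hits with >= min_identity.

    blast_tabular: text in 12-column tab-separated BLAST tabular layout.
    Result maps query id -> (subject id, percent identity, bit score).
    """
    best = {}
    for line in blast_tabular.strip().splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank and comment lines
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        identity, bit_score = float(fields[2]), float(fields[11])
        if identity < min_identity:
            continue
        if query not in best or bit_score > best[query][2]:
            best[query] = (subject, identity, bit_score)
    return best


# Hypothetical hits, for illustration only.
rows = [
    ("queryA", "musa_hit_1", "82.5", "300", "50", "2", "1", "300", "5", "304", "1e-80", "410"),
    ("queryA", "musa_hit_2", "71.0", "280", "80", "1", "1", "280", "9", "288", "1e-50", "250"),
    ("queryB", "musa_hit_3", "65.2", "250", "85", "2", "1", "250", "3", "252", "1e-30", "150"),
    ("queryC", "musa_hit_4", "70.0", "260", "78", "0", "1", "260", "1", "260", "1e-40", "200"),
]
blast_text = "\n".join("\t".join(row) for row in rows)

orthologs = best_hits(blast_text)
for query, (subject, identity, _) in sorted(orthologs.items()):
    print(query, subject, identity)
```

In this toy input, queryB is dropped because its only hit falls below the threshold, while queryC is retained because the cutoff is inclusive.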

Construction Of Protein-protein Interaction Network (PPI)
The retrieved Musa orthologous sequences were submitted to STRING v10.5 (https://string-db.org/), a pre-computed database for the exploration of PPI 89. Predicted protein associations with a combined score of > 0.4 were used to construct the PPI network in Cytoscape 3.7.1 90. Since the initial PPI network constructed from the STRING database had a limited number of nodes (proteins) and edges (interactions), we extended the search for possible interacting partners of the extracted genes using the Cytoscape plugin Agilent Literature Search 91.

Topological and cluster analysis of the network
The extended PPI network was treated as an undirected graph G = (V, E), in which proteins are the nodes (V) and interactions are the edges (E). To identify key proteins in the network, three topological properties were analysed: degree (k), betweenness centrality (BC) and closeness centrality (CC). These centrality measures were calculated using CytoHubba, a Cytoscape plugin (Chin et al. 2014), and nodes with high degree, BC and CC were taken as the important proteins related to parthenocarpy in the PPI network. Cluster analysis was performed using the Molecular Complex Detection (MCODE) plugin for Cytoscape, which provides a clustering algorithm to screen the modules of the PPI network for parthenocarpy (PA-PPI) (Bandettini et al. 2012). An MCODE score of > 3 and a node count of > 3 were set as cutoff criteria, with the default parameters (Degree cutoff ≥ 2, Node score cutoff ≥ 2, K-core ≥ 2 and Max depth = 100). Genes identified from the clusters and the top ten genes from the topological analysis were subjected to Gene Ontology (GO) and KEGG pathway enrichment analyses using BiNGO, ClueGO and BLAST2GO to expedite the functional annotation of each gene 92.
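To make the score filter and the three centrality measures concrete, the following is a minimal pure-Python sketch on a toy graph. It uses the textbook definitions (degree as neighbour count, closeness as (n - 1) divided by the sum of shortest-path distances, betweenness via Brandes' algorithm); CytoHubba's exact formulas and normalisation may differ. The node labels A to E and the edge scores are hypothetical and do not come from the PA-PPI network.

```python
from collections import deque

# Hypothetical scored edges (node_1, node_2, combined_score); not study data.
EXAMPLE_EDGES = [
    ("A", "B", 0.90), ("B", "C", 0.70), ("C", "D", 0.50),
    ("B", "D", 0.45), ("D", "E", 0.60), ("A", "E", 0.30),
]


def build_graph(edges, min_score=0.4):
    """Undirected adjacency map keeping only edges with score > min_score."""
    adj = {}
    for u, v, score in edges:
        adj.setdefault(u, set())
        adj.setdefault(v, set())
        if score > min_score:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def degree(adj):
    return {v: len(neighbours) for v, neighbours in adj.items()}


def closeness(adj):
    """(reachable nodes - 1) / sum of shortest-path distances, via BFS."""
    result = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total else 0.0
    return result


def betweenness(adj):
    """Unnormalised betweenness for an unweighted graph (Brandes' algorithm)."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        order, preds = [], {v: [] for v in adj}
        sigma = {v: 0 for v in adj}  # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):  # accumulate dependencies, farthest first
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted from both endpoints in an undirected graph.
    return {v: value / 2 for v, value in bc.items()}


if __name__ == "__main__":
    graph = build_graph(EXAMPLE_EDGES)
    k, cc, bc = degree(graph), closeness(graph), betweenness(graph)
    for node in sorted(graph, key=lambda n: (-k[n], -bc[n], -cc[n])):
        print(node, k[node], round(bc[node], 3), round(cc[node], 3))
```

In this toy graph the A-E edge (score 0.30) is dropped by the filter, and nodes B and D come out highest on all three measures, which is the kind of agreement across measures used above to shortlist hub proteins.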

Identification And Collection Of Plant Materials
The test samples were collected from the field Musa genebank of ICAR-National Research Centre for Banana (NRCB), Tiruchirapalli, Tamil Nadu, India, where more than 300 Indian accessions and 121 exotic accessions are maintained. The tissue-cultured propagules (AA genomic group) of the seeded accession Calcutta 4 (C4) and the parthenocarpic accessions Pisang Lilin (PL) and cv. Rose (CVR) were received from the International Transit Centre (ITC), Belgium, through ICAR-National Bureau of Plant Genetic Resources (NBPGR). Exotic collection (EC) numbers were assigned to the exotic introductions by ICAR-NBPGR, and the details of the three cultivars used in the current study are provided in Table 3.

Table 3 Accession number, genomic and parthenocarpic nature of the accessions used in the study.

Conclusion
In the present study, comprehensive network analysis allowed us to shortlist 12 candidate genes associated with parthenocarpy: HK2, EXPA1, BAM1 (LRR), DELLA (RGA1), GID1C, GH3.8, ZEP, ACLB2, LFY, MADS16, MADS29 and AGL8. By probing the interactions of the validated candidate genes, we hypothesized a hormone-mediated pathway involved in parthenocarpy. Negative regulation of the MADS-box TFs (AGL8 and MADS16) together with DELLA and GH3.8, and the pathway of GA-mediated, auxin-induced seedless fruit formation proposed in our previous work, are now supported by expression analysis in C4, CVR and PL in natural parthenocarpy (Musa spp.). Because interaction of GID1C with DELLA is a prerequisite for seed formation, the drastic reduction of GID1C together with moderate expression of DELLA might be the reason behind seed set in CVR upon pollination. The significance of the cytokinin-mediated CLV-WUSCHEL signaling pathway in parthenocarpy is revealed by the interaction of BAM1, IAA (GH3.8) and HK2 in the PA-PPI network, and the expression of these genes in Musa spp. affirmed their role in natural parthenocarpy. However, the expression of CLV1 needs to be validated for further confirmation. In addition, the variation in the expression of EXPA1, ACLB2, MADS29 and LFY between natural and artificial parthenocarpy emphasizes the need for further validation in other diploid, triploid and tetraploid accessions of different genomic combinations. Transcriptome analysis of seeded, poor seed-setting and non-seed-setting accessions would further clarify the mechanism of natural parthenocarpy. In a nutshell, we suggest that MADS16, AGL8, DELLA, GID1C, IAA (GH3.8), HK2, BAM1 and CLV1 could be possible target genes for converting seeded accessions to parthenocarpy.

Contributions
US and SB conceived and designed the work and provided guidance in manuscript preparation; SB provided thematic guidance on the layout of the work and on manuscript preparation; SR constructed the PPI network and identified candidate genes through in silico approaches; and SS validated the candidate genes using qRT-PCR. SR, SB and SS interpreted the results. SR, SB and SS wrote the manuscript with input from all authors.

Corresponding author
Correspondence to Uma Subbaraya

Ethics declarations

Conflict of interest
The authors declare that they have no conflict of interest.

Funding
This study was supported by the DBT-North East project entitled "Consortium for managing Indian banana genetic Resources -DBT-NER/AGRI/33/2016(Application Number-90)", India.