The assembled and refined transcriptome comprised 142,025 contigs (x̅ = 1,044 bp, N50 = 1,133 bp), of which 106,603 correspond to Trinity “genes” (unigenes). Contig length ranged from 501 bp to 36,186 bp. By comparison, the assembly of filtered reads in Actinidia eriantha [22] yielded 69,396 unigenes with lengths between 201 and 9,602 bp, so the present study recovered both a larger number of contigs and a longer maximum contig length.
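For readers less familiar with the N50 statistic quoted above, the following is a minimal sketch of how it is commonly computed from a list of contig lengths: the contigs are sorted from longest to shortest, and N50 is the length of the contig at which the running sum first reaches half of the total assembled bases. The function name and the toy lengths are illustrative assumptions and are not taken from the assembly described here.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (in bp)."""
    if not lengths:
        raise ValueError("at least one contig length is required")
    # Sort from longest to shortest contig.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that brings the running sum to half of the
        # total assembly size defines the N50.
        if running >= half_total:
            return length


# Hypothetical toy lengths, NOT the contig lengths of this study.
toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
toy_n50 = n50(toy_lengths)
toy_mean = sum(toy_lengths) / len(toy_lengths)
```

In this toy set the mean length is smaller than the N50, because N50 is weighted toward the longest contigs and is therefore less sensitive than the mean to a large number of short contigs.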
Regarding completeness, the CEGMA and BUSCO scores suggest that the assembled transcriptome contains a substantial number of ultra-conserved proteins, either complete or fragmented, with only a minor proportion missing, which indicates that the transcriptome reached a high standard of completeness. However, complete CEGMA and BUSCO scores above 95% have been reported for twelve plant genomes, including the model plant Arabidopsis thaliana and the fruit tree Pyrus communis L. var. ‘Bartlett’ [23].
In our assembly, the highest score was obtained with BUSCO, which recovered almost 70% of the genes in the dataset as complete. This value is lower than the scores reported above for plant genomes, but the construction of the de novo transcriptome from three different plant tissues may explain the difference. In any case, complete and fragmented genes together account for close to 90%, which is high for a de novo transcriptome. In addition, although it concerns a different species, a tissue-specific de novo transcriptome assembly of Ilex paraguariensis [24] recovered around 73% of complete genes, and close to 85% when fragmented genes are included, values similar to those of our assembly.
At present, high-throughput sequencing (HTS) is becoming increasingly accessible and user-friendly, and as a result more species are being sequenced for multiple purposes, including gene expression profiling, epigenomics, genomics, and transcriptomics [25].
Nevertheless, there are still species of economic relevance that lack these omics resources. For this reason, assembling a high-quality transcriptome is an important task when studying such species. However, assembling a high-quality de novo transcriptome depends mainly on the quality and quantity of the input data and on the software parameters, and even once a transcriptome has been assembled, determining how well it was assembled remains a challenge.
In the annotation against the TAIR10 protein database, only 35.6% of contigs were annotated, while the remaining 64.4% had no match. This low match rate could be explained by a lack of information in protein databases, which reflects limited knowledge of the Actinidia genus at the protein level. Another study in kiwifruit (Actinidia deliciosa var. ‘Jinkui’) [26] obtained 140,187 unigenes, of which 56,912 were functionally annotated, while [9] obtained 51,745 unigenes in Actinidia arguta, with 30,439 matches to known proteins. More recently, RNA-seq of different fruit tissues in Actinidia eriantha yielded 69,783 non-redundant unigenes, of which 21,730 were annotated against different protein databases [22]. Taken together, the sequencing results for different kiwifruit species, including A. deliciosa, A. arguta and A. eriantha, show important differences at the transcriptome level.
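To put the unigene counts cited above on the same scale as the 35.6% reported here, the short sketch below converts each pair of counts into an annotated percentage. The counts are those quoted in the text; the function and variable names are illustrative assumptions. Because the studies used different databases and annotation criteria, the resulting percentages are only a rough guide and not a like-for-like comparison.

```python
def annotated_percent(annotated, total):
    """Percentage of unigenes with an annotation, rounded to one decimal."""
    if total <= 0 or not 0 <= annotated <= total:
        raise ValueError("expected 0 <= annotated <= total and total > 0")
    return round(100 * annotated / total, 1)


# (annotated unigenes, total unigenes) as quoted in the text above.
reported = {
    "A. deliciosa 'Jinkui' [26]": (56_912, 140_187),
    "A. arguta [9]": (30_439, 51_745),
    "A. eriantha [22]": (21_730, 69_783),
}

rates = {
    study: annotated_percent(annotated, total)
    for study, (annotated, total) in reported.items()
}

for study, rate in rates.items():
    print(f"{study}: {rate}% annotated")
```

On these counts, the annotated fraction ranges from roughly 31% to 59% across the cited studies, so the value obtained here falls within the range reported for other Actinidia transcriptomes.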
Gene annotation and enrichment analysis with PANTHER yielded significant functional annotations for each tissue comparison: shoot vs leaf, flower bud vs flower, and fruit development. For the shoot vs leaf comparison, the top hits for biological process were metabolic process (GO:0008152) and cellular component organization (GO:0071840), while the top hit for molecular function was binding (GO:0005488). These processes may therefore be involved in the biosynthesis of constituent macromolecules and of plant cell components related to leaf development and growth. For flower bud vs flower and for fruit development (fruit7d vs 50d–120d–160d), the top hit for biological process was metabolic process (GO:0008152), while catalytic activity (GO:0003824) was the most significant molecular function in both comparisons, indicating an increase in chemical reactions linked to flowering and fruit development.
According to the GO term annotation with AgriGO v2, only eight GO terms were found for shoot vs leaf, which may be due to the lack of protein annotations for these tissues. In contrast, the flower bud vs flower and fruit development comparisons showed 119 and 56 GO terms, respectively. For flower bud vs flower, some significant protein IDs were related to anatomical structure development (p-value of 6.5e–27; GO:0048856), reproductive process (3.4e–12; GO:0022414), aromatic compound biosynthetic process (7.4e–07; GO:0019438) and response to abiotic stimulus (1.0e–07; GO:0009628). GO:0048856 is related to the progression of anatomical structures, such as from flower bud to mature flower, while the protein network involved in GO:0022414 contributes to the reproductive process, that is, the inheritance of genetic material from the parents, which takes place during flower development and has recently been related to cytoplasmic male sterility in soybean flower buds [27]. Moreover, aromatic compound biosynthetic process (GO:0019438) includes all the chemical reactions and pathways that result in the formation of aromatic compounds, which may occur during flower development; the aromatic composition of kiwifruit could therefore begin to form during flower development, as in Eucalyptus grandis floral tissues [28]. Finally, the protein network related to response to abiotic stimulus (GO:0009628) is linked to flower development, which may be conditioned by abiotic stresses, as recently reported in flower buds of transgenic blueberry [29].
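The p-values quoted above come from the enrichment tool; the text does not state which statistical test was selected. As an illustration only, the sketch below shows the one-sided hypergeometric test, a common choice for GO term enrichment, which asks how likely it is to see at least k genes carrying a given GO term in a list of n genes drawn from a background of N genes, K of which carry the term. The function name, parameter names and toy numbers are assumptions, and no multiple-testing correction is applied.

```python
from math import comb


def hypergeom_enrichment_p(k, n, K, N):
    """One-sided hypergeometric p-value P(X >= k).

    N: genes in the background, K: background genes with the GO term,
    n: genes in the query list, k: query genes with the GO term.
    """
    if not (0 <= K <= N and 0 <= n <= N and 0 <= k <= min(n, K)):
        raise ValueError("inconsistent counts")
    total = comb(N, n)  # number of ways to draw the query list
    # Sum the probabilities of observing k, k+1, ... annotated genes.
    tail = sum(
        comb(K, i) * comb(N - K, n - i)
        for i in range(k, min(n, K) + 1)
    )
    return tail / total


# Hypothetical toy numbers, NOT taken from this study:
# 20 background genes, 5 with the term, 5 query genes, 3 with the term.
toy_p = hypergeom_enrichment_p(k=3, n=5, K=5, N=20)
```

In practice a p-value of this kind is computed for every GO term tested and then adjusted for multiple comparisons, so raw values from this sketch are not directly comparable to those reported by a dedicated tool.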
For fruit development (fruit7d vs 50d–120d–160d), some protein IDs were associated with protein metabolic process (4.1e–03; GO:0019538), DNA polymerase activity (3.9e–10; GO:0034061), transferase activity (6.1e–06; GO:0016740) and catalytic activity (1.6e–03; GO:0003824). In other plant species, such as cucumber [30], some of the main proteins linked to fruit development are involved in protein metabolism (GO:0019538). In addition, some of the most important proteins involved in the S2 fruit development stage in peach (cell enlargement) have transferase and catalytic activities [31].
Therefore, increased catalysis and enzymatic activity appear to be associated with fruit development (GO:0034061, GO:0016740 and GO:0003824). A similar approach was applied to a de novo assembly of the Persea americana cv. ‘Hass’ (avocado) transcriptome during fruit development, where proteins related to the oily characteristics of the fruit were predominant [32].
Finally, the lack of protein annotation information means that more data are needed to gain deeper insight into the processes underlying the vegetative growth of leaves and the reproductive development of the flower and fruit. Nevertheless, the results of the gene enrichment analysis suggest certain functional patterns that are characteristic of each tissue.