De novo sequencing and analysis of the transcriptome of two highbush blueberry (V. corymbosum L.) cultivars ‘Bluecrop’ and ‘Legacy’ at harvest and following post-harvest storage

Background: Fruit rmness and in particular the individual components of texture and moisture loss, are considered the key quality traits when describing blueberry fruit quality, and whilst these traits are genetically regulated, the mechanisms governing their control are not clearly understood. In this investigation, RNAseq was performed on fruits of two blueberry cultivars with very different storage properties, ‘Bluecrop’ and ‘Legacy’, at harvest, three weeks storage in a non-modied environment at 4 o C and after three weeks storage at 4 o C followed by three days at 21 o C, with the aim of understanding the transcriptional changes that occur during storage in cultivars with very different post-harvest fruit quality. Results: De novo assemblies of the transcriptomes of the two cultivars were performed separately and a total of 39,335 and 41,896 unigenes for ‘Bluecrop’ and ‘Legacy’ respectively were resolved. Differential gene expression analyses were grouped into four cluster proles based on changes in transcript abundance between harvest and 24 days post-harvest. A total of 264 unigenes were up-regulated in ‘Legacy’ and down-regulated in ‘Bluecrop’, 103 were down-regulated in ‘Legacy’ and up-regulated in ‘Bluecrop’, 43 were up-regulated in both cultivars and 355 were down-regulated in both cultivars between harvest and 24 days post-harvest. Unigenes showing signicant differential expression between harvest and following post-harvest cold-storage were grouped into classes of biological processes including stress responses, cell wall metabolism, wax metabolism, calcium metabolism, cellular components, and biological processes. Conclusions: In total 21 differentially expressed unigenes with a putative role in regulating the response to post-harvest cold-storage in the two cultivars were identied from the de novo transcriptome assemblies performed. The results presented provide a stable foundation from which to perform further analyses with which to functionally validate the candidate genes identied, and to begin to understand the genetic mechanisms controlling changes in rmness in blueberry fruits post-harvest.

Fruit texture is the most important quality trait from a consumer perspective [7,8]. Retention of rmness is widely regarded as the main attribute required for long term storage, and as such it ultimately determines the economic success of blueberry varieties in the marketplace [7,9]. Softening during storage compromises the nal eating quality of the fruit, resulting in fewer repeat purchases by consumers and subsequently reduced sustainability for the blueberry industry. Changes in fruit rmness during post-harvest storage are in uenced by many different factors including water loss, the effect of temperature and relative humidity, variation in cell turgor, cell wall composition, the effects of high CO 2 and low O 2 , stem scar size, microstructural changes in cell and tissue layers within the fruit, and skin rmness; all of which have been reported to be genetically regulated [10][11][12][13][14].
Among these factors, water loss, which is primarily in uenced by temperature and humidity, has been reported as the main driver of fruit softening [11,13]. It has been shown that moisture losses below 1.34% in blueberry fruits increased fruit rmness during post-harvest storage [13], whilst signi cant reduction in rmness was observed in fruit where moisture loss was between 4% and 15% [10,13]. Other factors shown to have a signi cant impact on fruit softening include cell organization which impacts texture through changes in cell to cell contact, differences in cell types and size, and thinning or degradation of epidermal layers [11,15,16]. Softening during post-harvest cold storage has also been shown to be a consequence of changes in cell wall constituents [17] where a reduction in rmness, or increased softening, was positively correlated with an increase of water soluble pectins and weight loss, and a decrease in cellulose and hemicellulose. Giongo et al., 2013 [7] characterized the texture of a range of blueberry cultivars at different developmental and post-harvest stages and identi ed a number of different mechanical pro les that described the variation in fruit texture. These pro les included measurements of fruit rmness, and a storage index which was used to describe the potential storage performance of the different cultivars. Positive storage index values characterised cultivars that retained good texture during storage, whilst negative values indicated a loss of texture quality during storage.
Owing in part to the tetraploid nature of its genome, genomics resources for cultivated blueberry have not been developed as quickly as in other fruit species such as peach [18] and grapevine [19]. The genome size of the diploid blueberry has been estimated to be in the region of 600 Mb [20] and a genome sequence consisting of 13,757 contigs assembled from publicly-available sequence data for diploid blueberry has been described [21], but was not made available for public scrutiny. More recently however, the rst chromosome-scale genome assembly of the cultivated tetraploid highbush blueberry was released [22] consisting of 48 pseudomolecules containing 1.68 Gb of assembled sequence, and an average of 32,140 protein coding genes per haplotype (128,559 total). The availability of this resource will facilitate further structural and functional genomics studies in cultivated blueberry.
Annotation of the unreleased draft blueberry genome assembly permitted an RNAseq analysis and identi cation of gene predictions putatively involved in fruit ripening and the biosynthesis of bioactive compounds [21]. Other RNAseq studies in blueberry have characterised changes in gene expression in various tissues under a range of environmental conditions and stresses. These include a study of blueberry root tissue samples with differing tolerance to soil pH changes [23], the identi cation of genes regulating the anthocyanin content of blueberry fruit during ripening [24], and genes associated with changes in fruit rmness during ripening and harvest [25]. Recently, Zhang et al., 2020 [26] reported the analysis of the transcriptome of the blueberry cultivar 'Duke' in response to cold-stress. The authors performed RNAseq experiments on fruit of 'Duke' at harvest and following 30 days storage at 0 degrees C, as well as several physiological assessments. Of the 35,060 unigenes recovered, 1,167 genes were upregulated following cold-storage, whilst 685 were down-regulated and a range of genes with Log2FC above +/-2.0 were annotated, including genes associated with plant hormone transduction, carotenoid biosynthesis, glutathione metabolism, starch and sucrose metabolism, protein processing and porphyrin and chlorophyll metabolism.
In this study RNAseq was used to determine gene expression pro les in fruits of two blueberry cultivars that exhibit different textural change pro les during post-harvest storage [7]; 'Legacy', which retains a rm texture and 'Bluecrop', which becomes soft during storage. Samples were pro led at harvest, following cold storage for 21 days and after 21 days cold storage [4 °C] followed by 3 days storage at room temperature [21 °C]. In order to maximise transcript capture for expression analysis, separate de novo unigene assemblies were performed for cvs. 'Bluecrop' and 'Legacy' and RNAseq reads obtained were mapped to each unigene set independently to identify genes differentially expressed during storage in each of the two cultivars. Comparisons were then performed between the datasets of the two cultivars to identify differentially-expressed genes common between the samples, and those that were unique to the two cultivars. A set of differentially-expressed genes with potential importance to post-harvest storage in blueberry were identi ed and these are discussed in relation to their roles in the retention of fruit rmness and to previous studies.

RNA sequencing and assembly
More than 240 million paired-end reads were generated each for both 'Bluecrop' and 'Legacy' using Illumina short-read sequencing. After quality control and trimming, approximately 230 million high quality reads remained for each cultivar (detailed in Additional File 1). Following de novo assembly using Bridger and redundancy reduction with CD-hit, 634,177 'Bluecrop' and 686,520 'Legacy' unigenes were resolved. A total of 233,740 and 249,618 protein coding sequences were identi ed for 'Bluecrop' and 'Legacy' respectively from the unigene set using GeneMarkS (GMST) and following Blastp analysis, a nal unigene set containing 37,711 'Bluecrop' and 37,093 'Legacy' unigenes was obtained (Table 1)

PCA analysis
A PCA analysis was performed which revealed that the maximum variability between samples was explained by tissue (berry vs. leaf) and cultivar differences (Fig. 1). Data from 'harvest' and '24 days postharvest' tissues clustered more closely to each other than either did to '21 days post-harvest', indicating that the 4℃ temperature at which fruits were sampled after 21 days had a greater effect on gene expression than the post-harvest period itself, and as such, the '21 days post-harvest' samples were excluded from further analysis and comparisons were made between the 'harvest' and '24 days postharvest' datasets only. genes associated with 'cell', 'cell part', 'intracellular part', 'organelle' and cellular processes for both cultivars (Fig. 2). The GO assignments were used to classify the functions of 'Bluecrop' and 'Legacy' transcripts into three categories, cellular component, molecular function and biological process. A few gene families with extremely high read counts were observed encoding repeat domain proteins associated with stress processes in plants. Pentatricopeptide, LRR (receptor like kinases containing extracellular leucine rich repeat motifs), and WD proteins in particular were all overrepresented.

Sequence annotation
Comparison with tetraploid blueberry transcriptome A total of 38,937 and 36,415 Blastn alignments were returned for the 'Legacy' and 'Bluecrop' unigene sets when they were queried against the tetraploid V. corymbosum 'Draper' [22] gene prediction set. Alignments were classi ed by their e-values and their associated alignment length ratios. Almost 80% of the alignments returned an e-value of 0 and an alignment length ratio greater than 0.9 for both cultivars (Additional le 2) indicating that the 'Bluecrop' and 'Legacy' unigene sets were representative of the gene complement in the reference sequence.
Analysis of differentially expressed genes  between harvest and 24-days post-harvest, unigene_22238 was down-regulated in 'Bluecrop' and upregulated in 'Legacy' between harvest and and 24-days post-harvest, unigene_20943 was upregulated in both 'Bluecrop' and 'Legacy' between harvest and and 24-days post-harvest, and unigene_3134 was down-regulated in both 'Bluecrop' and 'Legacy' between harvest and and 24-days post-harvest ( Table 2). Patterns of gene expression following analysis by qRT-PCR were in general agreement with the results from the RNAseq analysis and a linear regression analysis calculating Pearson's correlation factor revealed a strong correlation (R 2 = 0.98; p ≤ 0.001) between the log fold-change values observed in the RNAseq data and those observed in the qRT-PCR data (Fig. 7).

Discussion
Here we present analyses of gene expression changes that occurred in the fruit of two blueberry cultivars, 'Bluecrop' (known for poor rmness retention) and 'Legacy' (known for good rmness retention) during post-harvest storage. Whilst a genome reference sequence for tetraploid blueberry cultivar 'Draper' recently became available [22], which provides a reliable genomic resource for transcriptomic analysis, separate de novo transcriptome assemblies were performed here for 'Bluecrop' and 'Legacy' to avoid multiple-and/or mis-alignments within unigenes produced from a hybrid or reference assembly. Overall, the number of transcripts the 'Legacy' and 'Bluecrop' unigene sets resolved was 41,896 and 39,335 for respectively, comparable to those of the published genome that contained an average of 32,140 gene predictions per haplotypes, and the majority of the assembled transcripts returned highly similar, or identical matches to predicted genes within the 'Draper' reference genome [22].
The data presented here revealed that cell, cell part, intracellular, membrane-bound organelle part, cellular and metabolic processes, and catalytic activity were the most represented GO categories within the assembled unigene set. These categories were strongly represented in other studies of 'Bluecrop' [27,28] as well as in studies of the cultivars 'Northland' [24], 'O'Neal' [21] and 'Duke' [26]. The vast majority of factors affecting post-harvest quality are under strong genetic control [29] and 'Bluecrop' and 'Legacy' previously displayed signi cant differences in their response to prolonged periods of low-temperature post-harvest storage [7], resulting in very different post-storage texture pro les of the two cultivars. A recent study of Zhang et al., 2020 [26] used the cultivar 'Duke' which previously displayed texture characteristics very similar to 'Legacy' [7] at harvest, but which displayed a signi cant decline in texture during post-harvest storage. The analysis of differentially expressed genes in 'Duke' identi ed 1,167 upregulated, and 685 down-regulated genes following 30 days storage at 0 degrees C, which contrasted to the 2,370 down-regulated and 937 up-regulated, and 1,379 down-regulated and 542 up-regulated genes in 'Bluecrop' and 'Legacy' respectively following 21 days storage at 4 °C and three days at 18 °C in this study.
Here, a greater number of down and up regulated genes was observed in 'Bluecrop' than 'Legacy', suggesting 'Bluecrop' fruit is less well adapted to the physiological stresses associated with prolonged storage at 4 °C and subsequent shelf life (21 °C) than 'Legacy', as re ected in a greater decline in fruit quality and texture attributes observed in 'Bluecrop' over 'Legacy' [7]. Numerous DEGs associated within diverse cell, molecular, and biological pathways were identi ed as candidate unigenes with a potential role in rmness changes during storage in this study. Most of the DEGs observed between harvest and post-harvest in 'Bluecrop' and in 'Legacy' were predominantly categorized into amino acid metabolism, carbohydrate metabolism and cofactor and vitamins metabolism, followed by energy metabolism in 'Bluecrop' and by nucleotide metabolism in 'Legacy'. Furthermore, most of the DEGs identi ed between harvest and post-harvest between both cultivars were putatively involved in cell wall metabolism, composition of the skin wax layer, adaptation to abiotic stress and solute transport. Zhang et al., 2020 [26] found membrane lipid metabolism, proline, glutathione and avonoid metabolism, and hormone biosynthesis and signal transduction as the main pathways in which genes were differentially expressed in 'Duke', during storage at 0℃. Whilst many of the categories of DEGs observed were similar between the two studies, the differences observed in gene expression pro les between those reported here for 'Bluecrop' and 'Legacy', and those reported for 'Duke' most probably re ect the differences in the temperature storage protocols employed and most importantly differences in the point of sampling between the two studies. The post-harvest sampling conditions of [26] more closely resembled the sampling point in this investigation immediately following post-harvest storage when fruit was still at 4 °C. Zhang et al., 2020 [26], used a temperature of 0ºC, and whilst that has also been used in other blueberry studies to compare the fruit quality including rmness at different temperatures and days postharvest [10,11], it is not representative of temperatures in retail or home refrigeration settings.
Scrutiny of the differentially expressed transcripts in 'Bluecrop' and 'Legacy' was focussed on genes that could have played a putative role in the stress response during post-harvest cold-storage, and genes that could play a role in the retention of texture and fruit quality. Previous studies have shown that genes involved in determining the composition of the wax layer [bloom] in blueberry, as well as those involved in cell wall metabolism, adaptation to biotic or abiotic stress, calcium and solute transport could all play a role in maintaining good fruit quality during post-harvest storage, and from these classes, a set of 21 differentially-expressed candidate genes were characterised.

Epicuticular wax metabolism
The waxy layer representing the bloom of blueberries is a key protective mechanism against a range of abiotic stresses, including moisture loss and temperature uctuations, and has been shown to reduce deterioration in fruit quality during post-harvest storage [31]. Variations in the composition of the wax compounds in plant cuticles can affect their mechanical properties under storage [35][36][37] and these variations have been correlated with textural changes in blueberry [38], pepper and tomato [39][40][41] fruits during storage. Unigene 8556 shown to have high homology to wax ester synthase/diacylglycerol acyltransferase (WSD1), an enzyme which catalyses the synthesis of wax ester compounds in the stem, owers and leaves of Arabidopsis [42]. Acyltransferase genes have been reported to be strongly upregulated in Arabidopsis in response to abiotic/biotic stress [43] and in Euruca vesicaria seedlings in response to drought stress [44]. 'Bluecrop' softens more rapidly during storage than 'Legacy' [7], indicating a greater rate of moisture loss in 'Bluecrop' fruits during storage [12]. Here, unigene 8556 was highly down-regulated in 'Legacy', but up-regulated in 'Bluecrop', suggesting fruit of 'Bluecrop' may be triggering stress response pathways and increasing wax production in response to moisture loss during extended exposure to low temperatures and reduced % relative humidity experienced in post-harvest storage.
Cell wall metabolism under post-harvest storage Changes in cell wall structure during development and ripening occur due to degradation of cell wall constituents primarily through the disassembly of the cellulose-hemicellulose network [7,[45][46][47]. The cellulose and hemicellulose matrix comprise micro brils linked by hydrogen bonds, that increase the strength of the cell wall. In addition, a pectin matrix interlaces the cellulose-hemicellulose backbone conferring cell wall adhesion [25]. Thus, cell wall modi cations and disassembly form part of the regulation of fruit softening, which is a process that has been well characterised in blueberries [17,32,48], apple [49], peach [50], strawberries [51,52] and tomatoes [53,54].
Glycoside hydrolases have been reported to affect blueberry rmness during ripening and post-harvest stages [25,26] and their enzymatic activity depends on their speci city to individual carbohydrate components forming the cellulose (homopolymer of glucose) and hemicellulose (xylans, glucans, xyloglucans, callose, mannans and glucomannans) structures [14,55,56]. In this investigation, homologues of three glycoside hydrolases; GH17 [Unigene 3134), GH3 (Unigene 6494) and GH28 (Unigene 8703) were shown to be differentially-expressed during post-harvest cold-storage. The GH17 family encodes endoglucanases which degrade cellulose structures (56) and play a role in cell wall degradation in blueberry fruit and banana during ripening stages (25,57). Down-regulation of unigene 3134 (GH17) in 'Bluecrop' and 'Legacy' suggests a reduction in activity of this glycoside hydrolase during cold storage, supporting the ndings of [25,46] who reported reduced glycoside hydrolase activity in blueberry fruits during cold storage. Reduction in activity may be purely temperature related or possibly an adaptation mechanism triggered in both cultivars in response to post-harvest cold-storage [25,46]. GH3 is a xylosidase enzyme [56] with a role in degrading cell walls and contributing to softening in blueberry fruits [14,48], whilst GH28 showed high homology to a polygalacturonase enzyme which degrades pectin by hydrolysing the homogalacturonan backbone of the cell wall [58]. This family of enzymes has been shown to be upregulated during post-harvest storage and to a play a role in the softening of blueberry fruits and cell wall degradation [32]. Both GH3 (Unigene 6494) and GH28 (unigene 8703) were highly upregulated in 'Bluecrop' and down regulated in 'Legacy' during cold storage and thus may play a role in faster cell wall degradation in 'Bluecrop', leading to a greater degree of softening.
The role of expansin proteins in cell wall loosening has been well characterised in Arabidopsis [59], tomato [60,61], strawberry [62][63][64], peach [65] and kiwi fruits [66] where its activity is upregulated during the softening of ripening fruits [67]. Expansins initiate cell wall loosening and extension through the breakage of hydrogen bonds between cellulose and hemicellulose molecules, in particular xyloglucans [68,69], and their expression is regulated by cross-talk between many plant growth regulators including abscisic acid, indol-3-acetic acid, auxins, brassinosteroids, cytokines, ethylene [67]. Expansin activity has been shown to be enhanced by low pH, and low temperatures in an absence of an ethylene peak during storage [66]. Unigene 21016 was shown to be highly upregulated in 'Bluecrop' and down regulated in 'Legacy' during post-harvest storage, suggesting that expansins expression in 'Bluecrop' may contribute to the increased fruit softening observed by Giongo et a., 2013 [7], as has been reported in kiwi [66] and in tomato [61], where overexpression of expansins produced softer fruit and silencing was correlated to an increase of rmness and extended shelf life.

Abiotic stress and solute transport
Adaptation to abiotic stress in plants include the control of cell turgor, the induction of cell signalling pathways, an increase in respiration rates and deployment of protective mechanisms against Reactive Oxygen Species (ROS). The APETALA2/ETHYLENE RESPONSE FACTOR (AP2/ERF) is one of the mediators to plant external abiotic responses and developmental processes [70][71][72][73], and has been reported to regulate cold adaptation responses in the ower buds of 'Bluecrop' [74], and in cold-stored blueberry fruit of 'Duke' [26]. Unigene 7737 and Unigene 13947 display high homology to AP2/ERF and were found in differentially regulated clusters in this investigation, suggesting that members of this family of transcription factors may be repressed or induced under the same environmental conditions. AP2/ERF homologs in potato (CIP353) conferred low-temperature acclimation to tubers exposed to longterm cold-storage [72], whilst in tomato the overexpression of PTi4 encoding an AP2/ERF induced a family of expansin genes linked to roles in cell wall integrity [75]. AP2/ERF genes have also been reported to regulate the adaptation to osmotic, water stress and drought tolerance in Arabidopsis [76], and osmotic differences between protoplast and the apoplast have been shown to drive changes in turgor pressure in plant cells [77], changes which are linked to post-harvest softening in apple [78] and softening in blueberries [10,12,15]. 'Legacy' exhibits a greater retention of rmness than 'Bluecrop' during postharvest cold-storage [7] and unigene 13947 was upregulated in 'Legacy', and down-regulated in 'Bluecrop' suggesting that its expression may play a role in the maintenance of rmness in blueberry fruits during post-harvest cold-storage.
The mitochondria play an important function in buffering cytoplasmic calcium, with mitochondrial calcium uniporters (MCU) acting as calcium sensors and active channel protein for calcium uptake [30,79,80]. Mitochondrial uptake plays a role in ATP synthesis and regulating calcium concentration in the cytoplasm [Ca] cyt which is essential for regulating its role as a secondary messenger in response to abiotic stress. Moreover, [Ca] cyt plays a role in regulating fruit quality, speci cally by increasing the resilience of fruit to low temperature stress by enhancing the energy status of the cells and maintaining osmotic conditions [81]. Unigene 13589 displayed homology to mitochondrial calcium uniporters [MCU] and was down regulated during post-harvest storage for both cultivars, which may cause loss of mitochondrial function and reduced cellular calcium uptake during post-harvest cold-storage, contributing to fruit quality deterioration during storage.

Conclusions
Transcriptional changes during post-harvest storage in 'Bluecrop' and in 'Legacy' occur in genes with roles in catalytic activity primarily located in the cell, intracellular parts and membrane-bounded organelles. Additionally, differential expression was observed in genes involved in cell wall metabolism, synthesis of wax compounds and biotic and abiotic stress between harvest and post-harvest in the two cultivars studied. Identi cation of a set of differentially regulated candidate genes that may have a role in the observed differences in post-harvest fruit quality reported between 'Bluecrop' and 'Legacy' provides a strong foundation for future focussed studies of the role of these genes in relation to speci c physiological changes that occur during post-harvest cold-storage in blueberry fruits.

Plant material
Three plants each of two Northern highbush blueberry (V. corymbosum) cvs. 'Bluecrop' and 'Legacy', were selected for study as they displayed very different textural changes during post-harvest storage [7]. The plants were grown in substrate under tunnels on the Driscoll's test-plot at East Malling, UK and eight-yearold plants with a recorded crop load of more than six kg per year were used for study. Fruits were harvested by hand from both cultivars on the same date and stored at 4 ºC in a non-modi ed environment for 21 days following which they were removed from cold storage and maintained for a further three days at 21 °C. Berry tissue was collected immediately following harvest; at 21 days postharvest, immediately following removal from 4ºC cold-storage, and at 24 days post-harvest, following three days at 21ºC. Sampled berries were snap frozen in liquid nitrogen and stored at -80 ºC until use. Three biological replicates were sampled, each of which comprised a random selection of four berries from each sampling point for each cultivar. In addition, one sample of leaf tissue from each cultivar was collected comprising a random selection of four immature unexpanded leaves, snap frozen in liquid nitrogen and stored at -80 ºC until use.

RNA preparation, library construction, and RNA sequencing
Each berry and leaf sample was ground to a ne powder under liquid nitrogen using a mortar and pestle and the powder was collected in 2 ml sampling tubes (Eppendorf®) which were then stored at -80 ºC until RNA extraction was performed. Total RNA was extracted using the Spectrum™ Plant Total RNA kit (Sigma Aldrich) according to protocol A in the manufacturer's instructions. The concentration and purity of the resultant RNA was measured using a QIAxpert spectrophotometer (Qiagen). The integrity of the RNA was determined using a qubit 4.0 uorimeter (Thermo sher Scienti c) and samples with RNA integrity number (RIN) values above 6.3 were submitted for RNA-SEq. Library preparation was performed for the 18 fruit samples and two leaf samples using the NEB Next® ultra RNA library prep kit (Biolabs, Inc.) and 150-bp paired-end sequencing was performed by NovoGene Inc (China) using the HiSeq2500 platform to yield a minimum of 5.9 GB of data per sample.

De novo transcriptome assembly and functional annotation
Raw Illumina reads were cleaned and ltered using Trimmomatic version 0.36 (82), trimming sequence with an average quality below 15 in a four bp sliding window. Reads shorter than 36 bp in length following trimming were removed from further analysis. Read QC was performed using Fast QC version 0.11.7 and de novo assembly of transcripts was done separately for each cultivar using Bridger version 2014-12-01 [83] with default parameters. CD-HIT [84] was then used with a threshold = 0.99 on 'Bluecrop' and 'Legacy' transcripts separately to reduce redundancy, and GeneMarkS [85] (GMST) was used to predict protein coding regions of each of the transcript sets using default parameters. Align-back of the original reads to the nal unigenes revealed 33% of the raw reads that did not map to the unigene set. These reads were mapped back to the unigenes that were removed from the dataset following the analysis with GMST and unigenes to which more than 50 reads mapped were replaced in the nal unigene sets.

Analysis of Unigene expression
Trimmed reads of each sample and treatment were aligned to the 'Bluecrop' and 'Legacy' de novo transcriptome assemblies using Bowtie2 (version 2.3.4) [87] and the read counts for each sample were calculated using featureCounts (1.6.0) [88]. Differential expression analysis was carried out using the Limma empirical Bayes analysis pipeline [89] and voom [90], which estimates the mean variance trend of the log counts to predict the variance and to generate a precision weight to be incorporated in the linear model. Principal components analysis was performed using the voom-normalized data of the putative homologous genes from the 'Bluecrop' and 'Legacy' datasets identi ed using the reciprocal best blast hit (rbbh) method. Figures were plotted using the matplotlib and seaborn Python packages [91,92] and VennDiagram package from R development Core Team [93]. Unigene annotation was performed using Interproscan, Blastp against the Uniprot Viridiplantae uniref 100 database (using a cut-off of e-value > 1*10^-5), and assignment of gene ontology (GO) terms was performed with EggNOG (94). The distribution of GO functional classi cations for the Unigenes was plotted with WEGO 2.0 using default parameters [95].
A double cutoff on both p-value and fold change was used to select differentially expressed genes and a p-value < 0.001 and a minimum logFC > 1 were used to classify genes as differentially expressed. Three differential expression analyses were performed; one within each cultivar, and one between the 'Bluecrop' and 'Legacy' cultivars. The comparison performed between the 'Bluecrop' and 'Legacy' cultivars was carried out using the results of the rbbh analysis, with the 27,919 genes shared between the two cultivars. The signi cantly up-and down-regulated genes identi ed between 'Bluecrop' and 'Legacy' were grouped into four clusters using matplotlib in Python [91] representing genes up-regulated between harvest and post-harvest for both cultivars, genes down-regulated between harvest and post-harvest for both cultivars, genes up-regulated in 'Legacy' and down-regulated in 'Bluecrop' between harvest and post-harvest, and nally genes down-regulated in 'Legacy' and up-regulated in 'Bluecrop' between harvest and post-harvest.
The functional annotations assigned to the differentially expressed unigenes were then scrutinised to identify differentially-expressed candidate unigenes with a potential role in regulating textural changes during post-harvest storage. Additionally, the Kyoto Encyclopedia of Genes and Genomes (KEGG) database (https://www.genome.jp/kegg/tool/map_pathway.html) was used to search and reconstruct the pathways of the differentially expressed genes from harvest to post-harvest in 'Bluecrop' and in 'Legacy'. The histogram of KEGG pathways was plotted using Excel.
The normalized expression levels of the candidate unigenes at harvest and post-harvest were used to plot a heat map with the heatmap3 package in R development Core Team [93].

Comparison of unigenes with tetraploid blueberry gene predictions
The 'Legacy' and 'Bluecrop' assembled unigenes were compared to the nucleotide sequence of the published tetraploid blueberry genome [22] with Blastn (BLAST + 2.7.1) using default parameters.
Quantitative real-time PCR validation of RNAseq data cDNA was synthesised using the Omniscript reverse transcription kit (Qiagen, 205111) according to the manufacturer's recommendations with 500 ng of high-quality RNA used in 20 µl reactions. In total, four candidate genes with signi cant levels of differential expression in the RNAseq analysis and with relevant functions related to rmness changes during storage for both cultivars were selected for qRT-PCR. These were: unigene_8556 (O-Acytransferase; down regulated in 'Legacy' and up regulated in 'Bluecrop'), unigene_22338 (Cytochome P450; up regulated in 'Legacy' and down regulated in 'Bluecrop'), unigene_20943 (Phloem protein like 2; up regulated in 'Legacy and in 'Bluecrop') and unigene_3134 (Glycoside hydrolase 17; down regulated in 'Legacy' and in 'Bluecrop'). In addition to this, eight stably expressed blueberry genes [96,97] were selected for reference gene validation. These included: TIP41-like protein (TIP41); ubiquitin-conjugate enzyme (PEX4); hypothetical (Vc4g26410); unknown protein (Vc4g16320); glyceraldehyde 3-phosphate dehydrogenase (GAPDH); ubiquitin conjugating enzyme (UBC28); pentatricopeptide repeat-containing protein (PPR); and RNA helicase-like (RH8), which were tested in order to validate their expression stability under the conditions in this experiment. Validation of gene expression stability was carried out using the GeNorm software platform (https://genorm.cmgg.be/) [98,99] and UBC28 and RH8, were selected as the most stable and were used as references for gene expression analyses.
Real time qRT-PCR primers ( Table 3) for candidate genes were designed using the Primer3 web tool (http://primer3.ut.ee/), checked for primer dimer formation and secondary structures by OligoEvaluator™ tool and veri ed in silico using the BLAST function of the Genome Database for Vaccinium spp (https://www.vaccinium.org/crop/blueberry) against the 'Bluecrop' and 'Legacy' unigene sets developed in this investigation and the Vaccinium predicted gene set (22). All qRT-PCR reactions were performed on a Bioer LineGene 9600 qRT-PCR system (Alpha laboratories, UK) using SYBR green as a detector dye. The  Availability of data and materials The datasets generated and analysed during the current study are available in the SRA repository under project number XXXXXXXX

Competing interests
The authors declare that they have no competing interests Funding MCC acknowledges a PhD grant from University of Greenwich Vice Chancellor Scholarship (VCS-ES-08-17) and also acknowledges funding from Driscoll's Genetics Limited which covered the costs associated with eld plots and molecular biology consumables in this investigation.
Authors' contributions MCC carried out experimental work, data analysis and data interpretation, and authored the manuscript; DJS conceived the study, interpreted data and authored the manuscript; NŠ carried out experimental work, interpreted data and co-authored the manuscript; RJC conceived the study, interpreted data and authored the manuscript; MM conceived the study, analysed and interpreted data, and co-authored the manuscript. All authors read and approved the nal manuscript.    KEGG pathway annotation of DEGs in (a) 'Bluecrop' from harvest to 24 dph, and (b) 'Legacy' from harvest to 24 dph. The x-axis shows the number of unigenes annotated and the y-axis details the KEGG pathway category. A) Cellular processes, B) Environmental information processing, C) Genetic information, D) Metabolism and E) Organismal systems.