Plant collection
Samples of seven Cestrum species (C. bracteatum Link & Otto {formerly named C. amictum}, C. corymbosum Schltdl., C. intermedium Sendtn., C. axillare Vell. {formerly named C. laevigatum}, C. mariquitense Kunth, C. nocturnum L., and C. strigilatum) were collected from natural environments in the South and Southeast regions of Brazil (São Paulo and Paraná states) and cultivated in the greenhouse of the Laboratory of Cytogenetics and Plant Diversity, State University of Londrina, Paraná, Brazil (Online resource 1). The collection consisted of at least three individuals of each species, and the vouchers were deposited in the FUEL herbarium at the State University of Londrina.
DNA extraction, genome sequencing and assembly
High-molecular-weight DNA (fragments of about 40 kb) of C. strigilatum was isolated from young leaves using the Nuclear Isolation Buffer (NIB) method [17]. The sample was used for low-coverage sequencing on an Illumina HiSeq2000 system (Novogene Company). The input file, containing 9,227,692 reads of 150 bp and corresponding to ~0.1× coverage, was assembled using the RepeatExplorer pipeline with default settings [18, 19] (https://repeatexplorer-elixir.cerit-sc.cz). Genome sequencing data of C. elegans were retrieved from the NCBI public database (SRX1951472) and corresponded to ~0.3× coverage. Datasets were filtered by quality, requiring 90% of bases to be equal to or above the cut-off value of 10. To expand the search for repetitive DNA sequences, the input files containing 9,227,692 reads of 150 bp (C. strigilatum) and 22,134,326 reads of 197 bp (C. elegans) were also assembled with SPAdes v3.6.2 using k-mer sizes of 31, 51, and 71 on a local server.
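The quality criterion described above can be illustrated with a minimal Python sketch. It is not the filtering tool actually used in this work. It assumes that the criterion is applied per read and that qualities are Phred+33 encoded; the function names and the example reads are hypothetical.

```python
def passes_quality_filter(quality_string, cutoff=10, min_fraction=0.90, phred_offset=33):
    """Return True if at least min_fraction of the bases have Phred quality >= cutoff.

    Assumption: Phred+33 encoding and a per-read criterion (not stated in the text).
    """
    if not quality_string:
        return False
    scores = [ord(ch) - phred_offset for ch in quality_string]
    good = sum(1 for q in scores if q >= cutoff)
    return good / len(scores) >= min_fraction


def filter_fastq_records(records, **kwargs):
    """Yield the (header, sequence, quality) tuples that pass the filter."""
    for header, sequence, quality in records:
        if passes_quality_filter(quality, **kwargs):
            yield header, sequence, quality


# Hypothetical reads: 'I' encodes Q40 and '#' encodes Q2 under Phred+33.
reads = [
    ("@read1", "ACGTACGTAC", "IIIIIIIIII"),  # 10 of 10 bases at or above Q10
    ("@read2", "ACGTACGTAC", "IIIIIIIII#"),  # 9 of 10 bases at or above Q10
    ("@read3", "ACGTACGTAC", "IIIIIIII##"),  # 8 of 10 bases at or above Q10
]
kept = [header for header, _, _ in filter_fastq_records(reads)]
print(kept)
```

With these example reads, only the third read falls below the 90% threshold and is discarded.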
DNA C-value estimates
Nuclear DNA amounts were measured from young leaves using 1 mL of cold LB01 buffer plus 1 mg/mL propidium iodide [20]. Analyses were performed on a BD Accuri C6 flow cytometer, in three independent estimations on different days. Pisum sativum L. ‘Ctirad’ (2C = 9.09 pg) was used as the standard [20]. At least 30,000 nuclei were measured in each of the three reading cycles. The 2C values were calculated as (sample peak mean / standard peak mean) × 2C DNA amount of the standard (pg).
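The 2C calculation can be written as a short Python sketch. The standard value of 9.09 pg comes from the text; the peak means below are hypothetical placeholders rather than measured data, and the function name is illustrative.

```python
from statistics import mean, stdev

STANDARD_2C_PG = 9.09  # Pisum sativum 'Ctirad', as stated in the text


def estimate_2c(sample_peak_mean, standard_peak_mean, standard_2c_pg=STANDARD_2C_PG):
    """2C (pg) = sample peak mean / standard peak mean x 2C of the standard."""
    if standard_peak_mean <= 0:
        raise ValueError("standard peak mean must be positive")
    return sample_peak_mean / standard_peak_mean * standard_2c_pg


# Hypothetical (sample, standard) peak means for three independent runs.
runs = [(120000.0, 50000.0), (118500.0, 49800.0), (121200.0, 50300.0)]
estimates = [estimate_2c(sample, standard) for sample, standard in runs]
print("2C = %.2f pg (SD %.2f)" % (mean(estimates), stdev(estimates)))
```

Averaging the three independent estimates, as in the last line, is one common way to report the final value.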
Repetitive fraction evaluation
Two approaches were used to evaluate the proportion of repetitive DNA families in the two datasets. Sequences were compared using the pipeline implemented in RepeatExplorer and also using local BLASTn/BLASTx analyses against databases of conserved domain sequences available from RepBase (http://www.girinst.org/censor/), GypsyDB (http://gydb.org/index.php/MainPage) and RexDB (http://repeatexplorer.org/), totaling 283,676 protein sequences. To evaluate the proportion of rDNA sequences, a second database composed of 1652 35S and 5S sequences from various organisms, obtained from NCBI (http://www.ncbi.nlm.nih.gov), was used; max_target_seqs was set to 1, the remaining parameters were kept at their defaults, and the results were written in tabular output format.
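As an illustration of how such tabular results can be turned into proportions, the following Python sketch keeps the best-scoring hit per read and divides the hit counts by the total number of reads. It assumes the standard 12-column BLAST tabular layout (bit score in the last column); the subject identifiers, the category mapping, and the read total are hypothetical and do not reproduce the scripts used in this work.

```python
from collections import Counter


def best_hits(tabular_lines):
    """Keep the highest bit-score hit per query from 12-column tabular BLAST output."""
    best = {}
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        query, subject, bitscore = fields[0], fields[1], float(fields[11])
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return best


def proportion_by_category(tabular_lines, subject_to_category, total_reads):
    """Fraction of all reads whose best hit falls in each category."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    counts = Counter()
    for subject, _ in best_hits(tabular_lines).values():
        counts[subject_to_category.get(subject, "unclassified")] += 1
    return {category: n / total_reads for category, n in counts.items()}


# Hypothetical hits: read1 has two alignments to the same subject.
example_lines = [
    "\t".join(["read1", "rDNA_35S_a", "98.0", "150", "3", "0", "1", "150", "10", "159", "1e-70", "260"]),
    "\t".join(["read1", "rDNA_35S_a", "95.0", "60", "3", "0", "1", "60", "500", "559", "1e-20", "90"]),
    "\t".join(["read2", "rDNA_5S_b", "99.0", "120", "1", "0", "1", "120", "1", "120", "1e-50", "200"]),
    "\t".join(["read3", "rDNA_35S_a", "97.0", "140", "4", "0", "1", "140", "30", "169", "1e-60", "240"]),
]
subject_to_category = {"rDNA_35S_a": "35S", "rDNA_5S_b": "5S"}
proportions = proportion_by_category(example_lines, subject_to_category, total_reads=10)
print(proportions)
```

Counting each read once, even when it produces several alignments, keeps the proportion from being inflated by multiple hits to the same subject.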
To gain an overview of the satellite DNA families in the large Cestrum genomes, sequences of C. strigilatum and C. elegans were subjected to the TAREAN (TAndem REpeat ANalyzer) tool in RepeatExplorer [18, 19]. To expand the scope of the analysis, the SPAdes assembly output files of both genomes were used as input for the TRF script [21]. Output files were processed with the TRF-filter script designed by us and with Gnumeric 1.12.35, as well as with filtering commands based on bash scripts in a Linux environment. Data were compared to optimize the search for satellite, microsatellite, and transposable element sequences. Putative satellite DNA monomers were compared all against all and checked with Dotter version 4.44.1 [22]. The SSR frequency was estimated using the SSRIT script (http://archive.gramene.org/db/markers/ssrtool); the SSRIT output file was filtered with the SSR_Estimates.sh script (designed by us), and the motifs (two to six nucleotides long) were tabulated to compare the 20 most common motifs in the two genomes used in this work.
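The kind of motif tabulation described above can be sketched in Python as follows. This is not SSRIT or the SSR_Estimates.sh script: it is a simple regular-expression search for perfect repeats of 2 to 6 nt motifs, and the minimum of five repeat units is an assumption introduced only for this example. Rotations of the same motif (for example AT and TA) are counted separately.

```python
import re
from collections import Counter


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'ATAT' or 'CC')."""
    n = len(motif)
    return not any(n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n))


def count_ssr_motifs(sequences, min_len=2, max_len=6, min_repeats=5):
    """Count perfect SSR loci per motif; min_repeats is an illustrative threshold."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for k in range(min_len, max_len + 1):
            # A k-nt unit followed by at least (min_repeats - 1) further copies of itself.
            pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
            for match in pattern.finditer(seq):
                motif = match.group(1)
                if is_primitive(motif):
                    counts[motif] += 1
    return counts


# Hypothetical sequences containing (AT)6, (GAA)5, (AT)5 and a poly-C run.
sequences = [
    "GG" + "AT" * 6 + "CCG" + "GAA" * 5 + "T",
    "AT" * 5 + "C" * 12,
]
top_motifs = count_ssr_motifs(sequences).most_common(20)
print(top_motifs)
```

The primitive-motif check prevents a mononucleotide run or a dinucleotide repeat from being counted again as a longer motif.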
Cestrum elegans data were used for comparison in the bioinformatics analyses, but this species does not occur in Brazil. Therefore, only C. strigilatum was used for sequencing and cytomolecular analysis because, in addition to being easily found in several Brazilian biomes, it exhibits the highest amount of heterochromatin among the studied species.
Probe labeling and FISH
PCRs were performed in a mix composed of 2 mM MgCl2, 0.4 µM of primers, dNTPs containing dGTP (0.1 mM), dCTP (0.1 mM), dATP (0.1 mM), dTTP (0.07 mM), and Cy3-dUTP (0.03 mM), ~10 ng of DNA template, 1.25 U of Taq polymerase, and ultrapure water to a final volume of 25 µL. The PCR conditions were 3 min at 94°C, followed by 30 cycles of 1 min at 94°C, 30 s at 60°C, and 1 min at 72°C, and then 10 min at 72°C. Amplicons were labeled with digoxigenin (DIG) and biotin (BIO) using nick translation kits (Digoxigenin NT Labeling Kit and Biotin 16 NT Labeling Kit; Jena Bioscience). The oligos CsSat1 and CsSat72 were designed with 5 biotin modifications (ThermoFisher Scientific), whereas CsSat74 and CsSat49 were labeled by PCR using digoxigenin-dUTP.
The retrotransposon probes were obtained by PCR using specific primers for a conserved stretch of the reverse transcriptase of Sire/Copia and Athila/Tat/Gypsy elements. A standard PCR [5 U/µL Taq polymerase (0.5 µL), 10× buffer (2.5 µL), 50 mM MgCl2 (1.5 µL), 10 mM dNTP (1 µL), 5 mM primers (2 µL each), and H2O up to a final volume of 25 µL] was run under the following conditions: 94°C for 2 min, 30 cycles of 94°C for 40 s, 59°C for 40 s, and 72°C for 1 min, and a final extension at 72°C for 10 min. Reactions were checked by electrophoresis in an agarose gel at 3 V/cm, stained with ethidium bromide. Amplicons were used in a second PCR to produce probes, which were labeled using 0.2 mM dNTP, containing dGTP (25%), dCTP (25%), dATP (25%), dTTP (17.5%), and bio-dUTP (7.5%) or Cy3-dUTP (7.5%).
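The composition of the labeling mix can be checked with a few lines of Python. The sketch assumes that the percentages refer to fractions of the total 0.2 mM dNTP concentration; the function name is illustrative.

```python
def labeling_mix_concentrations(total_dntp_mm, fractions):
    """Convert fractional composition into mM concentrations of each nucleotide."""
    if abs(sum(fractions.values()) - 1.0) > 1e-9:
        raise ValueError("nucleotide fractions must sum to 1")
    return {name: total_dntp_mm * fraction for name, fraction in fractions.items()}


# Fractions taken from the text (assumed to refer to the 0.2 mM total).
fractions = {
    "dGTP": 0.25,
    "dCTP": 0.25,
    "dATP": 0.25,
    "dTTP": 0.175,
    "labeled dUTP": 0.075,
}
mix = labeling_mix_concentrations(0.2, fractions)

# Share of the thymidine pool supplied as labeled dUTP.
labeled_share = mix["labeled dUTP"] / (mix["dTTP"] + mix["labeled dUTP"])

for name, conc in mix.items():
    print("%s: %.3f mM" % (name, conc))
print("labeled share of T pool: %.0f%%" % (labeled_share * 100))
```

Under this assumption, labeled dUTP replaces 30% of the thymidine pool, the same ratio as the 0.07 mM dTTP to 0.03 mM Cy3-dUTP used for the satellite probes.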
Fluorescence in situ hybridization (FISH)
For cytogenetic analysis, slides were prepared from root tips pretreated with 0.1% colchicine for 6 h, fixed in Farmer solution (ethanol/acetic acid 3:1, v:v) for at least 2 h, and stored at -20°C. Roots were also collected and fixed directly, without pretreatment, in order to obtain mitotic stages. To obtain cold-sensitive regions (CSRs), root tips were pretreated in cold water (~0°C) for 26 h and fixed in Farmer solution [23]. Samples were softened in 2% cellulase plus 20% pectinase (w:v) at 37°C, dissected in a drop of 60% acetic acid, and then squashed. Coverslips were removed in liquid nitrogen, and the preparations were used for in situ hybridization tests.
For FISH, the probes labeled with biotin and digoxigenin were applied to the slides in 30 µL of a denatured hybridization mix containing 100% formamide (15 µL), 50% polyethylene glycol (6 µL), 20× SSC (3 µL), 100 ng calf thymus DNA (1 µL), 10% SDS (1 µL), and 100 ng probe (~4 µL), reaching > 70% stringency. Slide and mix were denatured and hybridized together at 95°C, 50°C, and 38°C for 10 min in a thermal cycler, and then at 37°C overnight in a humidified chamber. Post-hybridization washes were carried out in 2× SSC and 4× SSC/0.2% Tween 20. Probes were detected using avidin-fluorescein isothiocyanate (FITC) or anti-digoxigenin-rhodamine conjugate, and the slides were mounted with 25 µL of a solution composed of glycerol (90%), DABCO (2.3%), 20 mM Tris-HCl, pH 8.0 (2%), 2.5 mM MgCl2 (4%), and distilled water (1.7%), plus 1 µL of 2 µg/mL DAPI. Sequential hybridizations were performed on slides washed in baths of 4× SSC/0.2% Tween 20 (twice) and 2× SSC, fixed in ethanol/acetic acid 3:1 (v:v), and dehydrated in absolute alcohol.
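The final concentrations in the 30 µL hybridization mix follow from simple dilution arithmetic, sketched below in Python. Stock concentrations and volumes are taken from the text; the probe volume is given there only approximately (~4 µL), so the totals are approximate as well. The function name is illustrative.

```python
# (name, stock concentration, unit, volume in µL), as listed in the text
components = [
    ("formamide", 100.0, "%", 15.0),
    ("polyethylene glycol", 50.0, "%", 6.0),
    ("SSC", 20.0, "x", 3.0),
    ("SDS", 10.0, "%", 1.0),
]
# Calf thymus DNA (1 µL) and probe (~4 µL) add volume but are given as masses.
OTHER_VOLUME_UL = 1.0 + 4.0


def final_concentrations(components, other_volume_ul):
    """Return the total volume and each component's final concentration (C1*V1/Vtotal)."""
    total = sum(volume for _, _, _, volume in components) + other_volume_ul
    finals = {name: (stock * volume / total, unit) for name, stock, unit, volume in components}
    return total, finals


total_ul, finals = final_concentrations(components, OTHER_VOLUME_UL)
print("total volume: %.0f µL" % total_ul)
for name, (conc, unit) in finals.items():
    print("%s: %.2f%s" % (name, conc, unit))
```

With these volumes the mix contains 50% formamide, 10% polyethylene glycol, 2× SSC, and about 0.33% SDS.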
Image acquisition
Slides were analyzed in three replicates, with at least 10 cells examined. All chromosome images were acquired using a Leica DM 4500B microscope equipped with a DFC 300FX camera, and overlaid using blue for DAPI, greenish-yellow for FITC, and red for Cy3 in Leica IM50 4.0 software. Contrast and brightness were optimized using the GIMP 2.8 image editor.