Samples
The eastern three-lined skink, B. duperreyi, is a medium-sized (80 mm snout–vent length) lizard widely distributed through south eastern Australia, from the coast to montane cool-climate habitats [84]. Adult individuals (n = 76) were captured by hand at Piccadilly Circus (35°21'37.59"S, 148°48'13.39"E, 1246 m a.s.l.) in Namadgi National Park, 40 km west of Canberra in the Australian Capital Territory, and from Anglesea (38°23'26.76"S, 144°12'52.29"E, 40 m a.s.l.) in Victoria (Fig. 2, Additional File 1: Table S4). The Anglesea population is a distinct mitochondrial lineage from the Piccadilly Circus lineage (ca 3 Myr divergent, [85]). Snout-vent length was measured with Vernier callipers (+/- 1 mm) and males identified by hemipenal eversion [86] and breeding colouration. A representative male and female from Piccadilly Circus (focal individuals) were transported to the University of Canberra animal house where each was euthanised by intraperitoneal injection of sodium pentobarbitone (100-150 µg/g body weight), dissected, and phenotypic sex confirmed by examination of the gonads. Tail tips (4-5 mm) were removed with a sterile blade, a portion stored in 95% ethanol at -20˚C, and a portion set aside for cell culture. Tail-snips were removed also from an additional 24 males and 24 females from Piccadilly Circus and 10 males and 10 females from Anglesea and stored in 95% ethanol at -20˚C. All animals were released to the capture sites. These are referred to as the validation animals. A portion from three males and three females from Piccadilly Circus were set aside for cell culture and karyotyping.
For cell culture, tail tips were immediately transferred to 10 ml of collection medium (Gibco Dulbecco's Modified Eagle Medium; Thermo Fisher Australia Pty Ltd, Scoresby, Victoria, Australia) with 2.5 µg/ml of Antibiotic Antimycotic Solution (Sigma Chemical Company, St. Louis, USA) and incubated at room temperature for 24 h [87] before the metaphase chromosomes preparation (see validation of phenotypic sex identification in methods).
DNA extraction, sequencing, and in silico whole genome subtraction
DNA was extracted from fresh liver samples of the two focal animals and from the tail snips of the 60 validation animals using the Gentra Puregene Tissue Kit (QIAGEN, Australia) following manufacturer protocols. DNA suspensions were assessed for purity using a NanoDrop 1000 spectrophotometer (NanoDrop Technologies, Wilmington, 19810, USA) and quantified using Qubit 2.0 fluorometer (Invitrogen, Life technologies, Sydney, NSW, Australia). Library preparation and sequencing were performed at the Biomolecular Resource Facility at the Australian National University (Canberra, ACT) using the Illumina HiSeq 2000 platform yielding 150bp paired end reads.
Reads from the focal male and the focal female were analysed independently as follows (Fig. 3). First, overlapping read pairs were combined into fragments then decomposed into k-mers of 27 bp using Jellyfish 2.0 [88]. Unique k-mers were counted, again using Jellyfish 2.0 and k-mers in common between the male and female sets were removed from the male set. This yielded a (subtracted) k-mer set that was enriched for Y chromosome sequence. Strictly, the subtracted k-mer set contains k-mers that are from Y chromosome sequence admixed with k-mers representing polymorphic differences between the female X chromosomes and the male X chromosome. K-mers in the subtraction with a count less than 2 for males and 5 for females were considered to represent sequencing errors and were removed from the analysis. This decision was based on examination of the k-mer spectra, identifying the minima immediately to the right of the peak arising from presumed read errors. This is not a critical decision. Select it too high, and the risk is that some important k-mers will be eliminated from the re-assembly of Y enriched kmers. Select it too low, and the cost is inclusion of low count kmers from reads containing errors and a greater noise to signal ratio. This does not affect the outcome, just the computational resources required for subtraction and reassembly.
The remaining Y enriched k-mers were then reassembled into contigs using an inchworm assembler (kassemble.cgi, https://doi.org/10.5061/dryad.pvmcvdnj1) with stringent extension criteria. Briefly, the assembler initially took a focal k-mer at random and searched for other k-mers that matched exactly k-1 bp of the focal k-mer. If this second k-mer was unique, then the focal k-mer was extended by one bp, and the process was repeated. If the k-mer was not unique, then the extension process was terminated. The extension occurred to both the left and the right, yielding relatively short contigs (up to ca 1400 bp) that contain sequence unique to the male individual.
PCR validation
To validate the sex specificity of each of the contigs and remove false positives derived from autosomal and X chromosome polymorphisms, we designed primers for each contig using Primer 3 [89] implemented in Geneious [90] (version R8). We then applied these presence/absence PCR tests in the validation animals using the following conditions. Each reaction contained 1x My Taq HS Red mix (Bioline), 4 µM each primer and 25 ng of genomic DNA. The PCR cycling conditions used an initial touchdown phase to increase the specificity of amplification: denaturing at 95˚C, annealing temperature stepping down from 70˚C by 0.5˚C for 10 cycles, extension at 72˚C. This was followed by 30 cycles at 65˚C annealing and 72˚C extension.
The PCR screening process was conducted in three stages. To confirm that the subtraction pipeline had successfully identified a presence/absence polymorphism in the two focal individuals, we first screened those two individuals to confirm presence of an amplified fragment in the male and the absence of an amplified fragment in the female. We then screened a panel of an additional 4 male and 4 female individuals for putative sex-linked markers showing a male-specific positive pattern. In a third step, we screened those putative markers on a further 20 males and 20 females from Piccadilly Circus. At each of the stages, the loci that did not appear as sex specific were eliminated as candidate sex markers. The probability of an autosomal or X chromosome polymorphism being present in the focal male, 4 males and 20 additional males, and absent in the focal female, 4 females and 20 additional females, is sufficiently low (≤ 0.2524, maximal for autosomal or X allele frequency = 0.5) to eliminate false positives, despite the error rate compounding over multiple markers. Thus, male specific markers that survive the validation process are Y-specific markers.
To confirm the amplification of the desired sequence, PCR products for all 7 putative Y-loci were visually assessed using gel electrophoresis and then Sanger sequenced in a single direction, using the forward primer, on an AB 3730xl DNA Analyzer at the Biomolecular Research Facility, Australian National University, Canberra, Australia. We sequenced 4 male individuals from Piccadilly Circus (Namadgi National Park, ACT) and 4 male individuals from Anglesea (Victoria).
Validation of phenotypic sex identification
The phenotypic sex of each of the karyotyped animals was confirmed by gross examination of gonads followed by histological examination. Dissected gonads were dehydrated through graduations of ethanol (70%, 90%, 100%) and two changes of xylene for 45 min each, before being embedded in paraffin wax, and sectioned 5 to 6 μm using a Leica Rotary Microtome (Leica Microsystems Pty Ltd, Waverley, Australia). Slides were stained with haematoxylin and eosin, with a staining time of 2-3 min in haematoxylin, and 10 dips in 0.25% eosin in 80% ethanol, before being mounted in Depex medium (BDH Laboratory Supplied, England). Gonads were characterized according to standard cellular structures [91, 92].
Karyotyping was carried out by examining metaphase chromosomes prepared from fibroblast cell lines of tail tissues as outlined by Ezaz et al [10] with minor modifications. Briefly, three replicate subsamples for each individual were made using sterile scalpel blade. The individual subsamples were transferred to separate T25 culture flasks with 1.5 ml Amnio-Max medium (Thermo Fisher Australia Pty Ltd, Scoresby, Victoria, Australia) and 0.25 µg/ml Antibiotic Antimycotic Solution (Sigma Chemical Company, St. Louis, USA). The cells were allowed to propagate at 28˚C and 5% CO2. At approximately 80% confluency, cells were split into three T25 flasks for a further 3 to 4 passages before they were harvested by adding colcemid (0.05 µg/mL) for 3.5 hours and treated with hypotonic solution (KCl, 0.075 mM). Slides were fixed with an ice-cold (ca 4˚C) 3:1 mixture of methanol and acetic acid. The cell suspension was dropped on to slides, air dried and frozen at -80°C until use. For DAPI (4’,6-diamidino-2-phenylindole) staining, each slide was mounted with anti-fade medium Vectashield (Vector Laboratories Inc., Burlingame, CA, USA) containing 1.5 mg/ml DAPI.
Contig sequence analysis
To discover homologies of the male-specific contigs and identify any partial gene sequences that may exist, we used BLASTN to search each contig against representative reptilian and avian genomes available in Ensembl, Release 99 (Anolis carolinensis, Crocodylus porosus, Gallus gallus , Pelodiscus sinensis, Podarcis muralis, Pogona vitticeps, Pseudonaja textilis, Notechis scutatus, Varanus komodoensis, Sphenodon punctatus) with a minimum E-value of 0.000001 for reported alignments and a filter for low complexity regions. We used the same cut-off and filter to search the non-redundant database at the NCBI (https://blast.ncbi.nlm.nih.gov). The Dfam database [61] was used to search for known transposable elements.