Ethical statement
The blood sample used in this study was taken from the giant panda “Jingjing” (Stud#598, female), the source individual of the giant panda reference genome (ASM200744v1 and ASM200744v2). Blood sampling followed the guidelines of the Regulations for the Administration of Affairs Concerning Experimental Animals (Ministry of Science and Technology, China, 2013) and was approved by the Institutional Animal Care and Use Committee of Chengdu Research Base of Giant Panda Breeding (IACUC, No. 2018003).
Construction of the BAC library
The giant panda genomic BAC library was constructed following a previously described method (Shi et al., 2011). Briefly, we first separated cells from the blood using PBS buffer. We then embedded the isolated cells in 1% low-melting-temperature agarose (Sigma-Aldrich Co., MO, USA) at a concentration of 1×10⁸ cells per milliliter to form plugs for subsequent genomic DNA isolation. After 48 h of digestion with proteinase K, the DNA plugs were subjected to brief electrophoresis to remove small DNA fragments. The gel plugs were then partially digested with HindIII (1 U/μL) at 37°C for 7 min. The fragmented DNA was separated on a 1% agarose gel by pulsed-field gel electrophoresis (PFGE) (CHEF Mapper, Bio-Rad, USA), and fragments in the range of 110 to 280 kb were selected on the 1% CHEF gel. The size-selected DNA fragments were collected, ligated into the BAC vector (pIndigoBAC536-S), and desalted in 0.1 M glucose. The ligation products were electroporated into Escherichia coli (E. coli) strain DH10B T1 Phage-Resistant (Invitrogen, USA). The transformed E. coli cells were picked and cultured on LB plates containing 12.5 mg/L chloramphenicol, 80 mg/L X-gal, and 100 mg/L IPTG at 37°C overnight.
Characterization of BAC inserts
To estimate the insert size and the proportion of self-ligated BAC clones, 250 BAC clones were randomly picked and inoculated in 2 mL LB plates for growth. Plasmid DNA was extracted from each clone and digested with I-SceI, and the insert size of the digested products was determined by PFGE on a 1% agarose gel at 6 V/cm with a 5–15 s switch time. The probability of detecting a specific locus or gene in our BAC library was estimated according to the formula $P = 1 - (1 - \bar{I}/GS)^{N}$, where $P$ is the probability, $N$ is the number of clones, $\bar{I}$ is the average insert size of clones, and $GS$ is the genome size.
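As an illustration of how such a probability can be computed, the sketch below assumes the commonly used relation P = 1 − (1 − I/GS)^N, where I is the mean insert size, GS the genome size, and N the number of clones. The function names and the numeric values in the example are hypothetical placeholders, not results from this study.

```python
import math


def locus_probability(n_clones: int, mean_insert_bp: float, genome_size_bp: float) -> float:
    """Probability that at least one of n_clones contains a given locus.

    Assumes P = 1 - (1 - I/GS)^N with randomly distributed inserts.
    """
    if n_clones < 0:
        raise ValueError("n_clones must be non-negative")
    if not 0 < mean_insert_bp < genome_size_bp:
        raise ValueError("mean insert size must be positive and smaller than the genome size")
    fraction = mean_insert_bp / genome_size_bp
    # log1p/expm1 keep precision when the fraction is very small.
    return -math.expm1(n_clones * math.log1p(-fraction))


def clones_needed(target_p: float, mean_insert_bp: float, genome_size_bp: float) -> int:
    """Smallest N for which the probability reaches target_p (same assumption)."""
    if not 0 < target_p < 1:
        raise ValueError("target_p must be strictly between 0 and 1")
    fraction = mean_insert_bp / genome_size_bp
    return math.ceil(math.log1p(-target_p) / math.log1p(-fraction))


if __name__ == "__main__":
    # Hypothetical inputs for illustration only.
    p = locus_probability(n_clones=100_000, mean_insert_bp=120_000, genome_size_bp=2.4e9)
    print(f"P = {p:.4f}")
```

Solving the same relation for N, as in `clones_needed`, gives the library size required to reach a chosen probability.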
Long-read sequencing of BAC clones
To further verify the BAC library, we randomly chose 20 positive clones for long-read sequencing. These clones were picked and cultured in LB medium with 12.5 μg/mL chloramphenicol at 37°C overnight. We then extracted their plasmid DNA using the QIAGEN Large-Construct Kit according to the manufacturer’s instructions. Degradation and contamination of the plasmid DNA were monitored on 1% agarose gels, and the concentration was measured with the Qubit® DNA Assay Kit on a Qubit® 3.0 fluorometer (Invitrogen, USA). The plasmid DNA of the 20 samples was then submitted to Novogene Corporation Inc. Single-molecule real-time (SMRT) libraries were constructed for PacBio genome sequencing following the standard protocols of Pacific Biosciences. Briefly, high-molecular-weight genomic DNA was sheared to a target size of ~20 kb, followed by damage repair and end repair, blunt-end adaptor ligation, and size selection. Finally, the libraries were sequenced on the PacBio Sequel platform.
Long-read assembly and analysis
Consensus reads were generated from the raw data using CCS software (V6.0.0) with --min-length 2000 and --min-passes 2. Reads containing E. coli DH10B sequences were identified by alignment and trimmed using the blastn function of BLAST (V2.9.0). We then assembled the cleaned reads using Canu (V2.0) with genomeSize=150Kb. To obtain clean BAC clone sequences, the assembled FASTA sequences were further trimmed to remove pIndigoBAC536-S vector sequences using the blastn function of BLAST (V2.9.0). Repeat sequences in each clone were identified with RepeatMasker (V4.0.6). To further evaluate the 20 BAC sequences, we aligned and compared them with the giant panda reference genome (assembly ASM200744v2) using MUMmer (V4.0.0 beta2). All plots in this study were generated with the ggplot2 package (V3.3.3) in R (V4.0.4).
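The text states only that blastn was used for trimming and does not describe the exact procedure. The sketch below shows one possible way to remove vector-matching regions from an assembled sequence, assuming blastn was run with the assembled clone as query and tabular output (-outfmt 6, default columns). The function names and the `min_identity` threshold are hypothetical and are not taken from the study.

```python
from typing import Iterable, List, Tuple

Interval = Tuple[int, int]  # 1-based, inclusive query coordinates


def parse_outfmt6(lines: Iterable[str], min_identity: float = 90.0) -> List[Interval]:
    """Collect query intervals from BLAST tabular (outfmt 6) lines.

    Default columns: qseqid sseqid pident length mismatch gapopen
    qstart qend sstart send evalue bitscore.
    min_identity is an assumed, illustrative cutoff.
    """
    hits = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        pident = float(fields[2])
        qstart, qend = int(fields[6]), int(fields[7])
        if pident >= min_identity:
            hits.append((min(qstart, qend), max(qstart, qend)))
    return hits


def merge_intervals(intervals: Iterable[Interval]) -> List[Interval]:
    """Merge overlapping or directly adjacent intervals."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def strip_vector(sequence: str, vector_hits: Iterable[Interval]) -> List[str]:
    """Return the segments of sequence that remain after removing vector hits."""
    segments = []
    cursor = 0  # 0-based index of the next base not yet consumed
    for start, end in merge_intervals(vector_hits):
        if start - 1 > cursor:
            segments.append(sequence[cursor:start - 1])
        cursor = max(cursor, end)
    if cursor < len(sequence):
        segments.append(sequence[cursor:])
    return segments
```

Returning a list of segments rather than a single string leaves it to the caller to decide how to handle a vector hit in the middle of a contig, for example by keeping the longest segment or rejoining the flanks of a circular clone.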