Characterization of the fragmented mitochondrial genome of domestic pig louse Haematopinus suis (Insecta: Haematopinidae) from China

The domestic pig louse Haematopinus suis (Linnaeus, 1758) (Phthiraptera: Anoplura) is a common ectoparasite of domestic pigs and can act as a vector of various infectious disease agents. Despite its significance, the molecular genetics, biology and systematics of H. suis from China have not been studied in detail. In the present study, the entire mitochondrial (mt) genome of an H. suis isolate from China was sequenced and compared with that of an H. suis isolate from Australia. We identified 37 mt genes located on nine circular mt minichromosomes, 2.9-4.2 kb in size, each containing 2-8 genes and one large non-coding region (NCR) (1,957-2,226 bp). The number of minichromosomes, gene content, and gene order in the H. suis isolates from China and Australia are identical. Total sequence identity across coding regions was 96.3% between the H. suis isolates from China and Australia. For the 13 protein-coding genes, nucleotide sequence differences ranged from 2.8% to 6.5%, consistent with the amino acid comparisons. Our results indicate that the H. suis isolates from China and Australia belong to the same species. The present study determined the entire mt genome of H. suis from China, providing additional genetic markers for studying the molecular genetics, biology and systematics of the domestic pig louse.

Metazoan mitochondrial (mt) genomes are usually circular DNA molecules (13-20 kb) with 36-37 genes, comprising 12-13 protein-coding genes, two rRNA genes and 22 tRNA genes (Wolstenholme, 1992; Lavrov, 2007). Sucking lice have an unusual, fragmented mt genome organization. Fragmentation of the mt genome was first found in the human body louse, Pediculus humanus humanus Linnaeus, 1758 (Anoplura: Pediculidae) (Shao et al., 2009). To date, the mt genomes of 21 sucking lice species (12 complete and 9 incomplete mt genomes) have been sequenced; all are extensively fragmented, with different numbers of minichromosomes (Shao et al., 2012; Jiang et al., 2013; Dong et al., 2014; Fu et al., 2022). In the genus Haematopinus, complete mt genomes are available for only four species: H. apri (Goureau, 1866), H. asini (Linnaeus, 1758), H. tuberculatus (Burmeister, 1839), and H. suis (Linnaeus, 1758) (Fu et al., 2022; Nie et al., 2022). The mt genome of H. suis from Australia has been sequenced, but H. suis isolates from different regions may display differences. A previous study found a pseudo-trnV gene in H. suis and H. apri from Australia (Jiang et al., 2013); however, this pseudo-trnV gene has not been reported in H. apri from China (Nie et al., 2022). This finding raised the possibility that H. suis populations from different geographical regions may have significant nucleotide sequence differences in their mt genomes.
The aims of the present study were: (i) to characterise the mt genome of H. suis from China, (ii) to compare this mt genome with that of H. suis from Australia, and (iii) to test the hypothesis that H. suis from different geographical regions have significant nucleotide sequence differences in their mt genomes.

Sample collection and DNA extraction
Adult ectoparasites were collected from a domestic pig in Sichuan province, China. Lice were identified as H. suis based on morphological features and host information (Kim & Ludwig 1978). Individual lice were washed five times in physiological saline solution and stored in 100% (v/v) ethanol at -40°C until DNA extraction. Total genomic DNA was extracted using the Promega kit (Madison, USA) per the manufacturer's instructions. The identity of the domestic pig louse was further confirmed by PCR-based analysis of partial mt cox1 and rrnS sequences, using primers as per Fu et al. (2020a, 2020b). BLASTn analysis showed 97.3% and 97.4% similarity to H. suis from Australia (GenBank accession nos. HM241908 and KC814610), respectively.

Sequencing, assembling and annotation
The concentration of total DNA was measured using the Qubit system (Thermo Fisher Scientific, Waltham, MA, USA) prior to construction of a genomic library (with 350 bp inserts). Paired-end reads were generated on the Illumina HiSeq 2500 platform. Reads containing adapters, repetitive "N" bases, or low-quality bases (Phred quality < 5) were filtered out to obtain clean data using FastQC 0.11.9 and Skewer v0.2.2 (Jiang et al., 2014). Using the partial cox1 and rrnS sequences described above as references, clean reads were assembled into the mitogenome of H. suis in Geneious Prime v2022.1.11 (Kearse et al., 2012). Minichromosomes of H. suis were assembled individually following Fu et al. (2020a, 2020b). Protein-coding genes (PCGs) and rRNA genes (rrnL and rrnS) within each minichromosome were annotated by alignment to those of other Haematopinus species. All 22 tRNA genes were identified using ARWEN and tRNAscan-SE (Laslett & Canbäck 2008; Lowe & Chan 2016).
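The read-filtering criteria described above can be illustrated with a minimal Python sketch. It is not the FastQC/Skewer workflow used in this study; the function names, the use of a mean Phred score, the Phred+33 encoding, the adapter string and the maximum fraction of "N" bases are all assumptions made for illustration only.

```python
def mean_phred(quality: str, offset: int = 33) -> float:
    """Mean Phred score of a FASTQ quality string (Phred+33 assumed)."""
    return sum(ord(ch) - offset for ch in quality) / len(quality)


def keep_read(seq: str, quality: str, adapter: str,
              min_quality: float = 5.0, max_n_fraction: float = 0.1) -> bool:
    """Return True if a read passes three simple, hypothetical filters.

    1. The read does not contain the adapter sequence.
    2. The fraction of ambiguous 'N' bases is at most max_n_fraction.
    3. The mean Phred quality is at least min_quality.
    """
    if not seq or len(seq) != len(quality):
        return False  # empty or malformed record
    if adapter and adapter in seq:
        return False  # adapter contamination
    if seq.upper().count("N") / len(seq) > max_n_fraction:
        return False  # too many ambiguous bases
    return mean_phred(quality) >= min_quality


if __name__ == "__main__":
    adapter = "AGATCGGAAGAGC"  # illustrative adapter prefix, an assumption
    print(keep_read("ACGTACGT", "IIIIIIII", adapter))  # high-quality read
    print(keep_read("ACGTNNNN", "IIIIIIII", adapter))  # N-rich read
    print(keep_read("ACGTACGT", "!!!!!!!!", adapter))  # low-quality read
```

Dedicated trimmers generally clip adapters and low-quality ends rather than discarding whole reads, so this sketch only conveys the logic of the thresholds.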

Verification of mitochondrial minichromosomes
The size and organization of each mt minichromosome were verified by PCR (Fig. S1) using specific primers (Table S1). Forward and reverse primers were designed to lie close to each other, separated by a small gap (10-90 bp). If a minichromosome has a circular organization, PCR with these primers amplifies it at full or close to full size. The positive amplicons were also sequenced on the Illumina HiSeq 2500 platform, as described above. To obtain full-length and accurate sequences of the large non-coding region (NCR) of every minichromosome, we re-assembled the NCR of each mt minichromosome from these sequences using the same procedure.
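The reasoning behind this verification can be made explicit with a short Python sketch: primers that face away from each other across a small gap yield a product only if the template is circular, and the product spans the circle minus the gap. The function names and the coordinates in the example are hypothetical and are not taken from Table S1.

```python
def amplicon_length(circle_length: int, fwd_start: int, rev_start: int) -> int:
    """Expected product size for outward-facing primers on a circular template.

    Positions are 0-based. fwd_start is the 5' end of the forward primer
    (extending towards higher coordinates) and rev_start is the 5' end of the
    reverse primer (extending towards lower coordinates). The product runs
    from fwd_start around the circle to rev_start, both ends included.
    """
    if not (0 <= fwd_start < circle_length and 0 <= rev_start < circle_length):
        raise ValueError("primer positions must lie on the circle")
    return (rev_start - fwd_start) % circle_length + 1


def uncovered_gap(circle_length: int, fwd_start: int, rev_start: int) -> int:
    """Number of bases between the two primers that the product does not cover."""
    return circle_length - amplicon_length(circle_length, fwd_start, rev_start)


if __name__ == "__main__":
    # Hypothetical 3,000 bp minichromosome with primers 50 bp apart
    print(amplicon_length(3000, 1050, 999))
    print(uncovered_gap(3000, 1050, 999))
```

On a linear template the same primer pair would point away from each other and give no product, which is why a near full-size amplicon supports a circular organization.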

Results and discussion
General features of minichromosome organization
A total of 19,867,822 clean paired-end reads were obtained after quality control. The complete mt genome of H. suis (China) was assembled with 37 genes on nine minichromosomes (Fig. 1; Table 1), consistent with previous studies (Jiang et al., 2013; Song et al., 2014). The nucleotide sequences of the H. suis minichromosomes were deposited in the GenBank database under accession numbers ON585586-ON585594. Each minichromosome was 3.0-4.8 kb in length, containing 2-8 genes and a large NCR (Table 1). The size of the coding regions varied from 789 to 2,668 bp, similar to the 786-2,669 bp previously reported for pig lice (Jiang et al., 2013). NCRs ranged from 1,957 bp (trnQ-nad1-trnT-trnG-nad3-trnW minichromosome) to 2,226 bp (trnL1-rrnL minichromosome) (Table 1). The longest NCR in the isolate from China was 144 bp shorter than that in the isolate from Australia (Jiang et al., 2013). Except for the trnQ-nad1-trnT gene cluster, all genes were transcribed in the same direction relative to the NCR, consistent with other Haematopinus species.

Non-coding regions
We obtained complete NCR sequences of all nine mt minichromosomes in H. suis, which range from 1,957 bp (trnQ-nad1-trnT-trnG-nad3-trnW minichromosome) to 2,226 bp (trnL1-rrnL minichromosome) (Table 1; Fig. 1), with 82.2-95.1% pairwise identity to each other. The NCR sequences showed conserved regions in each minichromosome (Fig. 2). A 62 bp sequence repeated three times was found in the NCR of each minichromosome except for the trnQ-nad1-trnT-trnG-nad3-trnW minichromosome, in which only a partial 32 bp sequence was repeated twice. A highly conserved, AT-rich sequence (149 bp, 70.5% A+T) was usually present upstream of the 5′-end of the coding region (Fig. 2). Highly conserved NCRs are also present in other blood-sucking lice (Jiang et al., 2013; Fu et al., 2020a, 2020b; Shao et al., 2017) and in the fragmented mt genomes of chewing lice (Sweet et al., 2020, 2022). Previous studies found a GC-rich region downstream of the 3′-end of the coding region in some sucking lice (Shao et al., 2012; Jiang et al., 2013); however, in the present study, this region was replaced by an AT-rich region 190 bp in length (63.7% A+T).
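For readers who wish to reproduce this kind of NCR description, the two quantities reported above (A+T content and the number of consecutive copies of a repeat unit) can be computed as in the following Python sketch. The function names and the toy sequences are hypothetical; they are not the H. suis NCR sequences, and the study itself does not state which tool was used for these calculations.

```python
def at_content(seq: str) -> float:
    """Percentage of A and T bases in a DNA sequence."""
    if not seq:
        raise ValueError("sequence must not be empty")
    seq = seq.upper()
    return 100.0 * (seq.count("A") + seq.count("T")) / len(seq)


def max_tandem_copies(seq: str, unit: str) -> int:
    """Largest number of back-to-back exact copies of `unit` found in `seq`."""
    if not unit:
        raise ValueError("repeat unit must not be empty")
    seq, unit = seq.upper(), unit.upper()
    best = 0
    for start in range(len(seq)):
        copies, pos = 0, start
        # Count consecutive copies of the unit beginning at this position
        while seq.startswith(unit, pos):
            copies += 1
            pos += len(unit)
        best = max(best, copies)
    return best


if __name__ == "__main__":
    toy = "GGACGTACGTACGTCC"  # toy sequence, not real NCR data
    print(at_content("ATGC"))
    print(max_tandem_copies(toy, "ACGT"))
```

The repeat search here requires exact copies; imperfect repeats, which are common in non-coding regions, would need an alignment-based approach.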

Comparative mt genomic analyses of H. suis from China and Australia
The H. suis mt genome sequence from China was 7 bp shorter than that of H. suis from Australia. Comparisons of nucleotide (NT) and amino acid (AA) sequences are given in Table 2. Nucleotide identity across the complete coding regions between the isolates from China and Australia was 96.3%, and the identity of the PCGs was 96% (NT) and 97.9% (AA). Gene-level comparisons revealed differences of 2.8-5.6% (NT) and 0.5-5.3% (AA) (Table 2). The gene atp6 was the least conserved (5.6% difference), while nad4L was the most conserved (2.8%). Divergence of the rrnL and rrnS genes was 5.5% and 6.5%, respectively, while that of the tRNA genes was only 3.0%. Differences across the coding regions between the H. suis isolates from China and Australia thus ranged from 2.8% to 6.5%, consistent with both isolates belonging to the same species, H. suis.
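The identity and difference values above are complementary: 96.3% identity across the coding regions corresponds to 3.7% divergence. The Python sketch below shows one common way such values are derived from a pairwise alignment. The function names are hypothetical, the example sequences are toy data, and the choice to ignore alignment columns containing a gap is an assumption; the software and gap treatment used for Table 2 are not specified here.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two aligned sequences of equal length.

    Columns in which either sequence has a gap ('-') are skipped, which is
    one of several possible conventions (an assumption of this sketch).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = compared = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # skip gapped columns
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable alignment columns")
    return 100.0 * matches / compared


def percent_divergence(aligned_a: str, aligned_b: str) -> float:
    """Sequence difference, defined as 100 minus percent identity."""
    return 100.0 - percent_identity(aligned_a, aligned_b)


if __name__ == "__main__":
    # Toy alignment with one substitution in ten columns
    print(percent_identity("ATGCATGCAT", "ATGCATGAAT"))
    print(percent_divergence("ATGCATGCAT", "ATGCATGAAT"))
```

The same functions apply to amino acid alignments, since they only compare characters column by column.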

Conclusions
The present study characterized the complete fragmented mt genome sequence of H. suis from China, which is slightly shorter (7 bp) than that of the isolate from Australia. The two mt genomes differed by only 3.7% at the nucleotide level, which refutes our hypothesis that H. suis from different geographical regions have significant nucleotide sequence differences in their mt genomes. The present study determined the entire mt genome of H. suis from China, providing additional genetic markers for studying the molecular genetics, biology and systematics of the domestic pig louse.