Comparative Analysis of the Mitochondrial Genome Sequences of Two Medicinal Plants: Arctium lappa and A. tomentosum

Background: Mitochondrial genome sequence analysis is of great significance for understanding the evolution and genome structure of plant species. Arctium lappa and A. tomentosum are distributed in China and are frequently used as medicinal plants. A. tomentosum is usually regarded as an adulterant or substitute of A. lappa in traditional Chinese medicine (TCM), so it is critically important to be able to distinguish the species used in medicinal applications. This study aims to determine and compare their mitochondrial genomes, gene structures, and phylogenetic relationship. The results may provide additional insights for further genetic research.
Results: We determined the complete mitochondrial genome sequences of A. lappa and A. tomentosum for the first time. The two genomes were each assembled into a single circular molecule, of 312598 bp and 312609 bp, respectively. A total of 131 and 130 genes were annotated in A. lappa and A. tomentosum, respectively. Fifty pairs of large repeat sequences were detected in A. lappa and A. tomentosum. Both species had 192 simple sequence repeats (SSRs), with a total SSR length of 2491 bp in A. lappa and 2489 bp in A. tomentosum. Only 51 single nucleotide polymorphisms (SNPs) and 3 insertion-deletions (InDels) were detected between the two plants. The two mitochondrial genome structures were highly similar and highly collinear. In both plants, gene exchange and transfer were observed between the chloroplast and mitochondrial genomes. Core genes and specific genes were analyzed for A. lappa, A. tomentosum and three closely related Asteraceae species; the specific gene of A. lappa was orf115a. In addition, a phylogenetic tree constructed from the mitochondrial genomes placed the two Arctium species on one branch within Asteraceae.
Conclusions: We identified and analyzed the mitochondrial genome features of two Arctium species in China, with implications for species identification and phylogenetic analysis. The mitochondrial genomes of A. lappa and A. tomentosum were very similar in size and structure. The ORF genes of the two differed, which could provide a theoretical basis for the development of molecular markers.


Background
Mitochondria are semi-autonomous organelles in eukaryotic cells and the main site of intracellular oxidative phosphorylation and adenosine triphosphate (ATP) synthesis; their genome is independent of the nuclear genome. As sequencing technology has matured, the mitochondrial genomes of many crops have been sequenced. The mitochondrial genome of Cucumis melo L. (2500 kb) is reported to be the largest among the crops and plants sequenced so far [1], and that of the medicinal plant Marchantia polymorpha L. (183 kb) the smallest [2]. Further studies found that crop cytoplasmic male sterility (CMS) is closely related to structural changes in the mitochondrial genome [3], and CMS is widely used in heterosis breeding to significantly increase crop yield [4][5][6][7][8]. Studying the mitochondrial genomes of crops and other plants is therefore important. How to develop high-yield, high-quality new crop varieties is an important topic in agricultural research.
Like the chloroplast genome, the plant mitochondrial genome usually has a circular molecular structure; it is complex and highly variable, with abundant non-coding regions and introns.
Studies have shown that the plant mitochondrial genome is a mixture of DNA molecules of different shapes [9]. The mitochondrial genomes of chrysanthemum, sunflower and Diplostephium hartwegii are all circular [10][11]. There are also plant mitochondrial genomes with multiple circular molecules, such as those of wheat and rape [12][13]. The complexity of genome structure is related to the number, location and orientation of the recombinationally active repeat regions in the mitochondrial genome [1]. The secondary structure of the mitochondrial genome is complex and changeable, but its sequence is relatively fixed, so this structural variability has little influence on medicinal plant identification and research. The size, structure and gene sequence of the mitochondrial genome vary greatly among angiosperms [14][15][16], and there are significant differences even within a narrow taxonomic span [17]. Plant mitochondrial genomes are much larger than those of animals, ranging from 200 to 2500 kb, a more than 10-fold variation [15]. In most angiosperms, the mitochondrial genome size is concentrated at 300-600 kb. A large number of plant mitochondrial genomes have now been reported, which lays a foundation for comparative analysis among different plants. The mitochondrial genome of angiosperms can not only reveal phylogenetic relationships between species but can also be used to study intraspecific differentiation [18]. It is necessary to study the differences between the mitochondrial genomes of different species, or even subspecies, of higher plants and the position of mitochondria in biological evolution, in order to better understand the evolution of higher plants and their divergence times and to make full use of this knowledge in genetic breeding; this is valuable for economically important species, for species representative of biological evolution, and for scientific research. Large-scale mitochondrial genome sequencing of model species is therefore necessary and meaningful.
Most traditional Chinese medicines come from natural plants. For various reasons, confusion among varieties and uneven quality have long affected traditional Chinese medicine, seriously compromising its effectiveness, stability and safety. Researching and sorting out the varieties of traditional Chinese medicine has always been one of the basic tasks of the field. Identification of the original species based only on traditional methods such as morphological and microscopic observation is limited. In recent years, the application of DNA barcode technology to the identification of traditional Chinese medicine has made rapid progress, and molecular research on the original plants of traditional Chinese medicine has gradually become a hot topic. For example, researchers have used the mitochondrial genes nad1/b-c and nad5/de to study species identification of Gentiana macrophylla [19].
There are approximately 11 species of Arctium (Asteraceae) in the world, widely distributed in temperate regions of Eurasia. Arctium lappa and A. tomentosum are distributed in China and are used as medicinal plants as well as snacks [20][21]. A. lappa, known as burdock, is called "Niu Bang" in Chinese. Arctii Fructus, the dried ripe fruit of burdock, is a traditional Chinese medicine included in the Chinese Pharmacopoeia 2015 edition [22]. Its main effects include dispelling wind-heat, clearing the lungs, draining skin eruptions, detoxifying, and relieving sore throat; the leaves and stems of the plant are eaten raw as a snack or stewed [23]. Arctiin and its aglucone, obtained from the tap roots of burdock, can interfere with early stages of replication of the avian influenza virus and can also hamper progeny virus release in mammalian cells [24][25]. Although burdock is widely cultivated in China, its yield is not ideal and its economic benefit is poor. A. tomentosum, known as cotton or woolly burdock, is a biennial wild plant species that is distributed worldwide and is used as food and as a rich source of secondary metabolites for the pharmaceutical industry [24]. A. tomentosum, which is not included in the Chinese Pharmacopoeia 2015 edition, is widely used in Xinjiang province as a characteristic ethnic medicine in China, but it is usually regarded as an adulterant or substitute of A. lappa. At present, genetic research on Arctium plants in China focuses mainly on identification using internal transcribed spacer sequences and on whole chloroplast genome sequences [20].
In this study, we determined the complete mitochondrial genome sequences of A. lappa and A. tomentosum and compared their structures with each other and with those of three other Asteraceae plants at both the genome and gene levels, including syntenic sequence analysis, SNP, InDel and SV detection, core-pan gene analysis and phylogenetic analysis. We also analyzed the transfer of genes between the mitochondrial and chloroplast genomes in the two medicinal plants. We hope the results and conclusions can provide a theoretical basis for their identification, molecular marker development and genetic breeding.

Plant materials and mitochondrial DNA extraction
The fruits of A. lappa and A. tomentosum were collected from the herb garden of Liaoning University of Traditional Chinese Medicine (E 121° 52′, N 39° 03′) and from Urumqi, Xinjiang (E 84° 33′, N 44° 07′), respectively. Permission for biological experiments was obtained for both fruit samples. Professor Tingguo Kang of Liaoning University of Traditional Chinese Medicine identified the voucher specimens (A. lappa number: 10162190625306LY; A. tomentosum number: 10162190625307LY). The plant samples were deposited in the herbarium of Liaoning University of Traditional Chinese Medicine, and the genomic DNA was stored in the Key Laboratory of Traditional Chinese Medicine at the University (Dalian, China, 116600). The fruits were cultured in the dark at 23 ℃ and 60% humidity. When the seedling weight reached 3-5 g, the whole plant was collected, washed, frozen in liquid nitrogen, and stored at -80 ℃ until use. We used an improved extraction method [26] for mitochondrial DNA isolation.

Mitochondrial DNA sequencing and genome assembly
After DNA isolation, 1 µg of purified DNA was fragmented to construct short-insert libraries (insert size 430 bp) according to the manufacturer's instructions (Illumina) and then sequenced on the Illumina HiSeq 4000 [27] (Shanghai BIOZERON Co., Ltd). High-molecular-weight DNA was purified and used for PacBio library preparation and BluePippin size selection, and then sequenced on the Sequel sequencer.
Prior to assembly, the Illumina raw reads were first filtered to remove reads with adaptors, reads with a quality score below 20 (Q < 20), reads in which uncalled bases ("N" characters) made up 10% or more of the sequence, and duplicated sequences.
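The four filtering criteria can be illustrated with a minimal Python sketch. This is not the pipeline used in the study: the `Read` type, the `filter_reads` function and the example adaptor string are hypothetical, and the sketch assumes that "quality score below 20" refers to the mean Phred score of a read, which the text does not specify.

```python
from typing import Iterable, Iterator, List, NamedTuple


class Read(NamedTuple):
    name: str
    seq: str
    quals: List[int]  # Phred quality scores, one per base


def filter_reads(reads: Iterable[Read],
                 adaptor: str = "AGATCGGAAGAGC",  # illustrative adaptor only
                 min_mean_q: float = 20.0,
                 max_n_frac: float = 0.10) -> Iterator[Read]:
    """Yield reads that pass all four filters described in the text."""
    seen = set()  # sequences already emitted, used to drop duplicates
    for read in reads:
        if not read.seq:
            continue
        if adaptor in read.seq:                                   # adaptor contamination
            continue
        if sum(read.quals) / len(read.quals) < min_mean_q:        # assumed: mean Q < 20
            continue
        if read.seq.upper().count("N") / len(read.seq) >= max_n_frac:  # >= 10% uncalled
            continue
        if read.seq in seen:                                      # duplicated sequence
            continue
        seen.add(read.seq)
        yield read
```

Production read-trimming tools apply these checks per base or per window and handle paired reads; the sketch only shows the logic of each criterion.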
The mitochondrial genomes were reconstructed using a combination of the PacBio Sequel data and the Illumina HiSeq data, in three steps. First, the genome framework was assembled from both the Illumina and PacBio data using SPAdes v3.10.1 [28]. Second, the assembly was verified, the circular or linear nature of the mitochondrial genome was established, and any gaps were filled. Third, clean reads were mapped to the assembled mitochondrial genome to correct erroneous bases and to check for insertions and deletions.

Comparative analysis of the mitochondrial genomes
Large repeat sequence analysis
Large repeats were identified with the software REPuter (http://bibiserv.techpak.uni-bielefeld.de/computer/), with a minimum repeat length of 30 bp and an edit distance of 3. Four match types were searched for: F, forward; R, reverse; C, complement; P, palindromic.
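To make the four match types concrete, the following Python sketch classifies pairs of equal-length substrings as forward, reverse, complement or palindromic (reverse-complement) repeats. It is a simplified illustration, not the REPuter algorithm: it finds exact matches only (edit distance 0 rather than 3), uses a naive k-mer index, and the function names are hypothetical.

```python
from typing import List, Optional, Tuple

COMP = str.maketrans("ACGT", "TGCA")


def classify_pair(a: str, b: str) -> Optional[str]:
    """Return the match type relating substring b to substring a, if any."""
    if a == b:
        return "F"                      # forward: identical copy
    if a[::-1] == b:
        return "R"                      # reverse: copy read backwards
    if a.translate(COMP) == b:
        return "C"                      # complement: base-wise complement
    if a.translate(COMP)[::-1] == b:
        return "P"                      # palindromic: reverse complement
    return None


def find_repeats(genome: str, k: int = 30) -> List[Tuple[int, int, str]]:
    """List (first_pos, second_pos, type) for exact k-mer repeats."""
    seen = {}   # k-mer -> first position at which it occurred
    hits = []
    for j in range(len(genome) - k + 1):
        kmer = genome[j:j + k]
        comp = kmer.translate(COMP)
        probes = (("F", kmer), ("R", kmer[::-1]), ("C", comp), ("P", comp[::-1]))
        for label, probe in probes:
            i = seen.get(probe)
            if i is not None:
                hits.append((i, j, label))
                break
        seen.setdefault(kmer, j)
    return hits
```

With `k=30` the sketch mirrors the minimum length used here; a short toy call such as `find_repeats("AACGTTTTCGTT", k=4)` is easier to follow by hand.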
SNP and InDel detection
A SNP is a DNA sequence polymorphism caused by variation at a single nucleotide at the genome level. An InDel is the insertion or deletion of a small segment in the genome. To identify sequence variation in the known genes and the ORFs between the A. lappa and A. tomentosum mitochondrial sequences, SNPs and InDels were detected. The SNP annotation of A. lappa with A. tomentosum as the reference sequence (Additional file 10: Table S10) showed that, within coding regions, there were no synonymous or nonsynonymous mutations at start or stop codons, and no nonsense mutations. There were 6 synonymous and 13 nonsynonymous mutations within genes, and 32 intergenic mutations. The InDel annotation of A. lappa with A. tomentosum as the reference sequence (Additional file 11: Table S11) showed that only 3 InDels existed, all between genes, and that none of them caused a gene mutation.
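The distinction between synonymous, nonsynonymous and nonsense SNPs can be sketched in Python as follows. The sketch assumes the standard genetic code and ignores RNA editing, which is common in plant mitochondria; `classify_snp` and its arguments are hypothetical names, not part of the annotation software used in the study. Note that the counts reported above (6 synonymous, 13 nonsynonymous, 32 intergenic) sum to the 51 SNPs detected.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption: no RNA editing).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify_snp(ref_cds: str, pos: int, alt: str) -> str:
    """Classify a substitution at 0-based position `pos` of a coding sequence."""
    offset = pos % 3
    start = pos - offset
    ref_codon = ref_cds[start:start + 3]
    alt_codon = ref_codon[:offset] + alt + ref_codon[offset + 1:]
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"       # amino acid unchanged
    if alt_aa == "*":
        return "nonsense"         # premature stop codon introduced
    return "nonsynonymous"        # amino acid changed
```

For the toy coding sequence `ATGGCTTGGTAA`, changing position 5 to `C` leaves alanine unchanged, changing position 3 to `A` replaces alanine with threonine, and changing position 8 to `A` turns the tryptophan codon into a stop codon.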
Syntenic sequence analysis and SV analysis between mitochondrial genomes of A. lappa and A. tomentosum
Syntenic sequences between genomes make it possible to observe insertions and deletions in the genome of the target species relative to the reference genome. The statistics of collinear alignment coverage for the five Asteraceae plants (Table 3) showed that the aligned regions of the A. lappa and A. tomentosum mitochondrial genomes accounted for 100% of the whole genome, and Fig. 4 illustrates that the two were completely collinear. When the mitochondrial genomes of A. lappa and A. tomentosum were compared alongside those of three other Asteraceae plants, the two Arctium genomes showed a very high similarity to each other but differed greatly from those of the other three plants: the regions aligned with the Helianthus annuus mitochondrial genome accounted for only 51.7% of the whole genome, in 96 alignment blocks. Additional file 14-15: Figures S1-2 showed that the mitochondrial genomes of the Arctium plants had lower collinearity with that of Helianthus annuus and slightly higher collinearity with that of Chrysanthemum boreale (74.56% and 74.53%). This indicated that the mitochondrial genomes of plants in different genera varied greatly even within the single family Asteraceae. The microbial genome has dense functional genes, and structural variations cause the loss or alteration of multiple gene functions, resulting in changes in microbial phenotypes, functional differences and pathogenicity. To further clarify the difference between the two mitochondrial genomes, SVs were investigated between A. lappa and A. tomentosum using A. tomentosum as the reference (Fig. 5). No translocations, inversions, translocation + inversion events or deletions were found in the mitochondrial genomes of A. lappa and A. tomentosum. The results indicated that the mitochondrial genome similarity between A. lappa and A. tomentosum was very high.
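The coverage percentages quoted above are the fraction of a genome that falls inside at least one alignment block. A minimal Python sketch of that calculation is given below, assuming blocks are supplied as half-open `(start, end)` intervals on one genome; the function name and input format are assumptions for illustration and do not reflect the actual output format of the alignment software.

```python
from typing import Iterable, Tuple


def aligned_fraction(blocks: Iterable[Tuple[int, int]], genome_length: int) -> float:
    """Fraction of the genome covered by the union of alignment blocks."""
    covered = 0
    cur_start = cur_end = None
    for start, end in sorted(blocks):
        if cur_end is None or start > cur_end:
            # Disjoint from the current merged block: close it and start a new one.
            if cur_end is not None:
                covered += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent: extend the current merged block.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        covered += cur_end - cur_start
    return covered / genome_length
```

Merging overlapping blocks before summing avoids counting the same bases twice, which would otherwise allow coverage to exceed 100%.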
Gene transfer between chloroplast genome and mitochondrial genome in two Arctium plants
Exchanges of material and information occur between subcellular units or organelles in eukaryotic cells to coordinately regulate various life activities of cells [48]. Recent studies have shown that information exchange and transfer between chloroplasts and mitochondria exists in plants and induces programmed cell death (PCD), but the mechanism of action has not been fully elucidated [49]. In both A. lappa and A. tomentosum, gene exchange and transfer were observed between the chloroplast and mitochondrial genomes (Fig. 6-7). Among the transferred sequences, the 48 segments with a similarity of at least 80% had a total length of 8229 bp; the shortest segment was 45 bp and the longest was 2532 bp (Additional file 12-13: Table S12-13). This phenomenon has also been found in other plants such as Ginkgo biloba and Salvia miltiorrhiza.

Core, Specific and Pan Gene analysis
The homologous genes present in all samples are regarded as common genes (Core genes). Removing the common genes leaves the non-common genes (Dispensable genes), and the specific genes are those found in only one sample. All non-shared genes combined with the shared genes constitute the Pan genes. The core genes and the specific genes are likely to correspond to the commonality and the distinctive characteristics of the samples, respectively, and can be used as a basis for studying functional differences between samples. Core genes and specific genes were analyzed for A. lappa, A. tomentosum and three other Asteraceae plants (Fig. 8). There were 354 genes in total and 22 core genes across these five Asteraceae plants, with 1, 2, 0, 1 and 0 specific genes for A. lappa, Helianthus annuus, Chrysanthemum boreale, Diplostephium hartwegii and A. tomentosum, respectively. There were 95 Dispensable genes. The specific gene of A. lappa was orf115a, those of Helianthus annuus were orf873 and rps12, and that of Diplostephium hartwegii was rps19. The number of Pan genes was 117. These core and specific genes are likely to correspond to the commonality and characteristics of these five plants and can provide a basis for the study of functional differences among species.
MUMmer software was used to compare the 5 genomes, comprising the A. lappa and A. tomentosum genomes and 3 Asteraceae genomes downloaded from NCBI (Chrysanthemum boreale, GenBank accession number: NC039757; Diplostephium hartwegii, GenBank accession number: NC034354; Helianthus annuus, GenBank accession number: NC023337), and the large-scale collinearity relationships between the genomes were determined. LASTZ was then used to compare the regions, confirm the local positional arrangement, and find regions of translocation (Translocation/Trans), inversion (Inversion/Inv), and translocation + inversion (Trans + Inv).
The mitochondrial genomes of A. lappa and A. tomentosum were compared using MUMmer software, and the regions were then compared using LASTZ to find SVs from the regional comparison results.
Gene transfer between chloroplast genome and mitochondrial genome
The chloroplast genomes of A. lappa and A. tomentosum were each compared with the corresponding mitochondrial genome using BLASTN, with an E-value threshold of less than 1e-10.
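As a rough illustration of how candidate transfer segments could be selected and summarized, the Python sketch below parses BLASTN results in the standard 12-column tabular layout (percent identity in column 3, alignment length in column 4, E-value in column 11) and keeps hits with identity of at least 80% and an E-value below 1e-10. The function name and the returned summary keys are hypothetical; whether the study used this output format or counted segments in exactly this way is an assumption.

```python
from typing import Dict, Iterable, Optional


def summarise_transfers(lines: Iterable[str],
                        min_identity: float = 80.0,
                        max_evalue: float = 1e-10) -> Dict[str, Optional[int]]:
    """Count and measure BLASTN hits that pass the identity and E-value filters."""
    lengths = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.rstrip("\n").split("\t")
        pident, length, evalue = float(fields[2]), int(fields[3]), float(fields[10])
        if pident >= min_identity and evalue < max_evalue:
            lengths.append(length)
    if not lengths:
        return {"count": 0, "total": 0, "shortest": None, "longest": None}
    return {"count": len(lengths), "total": sum(lengths),
            "shortest": min(lengths), "longest": max(lengths)}
```

Applied to real hit tables, the same four summary values correspond to the segment count, total length, and shortest and longest segment lengths of the kind reported for the two species.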

Core, Specific and Pan Gene analysis
Using cd-hit (v4.6.1, http://cd-hit.org) software, the protein sequences of the samples to be analyzed were clustered, with screening thresholds set on identity and alignment length (identity > 50% and coverage > 50% required for clustering). The clusters of all protein sequences were obtained from the results of the software analysis. Core and Pan gene sets were constructed by comparing the protein sequences of the A. lappa and A. tomentosum genomes and the 3 other Asteraceae genomes.
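Once protein sequences have been clustered, the core, dispensable, specific and pan gene sets follow from which samples are represented in each cluster. The Python sketch below shows this partition under the definitions used in this paper (dispensable genes are all non-core clusters, and therefore include the specific genes). The input mapping from cluster identifier to the set of samples it contains is a hypothetical intermediate, not the native cd-hit output format.

```python
from typing import Dict, List, Set, Tuple


def partition_clusters(clusters: Dict[str, Set[str]],
                       samples: List[str]
                       ) -> Tuple[Set[str], Set[str], Dict[str, Set[str]], Set[str]]:
    """Split gene clusters into core, dispensable, per-sample specific, and pan sets."""
    all_samples = set(samples)
    # Core: clusters with at least one member from every sample.
    core = {cid for cid, members in clusters.items() if members == all_samples}
    # Dispensable: everything that is not core (includes sample-specific clusters).
    dispensable = set(clusters) - core
    # Specific: clusters whose members come from exactly one sample.
    specific = {s: {cid for cid, members in clusters.items() if members == {s}}
                for s in samples}
    # Pan: the union of core and dispensable clusters.
    pan = core | dispensable
    return core, dispensable, specific, pan
```

Under these definitions the pan set size is always the sum of the core and dispensable set sizes, which is consistent with the reported 22 core, 95 dispensable and 117 pan genes.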

Phylogenetic Analysis
In order to determine the phylogenetic position of Arctium relative to other genera in Asteraceae, a phylogenetic tree was constructed based on the whole mitochondrial genomes of 28 species: A. lappa and A. tomentosum, 25 species from other genera in Asteraceae and other Asterids, and 1 outgroup species (Ginkgo biloba). All GenBank accession numbers are listed in Additional file 1: Table S1. PhyML v3.0 was used to construct the tree by the maximum likelihood (ML) method with Bayes correction, and bootstrap values were calculated from 1000 bootstrap replicates [41].

Results
Sequencing and assembly of the mitochondrial genome in A. lappa and A. tomentosum

Annotation and general characteristics of A. lappa and A. tomentosum mitochondrial genome
After annotation, the master circle images were generated with OrganellarGenomeDRAW v1.2 [40]. In A. lappa, 131 genes were identified, including 98 ORFs, 6 ribosomal RNAs and 25 transfer RNAs; in A. tomentosum, 130 genes were identified, including 97 ORFs, 6 ribosomal RNAs and 25 transfer RNAs (Figs. 1, 2). Compared with A. lappa, A. tomentosum had one more protein-coding sequence annotated in the NR and GO databases, and the numbers annotated in the other databases were the same. The mitochondrial genomes of A. lappa and A. tomentosum were each assembled into a single circular molecule, of 312598 bp and 312609 bp, respectively (Table 1). Overall, A. lappa had only one more gene than A. tomentosum. The known genes in the A. lappa and A. tomentosum mitochondrial genomes showed a one-to-one correspondence [3]. ORFs encoding proteins of 100 amino acids or more were identified from the assembled mitochondrial genomes. ORFs and other components of the two genomes are listed in Additional file 4: Table S4. There were 9 introns in the mitochondrial protein-coding genes of A. lappa and 21 introns in the whole mitochondrial genome; the corresponding numbers in A. tomentosum were 8 and 20. The rps3 gene of A. lappa contained 1 intron, while that of A. tomentosum had none. The numbers of introns and exons in the other genes of the two species were the same, and their lengths were the same or similar. nad7, nad1 and nad2 each had 4 introns and 5 exons (Additional file 5: Table S5).
Comparative analysis of the mitochondrial genomes

Repeat sequence analysis
Large repeat analysis
Many repeats are present in gene deserts, although whole-genome sequencing has shown that they can occur in functional regions as well [42][43]. Repeats of more than 30 bases were considered large repeats. From the results (Additional file 6-7:
SSR analysis
SSR is a PCR-based, highly efficient molecular marker technique [44]. Owing to uniparental inheritance, together with other characteristics, it has been extensively used in species identification and genetic diversity analysis. SSRs are very abundant in eukaryotic genomes and are often randomly distributed in nuclear DNA. SSRs are also abundant in plants and evenly distributed across the whole plant genome, but their frequency varies greatly among plants. SSRs are widely used because they are neutral markers with highly variable repeat numbers and relatively conserved flanking sequences. The technique is easy to operate, highly repeatable, and shows codominant inheritance among alleles. SSR marker technology is the best choice for evaluating the genetic diversity of crops [45][46][47]. In this analysis, the minimum distance between two SSRs was set to 100 bp. From the
[20], Diplostephium hartwegii, Chrysanthemum boreale and Helianthus annuus were clustered in one clade, and the first two species had a closer relationship, implying that they are very closely related. At the same time, they were clustered in one large clade with A. lappa and A. tomentosum. In addition, other closely related species had similar evolutionary positions. Three species of Nicotiana and two species of Solanum were also clustered into one branch. Bupleurum falcatum and Daucus carota, two species belonging to the Umbelliferae family, did not cluster together. Bupleurum falcatum and Vaccinium macrocarpon clustered together, but the bootstrap value was only 79, while Daucus carota clustered with Asteraceae.
Such data may help with subsequent botanical work on plant classification.
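Returning to the SSR analysis described in the repeat sequence section, the detection step can be sketched in Python with regular expressions. The minimum repeat counts per motif length in `MIN_REPEATS` are illustrative assumptions, since the thresholds are not stated in the text, and treating SSRs closer than 100 bp as one group is only one possible reading of the 100 bp distance parameter. The function names are hypothetical.

```python
import re
from typing import Dict, List, Tuple

# Assumed thresholds: motif length -> minimum number of tandem copies.
MIN_REPEATS: Dict[int, int] = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}

Hit = Tuple[int, int, str]  # (start, end, motif), end exclusive


def is_primitive(motif: str) -> bool:
    """True if the motif is not itself a repetition of a shorter unit."""
    n = len(motif)
    return not any(n % d == 0 and motif == motif[:d] * (n // d) for d in range(1, n))


def find_ssrs(seq: str) -> List[Hit]:
    """Find perfect SSRs with motif lengths 1-6 using the assumed thresholds."""
    hits = []
    for size, min_rep in MIN_REPEATS.items():
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_rep - 1))
        for match in pattern.finditer(seq):
            if is_primitive(match.group(1)):  # skip e.g. "AA" repeats (mononucleotide runs)
                hits.append((match.start(), match.end(), match.group(1)))
    return sorted(hits)


def group_by_distance(hits: List[Hit], min_distance: int = 100) -> List[List[Hit]]:
    """Group sorted SSRs whose separation is below min_distance (assumed meaning)."""
    groups: List[List[Hit]] = []
    for hit in hits:
        if groups and hit[0] - groups[-1][-1][1] < min_distance:
            groups[-1].append(hit)
        else:
            groups.append([hit])
    return groups
```

The number of hits and the sum of `end - start` over all hits correspond to the SSR count and total SSR length summarized for each genome.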

Discussion
General characteristics of A. lappa and A. tomentosum mitochondrial genome
By determining the mitochondrial genomes of A. lappa and A. tomentosum, we found that the mitochondrial genome structures of the two species were very similar. The mitochondrial genome length of the two species was more than 300 kb, which was longer than those of the previously reported Asteraceae plants Chrysanthemum boreale (211002 bp), Helianthus annuus (300104 bp) and Diplostephium hartwegii (277718 bp) [11], and shorter than that of the medicinal plant Salvia miltiorrhiza (499236 bp) [50]. Mitochondrial genome lengths thus differed greatly among plants of different families and genera, but differed little within the same genus (Arctium), as between A. lappa and A. tomentosum. In addition, the mitochondrial genome structures of A. lappa and A. tomentosum were very similar in GC content, the composition of coding genes, and the composition of RNA genes. There was only one ORF gene difference between the two species. The similarity of the mitochondrial genomes of tomatoes and their wild varieties can reach 98%, whereas the mitochondrial genomes of the pepper male sterile line 138A and its maintainer line 138B are quite different [3]. This might be related to the low rate of variation in the mitochondrial genome and the occurrence of gene recombination and rearrangement [11], and it might also indicate that A. lappa and A. tomentosum originated from the same maternal line and later evolved into two different species. The annotated genes of A. lappa and A. tomentosum were also very similar: A. lappa had only one more gene, orf115a, than A. tomentosum. The conserved genes of both were similar to those of Lactuca plants, except that A. lappa and A. tomentosum lacked the rpl16 gene and had one more gene, atp8; all of them had the pseudogene sdh4.
An open reading frame (ORF) is a stretch of DNA sequence with the potential to encode a protein; it begins at a start codon and continues uninterrupted until a stop codon. A. lappa and A. tomentosum were annotated with 98 and 97 ORF genes, respectively, which differed greatly from the numbers reported for some other plants, such as melon with only 2 [51]. These genes are the basis of our future research on the two Arctium plants. In addition, A. lappa and A. tomentosum had 25% GC content, which was lower than that of other plants, such as the mitochondrial genomes of Saussurea involucrata (29%) and Ginkgo biloba (50.36%) [52][53], and which might be related to the stability of the mitochondrial DNA. Introns in eukaryotes can be used to study phylogenetic evolution, evolutionary distance and gene expression regulation [54]. Intron-containing genes were found among the mitochondrial genes of A. lappa and A. tomentosum. The main difference between them was in the rps3 gene, which contained an intron in A. lappa but not in A. tomentosum. This gene also has one intron in Ginkgo biloba, where the gene nad1 also has four introns [53], suggesting that mitochondrial genes might lose introns during evolution; this phenomenon could be used to study the evolutionary relationships of species, as well as for classification and identification.
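The ORF definition above, combined with the 100-amino-acid minimum used during annotation, can be illustrated with a short Python sketch. It is a simplification made for illustration: it scans only the three forward-strand reading frames, assumes ATG as the only start codon and the standard stop codons, and ignores RNA editing and the reverse strand. The function name is hypothetical and this is not the annotation tool used in the study.

```python
from typing import List, Tuple

STOPS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str, min_aa: int = 100) -> List[Tuple[int, int]]:
    """Return (start, end) of forward-strand ORFs encoding at least min_aa residues.

    `end` is exclusive and includes the stop codon.
    """
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                          # first start codon opens the ORF
            elif start is not None and codon in STOPS:
                if (i - start) // 3 >= min_aa:     # codons before the stop = residues
                    orfs.append((start, i + 3))
                start = None                       # look for the next ORF in this frame
    return orfs
```

Lowering `min_aa` makes the behaviour easy to check on short toy sequences, for example `find_orfs("CCATGGCTGCTGCTGCTTAAATGTGA", min_aa=5)`.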

Comparative analysis of the mitochondrial genomes
The presence of a large number of repetitive sequences in the mitochondrial genome can promote intramolecular sequence recombination. As a result, isomeric genomic sequences and sub-genomic sequences derived from small circular molecules are generated in mitochondria [55]. SSR sequences are abundant in plants, highly repeatable, simple in structure and relatively conserved, and are widely used in species identification and in genetic analysis of individuals and populations [56][57]. For example, some researchers have used mitochondrial SSRs to study the genetic diversity of Salvia miltiorrhiza from different origins, showing that Salvia miltiorrhiza from different origins had genetic diversity and that the average versatility was high [58]. Both the A. lappa and A. tomentosum mitochondrial genomes contained a large number of SSRs, which could be used to analyze their genetic diversity and lay a foundation for quality research. The mitochondrial genome repeats of A. lappa and A. tomentosum were basically identical, which indicated that the mitochondrial genome could remain relatively static for a long time but could also show diversity due to rearrangement, sequence loss and unknown sequences [11]. Therefore, analysis of SNPs (single nucleotide polymorphisms) and InDels in the A. lappa mitochondrial genome, with the A. tomentosum mitochondrial genome as the reference, could reveal the diversity of their mitochondrial genomes. SNPs are highly stable, and those located in coding regions are even more so [59]; they can be used to analyze the evolutionary relationships of species as well as for identification and typing between or within species. SNPs existed between A. lappa and A. tomentosum, and these SNPs could be used to distinguish the two species. There were only 3 small InDels between A. lappa and A. tomentosum, and they were located in the nad1 gene.
This was consistent with the finding that a large number of small fragments of SNPs had been found in both sterile and fertile lines of Hibiscus cannabinus, most of them between genes [60], indicating that InDel differences among Arctium species were relatively small.

Conclusions
The mitochondrial genomes of A. lappa and A. tomentosum were similar in size, 312598 bp and 312609 bp, respectively, and the numbers of annotated genes were 131 and 130, respectively. The ORF genes of the two differed, which could provide a theoretical basis for the development of molecular markers. Both mitochondrial genomes contained a certain number of SSRs and large repeat sequences, which could provide a basis for future genetic diversity and identification analysis of both species. The mitochondrial genome of each species showed gene exchange and transfer with its own chloroplast genome. In addition, the mitochondrial genomes of the two species showed complete collinearity with each other, but they differed greatly from the mitochondrial genomes of three plants of different genera in the same family, with translocations, inversions, and translocation + inversion events. The position of A. lappa and A. tomentosum in the evolutionary analysis agreed with that of traditional classification. In future studies, we can search for genes relevant to molecular identification barcodes and molecular breeding of Arctium plants based on their mitochondrial genomic characteristics.

Declarations

Figure 1
Mitochondrial genome map of A. lappa.

Figure 2
Mitochondrial genome map of A. tomentosum. The outer circle shows the position coordinates of genome components such as genes and non-coding RNAs, with the corresponding gene names; the inner circle shows the genomic GC content.

Figure 6
Gene exchange between the chloroplast genome and mitochondrial genome of A. lappa. The color of each connecting line is selected according to ratio = score/max, with the following ranges: blue <= 0.25, green <= 0.50, orange <= 0.75, red > 0.75. The warmer the color, the higher the similarity. chl is the abbreviation of chloroplast genome and mito is the abbreviation of mitochondrial genome.

Figure 7
Gene exchange between the chloroplast genome and mitochondrial genome of A. tomentosum. The color of each connecting line is selected according to ratio = score/max, with the following ranges: blue <= 0.25, green <= 0.50, orange <= 0.75, red > 0.75. The warmer the color, the higher the similarity. chl is the abbreviation of chloroplast genome and mito is the abbreviation of mitochondrial genome.
Molecular phylogenetic tree of 28 species based on the whole mitochondrial genomes. The tree was constructed via a maximum likelihood analysis using PhyML v3.0 with 1000 bootstrap replications.