Morphological and Molecular Characterization of the ‘var. mysorensis’ Form of Anopheles stephensi

Background: Anopheles stephensi Liston (1901) is the major malaria vector in Asia and, more recently, in some regions of Africa. This species includes three biological forms, namely “type”, “intermediate” and “mysorensis”, with varying degrees of vector competence for malaria parasites. To identify the sibling form of our An. stephensi lab strain, we used the morphological features of eggs and several genetic markers, i.e. Obp1 (odorant binding protein 1), mitochondrial cytochrome oxidase subunits 1 and 2 (COI and COII), and the nuclear internal transcribed spacer 2 locus (ITS2).
Methods: Eggs were collected from individual mosquitoes (n = 50) and the number of ridges was counted under a stereomicroscope. DNA was extracted from female mosquitoes. After amplification of fragments of the different genetic markers (Obp1, COI, COII and ITS2), the PCR products were purified and sequenced using Sanger sequencing technology. Phylogenetic analysis was performed after aligning the query sequences against the sequences submitted to GenBank using bioinformatics software.
Results: The number of ridges on each egg float ranged from 12 to 13, which corresponds to the mysorensis form. Sequence analysis of COI, COII and ITS2 showed 100%, 99.46% and 99.29% similarity, respectively, of our strain to other strains of An. stephensi from China, India and Iran. All the sequences of the Obp1 intron I region matched 100% with the previously submitted sequences for An. stephensi sibling C (mysorensis form) from Iran and Afghanistan.
Conclusion: The current study describes in detail the morphological and molecular characteristics (sequence data) of the ‘mysorensis’ form of An. stephensi, which could be helpful in elucidating its classification and in differentiating it from other biotypes of the same species and from similar anopheline species. Our findings confirmed OBP1 as the only genetic marker that successfully recognized our lab strain (mysorensis form/sibling C). Thus, OBP1 alone may be highly useful in similar studies in the future. Future studies should involve the development of appropriate and reliable molecular keys (for sibling species identification) complementary to morphological keys.

Background
A wide variety of medically important insects belong to cryptic species complexes whose members are difficult or impossible to distinguish on the basis of morphological features. These species are closely related but reproductively isolated and differ considerably in behavior [1,2,3]. For instance, many anopheline mosquitoes that act as vectors for malaria parasites are members of species complexes [4,5,6]. The vector and non-vector mosquitoes of a species complex most often occur sympatrically. Initially (1950s), only An. gambiae from Africa and An. maculipennis from Europe were identified as species complexes, but numerous malaria vectors are now recognized as sympatric species in the Asian-Pacific region [7]. Among such species complexes, Anopheles stephensi is the most dominant in the Middle East, the Indian subcontinent, China, Myanmar and Thailand, and it also occurs in Ethiopia and Djibouti [8,9].
An. stephensi has three biological forms, i.e. mysorensis, intermediate and type, distinguished by egg dimensions and the number of ridges on the egg float [3]. The type form is an efficient vector of malaria in urban areas, whereas mysorensis is a poor vector (highly zoophilic) and is limited to rural areas, although it is susceptible to P. vivax VK210B [3,5,10]. The intermediate form is reported from rural and peri-urban areas, with very little information about its vectorial capacity. This complexity and ambiguity of many Anopheles species has contributed significantly to the worsening scenario of human malaria in the Asian-Pacific region [8,11,12], with global malaria cases increasing from 217 million (2016) to 219 million (2017) [13]. In addition, the heavy use of insecticides has further aggravated the global malaria situation [4]. Much progress has recently been made toward malaria control based on gene drive, but the remaining challenge is how to drive transgenes into wild mosquito populations, because few of the ~30-40 anopheline vector species are amenable to genetic manipulation [4]. Reproductive isolation prevents the flow of genes from one mosquito population to another [4]. Similarly, invasion by a different mosquito species or form from adjoining areas may undermine malaria control strategies in an area; in other words, control of the urban type form of An. stephensi may be negatively affected by invasion of the intermediate or mysorensis form, and vice versa [14]. Therefore, it is very important to determine which member of a species complex a vector mosquito belongs to, as the differences in ecology, behavior and vector competence among biological forms can considerably affect both disease transmission and the success of vector control methods [3]. Information on the population genetics of An. stephensi is still limited [15].
Mitochondrial cytochrome oxidase subunits 1 and 2 (COI and COII), the ribosomal internal transcribed spacer 2 (rDNA-ITS2) and the domain-3 (D3) loci are the major molecular markers used in the literature, but none of them has distinguished the three forms of An. stephensi [16]. Moreover, overlap in the number of ridges on the egg surface (used as a morphological key) may create ambiguity in identification [3]. Thus, in addition to the morphological features of eggs, we used the COI, COII, ITS2 and OBP1 markers to determine their respective robustness in recognizing and distinguishing the sibling form of our An. stephensi lab strain. The data generated in this study (sequences of our lab strain, An. stephensi sibling C) may contribute to knowledge of the mysorensis form of the An. stephensi species complex.

Colony maintenance
The colony of An. stephensi (Hor strain) has been maintained for over 6 years in a 30 × 30 × 30 cm cage in the laboratory at Sun Yat-sen University, Guangzhou, Guangdong Province, China. The rearing conditions were 28 ± 2 ℃, 70 ± 5% RH and a 12:12 h (L:D) photoperiod, with a 10% (w/v) sugar solution. Larvae were reared in plastic trays (30 × 40 × 8 cm) with deionized water and fed IAEA 2 larval food in accordance with the standard procedure [17].

Mosquito feeding, collection and morphological study of eggs
Five to seven days after adult emergence, the female mosquitoes were allowed to feed on anesthetized white mice (Kunming strain) for 30 minutes to initiate egg development. After feeding, about 50 engorged females were isolated and kept in individual 50 mL plastic tubes (one mosquito per tube) with damp paper at the bottom for egg collection. Each tube was marked with the date and a number. The tubes were provided with cotton soaked in 10% sugar solution. After three days, the adult female mosquitoes were processed for molecular analysis. About 50 eggs at a time were mounted on a slide with a drop of water and examined under a stereomicroscope at 40× magnification (bright-field illumination) to count the number of ridges on one side of each egg, as described previously [3,15]. A total of 500 eggs were examined. The previously defined criteria for identifying the three biological forms on the basis of the number of ridges on floating eggs are 10-15 (mysorensis), 15-17 (intermediate) and 17-22 (type) [3,18].
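
The ridge-count criteria above can be expressed as a small lookup. The following is a minimal sketch in Python and is not part of the original protocol; the names `RIDGE_RANGES` and `forms_for_ridge_count` are illustrative assumptions. Because the published ranges share their endpoints (15 and 17), the function returns every form compatible with a count rather than forcing a single label.

```python
# Inclusive ridge-count ranges for the three biological forms, as cited in the text [3,18].
# The dictionary layout and function name are illustrative assumptions.
RIDGE_RANGES = {
    "mysorensis": (10, 15),
    "intermediate": (15, 17),
    "type": (17, 22),
}


def forms_for_ridge_count(ridges):
    """Return every biological form whose range contains the given ridge count."""
    return [
        form
        for form, (low, high) in RIDGE_RANGES.items()
        if low <= ridges <= high
    ]


if __name__ == "__main__":
    # Counts of 15 and 17 fall in two ranges, which illustrates the ambiguity
    # of purely morphological identification at the range boundaries.
    for count in (12, 13, 15, 17, 20):
        print(count, forms_for_ridge_count(count))
```

Under these ranges, counts of 12 or 13 ridges are compatible only with the mysorensis form, whereas a count at a shared boundary would need a second line of evidence such as a molecular marker.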

DNA extraction
After egg laying, each female mosquito was processed individually for DNA extraction using a DNA extraction kit (Dongsheng Biotech) according to the manufacturer's instructions. Briefly, one mosquito was placed in a 1.5 mL EP tube containing about 500 µL of STE buffer and a small steel ball, and homogenized (50 Hz for 30-60 seconds). Then 5 µL of this homogenate was mixed thoroughly with 18 µL of DNA extraction solution in a PCR tube and incubated for 2 minutes at room temperature. The samples were then heated in a thermal cycler at 95 ℃ for 10 minutes. Afterwards, 2 µL of neutralizing fluid was added to each sample, mixed well and incubated for several minutes at room temperature. Finally, the extracted DNA was either stored at -20 ℃ or used immediately for amplification of the target genes.

Amplification of COI, COII, ITS2, and OBP1 fragments
We performed PCR on individual mosquitoes (n = 40) to amplify the COI, COII, ITS2 and OBP1 fragments. Details of the primers and PCR amplification profiles for each marker are presented in Table 1. The PCR reactions were carried out in 25 μL volumes. ddH2O was used in place of sample DNA as the negative control. The PCR products were purified with the TaKaRa Agarose Gel DNA Extraction Kit (Japan), and the amplicons were subsequently sequenced in both directions using Sanger sequencing technology.

[20]. For OBP1, the tree was constructed based on intron I sequences. Nucleotide sequences will be available in GenBank after submission to NCBI. Of the 101 An. stephensi COI sequences submitted to GenBank, seven (from Iran, India, China and Brazil) were compatible with our lab strain sequence and were included in the sequence analysis. The An. stephensi COII sequences (n = 24) were retrieved from GenBank and aligned with the lab strain sequence; four were excluded because they were too short. An. stephensi rDNA-ITS2 sequences (n = 118), submitted from Iran, India, Iraq, Saudi Arabia and Sri Lanka, were retrieved from GenBank. Eleven partial sequences from Sri Lanka were excluded from the study. Finally, 37 representative rDNA-ITS2 sequences from GenBank and 2 new lab strain sequences (current study) were used for sequence analysis and phylogenetic tree construction.
A 120 bp fragment of the OBP1 intron I region, selected from the 845 bp sequenced region of the sequenced specimens (n = 9), was used for analysis. Sequences of An. stephensi sibling species A (KJ557463), B (KJ557452) and C (KJ557455) [16] were used as representatives for sequence comparisons and phylogenetic tree construction.

Genetic Analysis
We sequenced 13 randomly selected samples: one each for COI and COII, two for ITS2 and nine for OBP1. The sequenced COI region of the lab strain was 839 bp long, of which 758 bp could be aligned with the GenBank sequences and was used for analysis and phylogenetic tree construction.
Multiple sequence alignment showed that the similarity between the lab strain COI sequence and the GenBank sequences was 99.87-100%. There were 7 mismatches among the COI sequences: one transversion and six transitions. Interestingly, a sequence submitted directly from China by Zou et al. (2015) [22] was 100% identical to our lab strain COI sequence (Fig. 1).
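
To illustrate how a similarity percentage and a transition/transversion breakdown of this kind can be derived from two aligned sequences, the following is a minimal Python sketch. It is not the software used in this study; the function name `compare_aligned` and the toy sequences are assumptions made for illustration, and the input is assumed to be already aligned to equal length.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def compare_aligned(seq_a, seq_b):
    """Compare two pre-aligned nucleotide sequences of equal length.

    Returns (percent_identity, transitions, transversions). Columns with a
    gap or an ambiguous base in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = matches = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            matches += 1
        elif {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1  # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1  # purine<->pyrimidine
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared, transitions, transversions


if __name__ == "__main__":
    # Toy example (not real data): 758 compared positions with one A->G change.
    reference = "A" * 758
    query = "G" + "A" * 757
    identity, ts, tv = compare_aligned(reference, query)
    print(f"{identity:.2f}% identity, {ts} transition(s), {tv} transversion(s)")
```

In the toy example, a single substitution over 758 compared positions gives 757/758, or about 99.87% identity.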
Multiple sequence alignment of the rDNA-ITS2 sequences showed 97.63-100% similarity within the species, except for HQ703001 from India, which showed only 82.19-82.97% similarity with the other rDNA-ITS2 sequences of An. stephensi. The 2 new lab strain sequences were 98.71% similar to each other, although they were randomly selected from the same colony. The similarity of the new sequences to the other sequences was 97.63-99.57% (Fig. 3).
Multiple sequence alignment showed 100% similarity of the OBP1 intron I sequences among the nine An. stephensi specimens. They were 100% similar to An. stephensi sibling C (mysorensis), whereas their similarity to siblings A and B was 85% and 75.65%, respectively (Fig. 4).
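
The comparison against the three reference siblings can be sketched as a nearest-reference lookup. The Python example below is a hedged illustration only: the function names `percent_identity` and `closest_reference` are assumptions, and the short reference strings are made-up placeholders, not the GenBank intron I sequences. Sequences are assumed to be pre-aligned and of equal length.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of identical positions between two equal-length aligned sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    same = sum(1 for a, b in zip(seq_a, seq_b) if a == b)
    return 100.0 * same / len(seq_a)


def closest_reference(query, references):
    """Return (best_name, scores), where scores maps each reference name to its identity."""
    scores = {name: percent_identity(query, seq) for name, seq in references.items()}
    best_name = max(scores, key=scores.get)
    return best_name, scores


if __name__ == "__main__":
    # Placeholder sequences for illustration only (NOT real intron I data).
    toy_references = {
        "sibling A": "ACGTACGTACGTACGTACGT",
        "sibling B": "ACGTTCGAACGTTCGAACGA",
        "sibling C": "ACGAACGTACCTACGTACGA",
    }
    toy_query = "ACGAACGTACCTACGTACGA"
    best, scores = closest_reference(toy_query, toy_references)
    print("closest reference:", best)
    for name, score in scores.items():
        print(f"  {name}: {score:.2f}%")
```

A lookup like this only ranks the references supplied to it; it complements rather than replaces the phylogenetic analysis.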
Interestingly, the COI sequences of An. stephensi from Iran and India, Brazil and China were distributed in 3 different clades in the phylogenetic tree (Fig. 5). The lab strain sequence was placed in the same clade as the Chinese An. stephensi sequence. The phylogenetic tree based on An. stephensi COII sequences grouped the sequences into 3 clades (Fig. 6). The lab strain sequence was placed in a separate clade together with An. stephensi COII sequences from India and Iran (Fig. 6). When a phylogenetic tree was constructed from the rDNA-ITS2 sequences of An. stephensi, its topology was similar to that of the COI and COII trees, with 3 clades but lower bootstrap values (Fig. 7).
The sequences from the current study were placed in a separate clade together with An. stephensi sibling C (mysorensis) in the phylogenetic tree based on the OBP1 intron I sequence (Fig. 8). As in the trees constructed in previous studies [15,16], An. stephensi siblings A and B were placed in separate clades (Fig. 8).

Discussion
Anopheles stephensi is a competent malaria vector in Iran, Afghanistan, China, the Middle East, the Far East and the Indian subcontinent [2]. It has also recently been implicated in the resurgence of urban malaria in Africa [9,23], but with no details about the species complex. With regard to the malaria transmission competence of the three biological forms of An. stephensi, the type form is reported as the main malaria vector in urban areas of the Indian subcontinent [24]. On the other hand, many reports have described all three forms as efficient vectors of malaria in Iran [6,16]. Despite efficient malaria control strategies, this mosquito is expanding its geographic range. This may be attributed to the lack of precise recognition of vector Anopheles species complexes, which is crucial for malaria surveillance, control and eradication strategies [16]. Earlier studies used different methods to identify the forms of An. stephensi, for example counting the number of ridges on the eggs [18], cuticular hydrocarbons [25], chromosome karyotypes [3,26,27], and the ITS2, D3, COI and COII markers [28]. However, the inconsistent range of ridge numbers, the identical nucleotide sequences of the ITS2 and D3 loci in type and mysorensis, and the identity of COII sequences between the Indian type form and the Iranian mysorensis form suggest that these markers are not suitable for identifying the biological forms of An. stephensi [24,28,29,30,31]. Our analysis supports these results, since we were unable to determine the biological form of our lab strain using COI, COII and ITS2. Taken together, our observations suggest that these markers may be efficient in determining inter-species differences among Anopheles, but intra-species variation (the sibling species of An. stephensi) can only be recognized with more specific markers, i.e. OBP1. Such markers will be crucial for developing appropriate and reliable molecular keys (for sibling species identification) complementary to morphological keys [24].
Morphology-based identification, in contrast, may be error-prone in closely resembling sibling species such as those of An. stephensi, for example when the number of ridges overlaps with that of the intermediate biological form. It was recently shown that the Ansteobp1 intron I sequence can be used to solve this dilemma [16]. In a previous study, intron I sequences of OBP1 successfully distinguished the three biological forms despite 100% similarity at the amino acid level [15]. Here, the morphological analysis of mosquito eggs showed 12-13 ridges per egg, which corresponds to the mysorensis form of An. stephensi and agrees with earlier reports [3,18] documenting 10-15 ridges for mysorensis. Likewise, the phylogenetic analysis (Fig. 8) showed 100% similarity with the mysorensis biological form (sibling C) reported from Iran and from Afghanistan, which borders China.
Analyses based on COI, COII, ITS2 and D3 from Iran and India have revealed extensive gene flow among these variants [30,31], whereas other genetic studies (using microsatellite markers) have demonstrated significant genetic differentiation and non-significant gene flow among them, with type and intermediate being more closely related genetically [32]. These biotypes, with their taxonomic identities and differential vector competencies, may contribute to the current epidemiological situation [24]. For example, invasion by An. stephensi has contributed significantly to the recent malaria outbreaks in non-endemic areas of India, i.e. Kerala state and the Lakshadweep islands [33]. Similarly, this species invaded Sri Lanka from India, southward across the narrow Palk Strait, via water carrying eggs in boats [33,34].

[36]. Myanmar, where An. stephensi is the major malaria vector, may play a significant role in spreading this species to the contiguous parts of China. These scenarios are likely to impede the goal of malaria eradication in China. It is noteworthy that earlier only two malaria parasites (Plasmodium falciparum and P. vivax) were prevalent, but after control and elimination operations all four parasites (P. falciparum, P. vivax, P. malariae and P. ovale) have been reported from China [37,38]. Similarly, the geographic range and proportion of these vectors have also increased [36,37,38,39]. Therefore, we recommend further sustained entomological surveillance with precise identification of species complexes before and after malaria elimination certification.

Conclusions
Here we found that OBP1 is a more robust genetic marker for identifying members of the An. stephensi complex than other routine molecular markers, and we confirmed the presence of An. stephensi sibling C (mysorensis biological form) in our laboratory. Our study thus provides information on the morphological and molecular characterization of the mysorensis biological form. Comprehensive entomological surveillance with precise identification of the species complexes of An. stephensi will contribute significantly to ongoing malaria control strategies. Moreover, further studies should determine the possible divergence among the biological forms of An. stephensi and their relative vector competence for human malaria parasites.

Declarations

Figure 1
Multiple sequence analysis of COI sequence from lab strain. Bootstrap values >70 shown at nodes.

Figure 2
Multiple sequence analysis of COII sequence from lab strain. Bootstrap values >70 shown at nodes.

Multiple sequence alignment of OBP1 partial sequence of lab strain with three known biological forms of An. stephensi.

Figure 5
Phylogeny of COI sequence from lab strain. Bootstrap values >70 shown at nodes. Nodes without numbers had a value <70. Final ML Optimization Likelihood: -1252.592081.

Phylogenetic tree constructed based on OBP1 intron I region of An. stephensi sequences from the lab strain (China) and representative sequences from GenBank.