The Anopheles coluzzii range extends into Kenya: Detection, insecticide resistance profiles and population genetic structure in relation to conspecific populations in West and Central Africa

Background Anopheles coluzzii is a primary vector of malaria found in West and Central Africa, but its presence has hitherto never been documented in Kenya. A thorough understanding of vector bionomics is important as it enables the implementation of targeted and effective vector control interventions. Malaria vector surveillance efforts in the country have tended to focus on historically known primary vectors. In the current study, we sought to determine the taxonomic status of samples collected from five different malaria epidemiological zones in Kenya as well asdescribe the population genetic structure and insecticide resistance profiles in relation to other An. coluzzi populations. Methods Mosquitoes were sampled as larvae from Busia, Kwale, Turkana, Kirinyaga and Kiambu counties, representing the range of malaria endemicities in Kenya, in 2019 and 2021 and emergent adults analysed using Whole Genome Sequencing data processed in accordance with the Anopheles gambiae 1000 Genomes Project phase 3. Where available, historical samples from the same sites were included for WGS. Results This study reports the detection of Anopheles coluzzii for the first time in Kenya. The species was detected in Turkana County across all three time points sampled and its presence confirmed through taxonomic analysis. Additionally, we found a lack of strong population genetic differentiation between An. coluzzii from Kenya and those from the more northerly regions of West and Central Africa, suggesting they represent a connected extension to the known species range. Mutations associated with target-site resistance to DDT and pyrethroids and metabolic resistance to DDT were found at high frequencies of ~60%. The profile and frequencies of the variants observed were similar to An. coluzzii from West and Central Africa but the ace-1 mutation linked to organophosphate and carbamate resistance present in An. coluzzii from coastal West Africa was absent in Kenya. Conclusions These findings emphasise the need for the incorporation of genomics in comprehensive and routine vector surveillance to inform on the range of malaria vector species, and their insecticide resistance status to inform the choice of effective vector control approaches.


Introduction
Malaria is transmitted through the infectious bite of the Anopheles female mosquito and is a major cause of morbidity and mortality in Kenya and sub-Saharan Africa in general.In 2022, there were an estimated 249 million cases of malaria worldwide with 233 million of these occurring in the WHO African region and accounting for about 94% of all cases (1).In Kenya, approximately 70% of the population is at risk of malaria with the disease accounting for an estimated 13 -15% of outpatient consultations (2).Despite concerted global efforts to control malaria, elimination remains a challenge in both low and high burden settings.Historically, Anopheles gambiae and Anopheles funestus species complexes have been known to transmit malaria in Kenya.Within the Anopheles gambiae complex, An. gambiae sensu stricto (s.s.) and An.arabiensis were considered the major vectors, with An. merus contributing to transmission in coastal Kenya.Invasive An. stephensi has also recently been detected in the country but it's contribution to malaria control is yet to be evaluated (3).
Anopheles coluzzii is another member of the An.gambiae s.l.species complex which is morphologically indistinguishable from at least ten sibling species of malaria in sub-Saharan Africa (SSA).It is responsible for a signi cant proportion of the malaria transmission across SSA along with An. gambiae and An.arabiensis, although An. funestus is of increasing concern in East and Southern Africa (4).Anopheles coluzzii was formally named in 2013 following accumulating evidence of subdivision between the previously described M (Mopti) and S (Savana) molecular forms of An. gambiae, evidence whichincluded the presence of pre-mating barriers and genome-wide divergence and independent evolutionary trajectories; consequently, the M form was assigned the name An. coluzzii and the S form An. gambiae (5).Anopheles coluzzii is widely distributed in West and Central Africa and found in sympatry with other members of the species complex and has also been documented in Somalia (6).To date however, no record of its presence in Kenya is available.Similar to other vector species within the Anopheles gambiae s.l.species complex, the distribution and role of An. coluzzii in malaria transmission as well as the development of insecticide resistance varies greatly in different settings (7)(8)(9).
The heterogeneities with respect to ecological characteristics and trophic habits of members of the An.gambiae s.l.species complex have allowed the expansion of its range and contributed to its success in malaria transmission (10,11).Compared to An. gambiae which prefers to breed in unpolluted environments typical of rural areas, An. coluzzii possess a greater capacity to survive in ecologically complex environments characterised by the presence of a variety of stressors.An. coluzzii is likely to have a greater resistance to desiccation because it predominates in arid regions (12,13).Studies have shown An. coluzzii to have greater tolerance to salinity as well as xenobiotics and ammonia pollutants in larval habitat compared to An. gambiae, enabling extension of its range beyond traditional rural settings to densely urbanised settings (14)(15)(16).With the trends in urban migration seen in many African countries expected to continue and unplanned human settlements in urban settings presenting a potential risk factor of increased malaria transmission (17), comprehensive vector surveillance is crucial.Such surveillance will improve the understanding of vector bionomics thereby enabling the implementation of targeted and effective vector control interventions (18)(19)(20).These objectives are in line with the WHO Global Vector Control Response 2017-2030 strategy's recommendation of strengthening national surveillance systems and integration with health information systems to guide vector control and effectiveness (21).
Most entomological studies and vector surveillance efforts, including those in Kenya tend to focus on the historically known primary vectors of malaria to the exclusion of unanticipated or novel vector species.Recently however, the potential role of secondary vectors in malaria transmission has been highlighted aided by the use of molecular identi cation tools (22,23).The current study used whole genome sequence (WGS) data to investigate the taxonomic relationship of samples collected across ve counties in Kenya with different epidemiological parameters.This investigation describes the rst report on An. coluzzii in Kenya while seeking to characterise their population structure and insecticide resistance pro les.

Mosquito sampling, identi cation and rearing
The study utilised archived mosquitoes collected from previous studies in addition to samples collected between December 2019 and February 2021 from ve study sites (Fig. 1).These were i) Teso in Busia County ii) Kwale in Kwale County iii) Kakuma in Turkana County iv) Mwea in Kirinyaga County and v) Thika in Kiambu County (Table ).These ve locations represent different ecological and malaria epidemiological zones within Kenya (24).Teso on the western border of Kenya with Uganda is within the lake endemic zone.Kwale in the southeast is within the coastal endemic zone.Turkana in the northwest is relatively arid and within the seasonal transmission zone.Mwea and Thika are within the central highlands with Mwea being in the seasonal transmission zone and Thika in the low-risk zone.
The gure shows the sampling locations in Kenya in relation to Plasmodium falciparum prevalence rate in 2015 standardised to the age group 2 to 10 years using data obtained from the malariaAtlas R package (25).
For both sets of mosquitoes, Anopheles gambiae s.l.larvae identi ed based on morphology (26) were collected, using standard larval 350 ml dippers, from multiple breeding sites to minimize chances of sampling siblings.The larvae were transported to the laboratory for rearing to adults (except for Turkana where rearing was carried out in the eld due to the long distance to the laboratory).Larvae were reared in water collected from the larval sampling site or dechlorinated tap water at temperatures between 28°C − 31°C and humidity between 80% − 85% and fed on nely ground Sera Vipan staple diet™ (Sera, Germany) sh food.The resultant adult mosquitoes were analysed using WGS.

Sequencing and SNP calling
Sequencing and single nucleotide polymorphisms (SNP) calling was performed following the Ag1000G phase 3 project protocol.Brie y, paired-end multiplex libraries were prepared using Illumina's DNA preparation protocol with fragmentation using Covaris Adaptive Focused Acoustics.Multiplexes of 12 tagged individual mosquitoes were sequenced in three replicates using Illumina HiSeq 2000 and the Illumina HiSeq X technologies.Reads were aligned to the AgamP4 reference genome using BWA version 0.7.15 and indel realignment and SNP calling performed using GATK version 3.7.0.Quality control lters applied included the exclusion of individuals with median coverage < 10X, with no coverage across > 50% of the reference genome, or samples identi ed as crosscontaminated by a percentage of ≥ 4.5% using the protocols set out by the AG1000G project.Only technical replicates with the best sequencing coverage were retained.Additionally, site lters de ned by the Ag1000G project were applied to exclude sites where SNP calling and genotyping was less reliable because the observed genotypes were not consistent with Mendelian inheritance in laboratory crosses.

Taxonomic assignment
To investigate taxonomic status, individual mosquitoes were assessed against two sets of ancestry-informative markers (AIMs) used to distinguish An. gambiae from its sister taxa An. coluzzii and An.gambiae/An.coluzzii from An. arabiensis using publicly available data from the Anopheles 16 genomes project (27)(28)(29).The dataset, described by the Anopheles gambiae 1000 Genomes project (30), includes a set of AIMs SNPs informative in distinguishing taxa because they are exclusive to each taxonomic group discounting multiallelic sites and those with missing data.In total, 2612 and 700 AIMs were used to differentiate An. gambiae/An.coluzzii from An. arabiensis and An.gambiae from An. coluzzii, respectively.Individuals assessed against AIMs for distinguishing An. gambiae from An. coluzzii were called as An.gambiae when the fraction of coluzzii-like calls was < 0.12 and An.coluzzii where this fraction was > 0.9.Individuals assessed against AIMs distinguishing An. gambiae/An.coluzzii from An. arabiensis were called as the latter when the fraction of arabiensis-like alleles was > 0.6.Individuals in-between these fractions represent other taxa.

Population Structure
To compare the genomic composition of An. coluzzii in Kenya with other An.coluzzii cohorts, we performed a Principal Component Analysis (PCA) dimensionality reduction on the allele counts of 100,000 biallelic SNPs equally distributed across chromosome three.Chosen SNPs had a minor allele frequency greater than 0.2% and no missing data.Using the same criteria for SNP selection, we then compared the evolutionary relationships of African An. coluzzii by constructing an unrooted Neighbour-Joining tree with a cityblock distance metric.To determine whether An. coluzzii across Africa are connected to An. coluzzii in Kenya, we computed genomic differentiation between populations using Hudson's pairwise FST (31).To further investigate whether An. coluzzii in Kenya have a similar demography to other An.coluzzii, we calculated informative summary statistics including Nucleotide diversity (), Watterson's theta (θ W ) and Tajima's D. All analyses were performed using the open source and freely available malariagen_data python package.

Insecticide Resistance
To investigate whether An. coluzzii in Kenya have target site mutations associated with insecticide resistance similar to other African An. coluzzii, we used the malariagen_python package to calculate amino acid substitution frequencies based on the occurrence of non-synonymous SNPs at genomic sites of interest.These included the gene targeted by pyrethroid insecticides, the voltage-gated sodium channel (Vgsc; AGAP004707), the glutathione S-transferase gene conferring resistance to DDT (Gste2; AGAP009194), the Resistance to dieldrin gene (Rdl; AGAP006028) and the organophosphate target gene, acetylcholinesterase (Ace1; AGAP001356).To account for sequencing error and remove substitutions unlikely to be under selection, only amino acid substitutions present at a frequency greater than 5% in at least one population were retained.

Population sampling and sequencing
A total of 1,130 individual mosquitoes were collected during this study from ve locations in Kenya (Fig. 1; Table ).All locations included relatively recent sampling (2019-2021) and samples from earlier collections (2006)(2007)(2008)(2009)(2010)(2011)(2012)(2013)(2014) were also available for Mwea, Teso and Turkana.A total of 744 mosquitoes with good quality extracted DNA were submitted for WGS, of which 564 passed all data quality control lters, with an average median coverage of 36 X and minimum of 10 X.After alignment to the AGAMP4 reference genome, we discovered a total of 83,052,633 SNPs segregating within the samples from this study, of which 43,701,680 passed all site quality lters previously established by the Anopheles gambiae1000 Genomes Project phase 3 (30)

Taxon assignment
Taxon assignment within the An.gambiae complex is challenging because taxa are morphologically indistinguishable, and conventional genetic markers are based on a single locus that does not always re ect the ancestry of the rest of the genome (28, 29).All samples in this study were morphologically identi ed as Anopheles gambiae s.l., then the genomic data used to investigate the species.Using a set of ancestryinformative markers (AIMs) previously ascertained from samples with known species status (28,29) we identi ed 498 samples as Anopheles arabiensis (arabiensis AIM fraction > 0.85; Supplementary Figure 1).Of the remaining samples, 37 were identi ed as Anopheles gambiae (coluzzii AIM fraction < 0.1) and 26 were identi ed as Anopheles coluzzii (coluzzii AIM fraction > 0.9).To provide additional con rmation of taxonomic status, we performed a principal components analysis (PCA) and constructed a neighbour-joining tree (NJT) using genomic data from this study together with samples from other African countries from the Anopheles gambiae 1000 Genomes Project and the study of Fontaine et al. (29)(Supplementary Figure 2).These analyses showed a clear grouping by species, with the position of Kenyan An. gambiae, An. arabiensis and An.coluzzii samples entirely consistent with the AIM results.On comparison of data originating from multiple taxa (29), a further 3 individuals that could not be identi ed via AIMs were con rmed to be An.quadriannulatus based on the PCA and NJT.
Anopheles coluzzii populations in West Africa are commonly found to have experienced adaptive introgression of genetic material from An. gambiae within a genomic region towards the centromere of chromosome arm 2L, driven by selection for pyrethroid target-site resistance alleles (34)(35)(36).AIM pro les revealed the majority of An. coluzzii from Kenya are either homozygous (38%) or heterozygous (46%) for introgression from An. gambiae towards the centromere of 2L (Supplementary Fig. 1).
Anopheles coluzzii has not previously been reported in Kenya but is a highly competent malaria vector in West and Central Africa.The species were detected across three different sampling time points at frequencies of 20.0% (5 out of 25) in month 2 of 2009, 20.3% (13 out of 64) in month 1 of 2019 and 6.1% (8 out of 131) in month 9 of 2019.Given the importance of this nding for malaria vector surveillance and control, we focus the remainder of this report on a full characterisation of the Kenyan An. coluzzii.Analysis of genomic data from the other Anopheles taxa sequenced in this study will be reported separately.

Geographical population structure and genetic diversity
To explore the genetic relationship between the Kenyan An. coluzzii and conspeci c populations from other countries, we combined data from this study with previous sequence data of An. coluzzii populations from inland West Africa (Burkina Faso, Mali), coastal West Africa (Côte d'Ivoire, Ghana), and Central Africa (Cameroon, the Central African Republic, Angola) (37).We used SNPs from Chromosome 3, which is free from polymorphic inversions, to perform a principal components analysis (PCA), compute a neighbour-joining tree (NJT) and quantify the degree of allele frequency differentiation (FST) between cohorts from different locations.The PCA and NJT analyses grouped the Kenyan An. coluzzii most closely with An. coluzzii from inland West Africa (Mali, Burkina Faso) and northern Cameroon (Fig. 2).The FST results were consistent with these analyses, nding the lowest FST between Kenya, Mali and Burkina Faso (FST 0.006-0.007for both comparisons; Supplementary Figure S3).These results show a lack of strong population structure between An. coluzzii from Kenya and more northerly regions of West and Central Africa.
The gure shows the analysis of geographical population structure within An. coluzzii, comparing samples from Turkana, Kenya collected in this study with reference samples from the Ag1000G project and Fontaine et al. (29).Kenyan An. coluzzii are most closely related to An. coluzzii from inland West Africa (Mali, Burkina Faso) and Northern Cameroon.a, PCA.b, NJT.
To further explore whether Kenyan An. coluzzii share a genomic pro le similar to northern West and Central Africa, we investigated whether they have the same 2La and 2Rb inversion karyotype by performing a PCA targeting these regions of the genome.An. coluzzii from Kenya have the same 2La/2Lb karyotype found in An. coluzzii from Burkina Faso and Mali only, supporting the nding that they are genetically most similar to these populations (Supplementary Figure S4).To investigate whether the Kenyan An. coluzzii have a similar demographic history to An. coluzzii populations from other countries in Africa, we computed genetic diversity summary statistics for mosquito cohorts grouped by geographical region and year of sampling.Nucleotide diversity, the density of segregating sites (Watterson's theta) and allele frequency spectra (Tajima's D) in Kenyan An. coluzzii were similar to An. coluzzii from West Africa (Nucleotide diversity, 0.026; Watterson's theta, 0.033; Tajima's D, -0.961; Supplementary Figure S5), suggesting lack of genetic isolation.

Insecticide resistance
To investigate whether Kenyan An. coluzzii have mutations associated with target-site resistance to insecticides, we computed amino acid allele frequencies across four genes encoding insecticide binding targets: Vgsc (AGAP004707), Gste2 (AGAP009194), Rdl (AGAP006028) and Ace1 (AGAP001356).We found that An. coluzzii in Kenya have the Vgsc-L995F substitution associated with resistance to DDT and pyrethroids, at 62% frequency (Fig. 3a)(38).This allele is also found at high frequency in An. coluzzii populations throughout West and Central Africa.In addition, Kenyan An. coluzzii displayed the double mutant Vgsc-V402L + I1527T substitution at 38% frequency.This double mutant is also present in other An.coluzzii populations.Additionally, Kenyan An. coluzzii have the Gste2-I114T mutation at high frequency (67%), which confers metabolic resistance to DDT (39) (Fig. 3b) and also observed at high frequency in An. coluzzii populations from West and Central Africa.
Two haplotypes have been previously associated with resistance to dieldrin in An. coluzzii, Rdl-296G/345M widely distributed across Africa and Rdl-296S/345S previously only reported from Burkina Faso (40).We observed the 296/345 substitution pair in Burkina Faso and also in Mali and Kenya, but the frequency was very low, ~ 5%, in An. coluzzii from Kenya (Supplementary Fig. 6).A mutation in the Acetylcholinesterase gene, Ace1-G280S, was previously linked to organophosphate and carbamate resistance (41).Although the mutation is present in An. coluzzii from coastal West Africa (i.e., Côte d'Ivoire and Ghana), it was not present in An. coluzzii from Kenya (Supplementary Fig. 7).

Discussion
A proper understanding of vector bionomics within the context of the malaria transmission system is important for the choice and successful implementation of vector control interventions.The current study sought to improve this understanding by analysing the distribution of An. gambiae s.l.sibling species from ve sites in Kenya, one each in the different malaria epidemiological zones.Based on analysis of whole-genome data we report, for the rst time in Kenya, the presence of Anopheles coluzzii.
Anopheles coluzzii is a highly competent malaria vector in West and Central Africa.All the An.coluzzii samples found in this study were collected from Turkana in Northwest Kenya.This included mosquitoes collected across multiple years and months, con rming the presence of An. coluzzii in the region since at least 2006 (Table ).Past failure to detect An. coluzzii in Kenya is likely associated with the fact that malaria vector surveillance and identi cation has tended to focus on historically known primary vectors using species-speci c markers.For example, in Turkana, a past study reported all mosquitoes identi ed as An.arabiensis but only 85% of samples reacted in PCR diagnostic tests (42).Failure to identify specimens in such molecular assays is usually explained away as failure in the assay itself due to factors such as poor-quality DNA or other associated factors.Our analysis exploited WGS allowing the use of a large number of markers distributed across the entire genome to detect cryptic and/or unknown species demonstrating how whole genomic sequencing can be con gured to support routine surveillance to accurately identify present and shifting vector distributions, adaptive evolutionary changes and populations structuring, integral to understanding of malaria transmission dynamics (42).
When the Kenya An. coluzzii were compared with populations from the northerly regions of West and Central Africa, they showed a lack of strong population genetic structure evidenced by clustering on the PCA and NJT and low FST, suggesting relatively unrestricted gene ow across a northerly belt spanning continental Africa.In support of this notion, Kenyan An. coluzzii had a similar genetic diversity to other West and Central African populations.Exploration of the 2La/2Lb inversion karyotypes, which has previously been linked to aridity and increases in frequency with this cline (43,44) revealed that Kenyan An. coluzzii have the same inversion karyotype found in Burkina Faso and Mali at both loci, supporting the nding of genetic similarity to these populations.Our nding that the 2La/2Rb karyotype is the same as that found in arid West and Central Africa is consistent with its environmental association, since Turkana in Kenya is arid to semi-arid, experiencing high temperatures and highly seasonal rainfall.The presence of An. coluzzii has also been documented in Somalia, which shares a similar arid to semi-arid ecosystem (6).Our ndings suggest Kenyan An. coluzzii are not an isolated population, nor have they experienced any recent bottlenecks or other distinct demographic events.
Investigation of the insecticide resistance pro les of Kenyan An. coluzzii revealed the presence of the Vgsc-L995F substitution associated with resistance to DDT and pyrethroids (38) at 62% frequency and the Gste2-I114T mutation which confers metabolic resistance to DDT (39) at a frequency of 67%.These alleles are also found at high frequency in An. coluzzii populations from West and Central Africa (45)(46)(47).In addition, Kenyan An. coluzzii displayed the double mutant Vgsc-V402L + I1527T substitution at 38% frequency also present in other An.coluzzii populations and recently observed to be increasing in frequency in An. coluzzii in Burkina Faso (48).The V402L substitution has been functionally validated to confer insecticide resistance with reduced tness cost to the mosquito when compared to L995F (49) and the fact that V402L is almost exclusively found in combination with I1527T suggests a strongly synergistic effect that may further increase tness in the presence of insecticides.
Previous studies have documented resistance to pyrethroids in An. gambiae s.l. in Kakuma (Turkana County) with enzymatic resistance being implicated (50).Both ITNs and IRS have historically been used to prevent malaria in the area.Our nding of mutations associated with resistance to insecticides is therefore not surprising and ts with the picture of widespread and increasing insecticide resistance in the country (51).The presence of the V402L + I1527T substitution in Kenya is concerning, because variants conferring stronger pyrethroid resistance could compromise the e cacy of new pyrethoid + PBO LLINs, currently considered among the most effective defense against resistant populations (52).
Increased urbanisation promotes retention of surface water and is associated with water distribution and drainage systems that provide suitable habitat for Anopheles mosquitoes.The refugee camp from which the samples were collected is a highly populated region that experiences low level local transmission and has recently suffered recurrent epidemics (42).Historically, the area has received little attention regarding malaria intervention, because the climate was considered unsuitable for known vector species, except An. arabiensis, considered a less competent vector.

Conclusion
The observation that An. coluzzii, an e cient vector of malaria which can tolerate more arid conditions and thrive in both rural and urban settings occurs in Kenya but has not been reported to date means that this vector is potentially contributing to malaria transmission in Turkana County and malaria control interventions currently in place may be ineffective against it.This nding alongside the recent nding of An. stephensi in Turkana and other regions of northern Kenya emphasizes the need for re-evaluation of the distribution, bionomics and epidemiological signi cance of the local vector populations in the country.This is the only way the country will be able to ensure vector control approaches are su ciently targeted at the myriad of Anopheles vectors responsible for transmission in the different settings in Kenya. Figures

Figure 1 Map
Figure 1

Figure 2 Population
Figure 2

Table
Sampling locations in Kenya and the number of samples successfully sequenced, grouped by taxon