Speciation is a fundamental biological process that involves the evolution of reproductive isolation between diverging populations. Whilst the general processes of geographical isolation, genetic divergence and evolution of post-mating reproductive isolation that characterise allopatric speciation are better known, those involved in sympatric speciation, the emergence and divergence of two gene pools within a panmictic population, are still poorly understood [2, 3]. Theoretical models have shown that sympatric speciation requires genes for local adaptation to become associated with genes for mate choice [4–7]. In the face of ongoing gene flow, this association is thought to occur only under a restricted set of genetic and environmental conditions. Under some circumstances, strong divergent natural selection might negate the effect of gene flow at very small spatial scales and could be the primary driver of speciation [1, 8]. Features of genomes such as chromosomal inversions, peri-centromeric regions and other regions characterized by reduced recombination can also facilitate the association of local adaptation and assortative mating loci [2, 5–7]. In addition, sex chromosomes are thought to promote the rapid accumulation of pre- and post-mating isolation genes due to their hemizygosity and lower recombination rate. Currently, teasing out the genomic signature of the onset of speciation from the genomic processes that follow the establishment of intrinsic post-mating reproductive isolation constitutes a major challenge, and there are few natural model systems that allow it [2, 10, 11].
The Anopheles gambiae complex comprises a number of important vectors of human malaria in Africa that are separated by various degrees of reproductive isolation [12, 13]. Amongst its different cryptic taxa, two sibling species, An. coluzzii and An. gambiae s.s., formerly known as the M and S molecular forms, are undergoing speciation with gene flow and may provide the ideal model system for studying the genomic signature of pre-mating isolation independently of the processes associated with intrinsic post-mating isolation. The two sibling species interbreed freely in the laboratory and do not exhibit intrinsic post-mating barriers to reproduction in the form of hybrid inviability or sterility [16, 17]. In studies based on natural sympatric populations from Central and Eastern West Africa, hybrids were typically found to be uncommon or rare [18–20]. A longitudinal study conducted over a period of two decades in Mali shows that the reproductive isolation between M and S is unstable. The strong assortative mating that maintains the two populations is periodically disrupted by episodes of hybridization, indicating temporal variation in hybridization rates due to decreases in reproductive isolation followed by selection against hybrids. However, surveys undertaken in coastal areas of West Africa, in Senegal and The Gambia, have uncovered hybridization zones where hybrid frequencies between An. coluzzii and An. gambiae s.s. as high as 24% can be observed [21–23].
In the absence of intrinsic post-mating reproductive barriers between the sibling species, their genetic identity is thought to be maintained through the combined effects of strong assortative mating, as evidenced by studies conducted in Mali and Burkina Faso [18, 24], and of extrinsic post-mating barriers to reproduction in the form of decreased hybrid fitness. Whilst the latter has yet to be demonstrated experimentally, evidence of ecological divergence between the two sibling species has accumulated, suggesting that hybrids could incur fitness costs in a variety of ways. The two sub-taxa have been shown to differ in their preferred larval breeding sites [26–28] and in their ability to detect and avoid aquatic predators [29–32]. Entomological surveys and laboratory experiments have also shown that Sub-Sahelian populations of An. coluzzii exhibit higher tolerance to desiccation stress and dominate vector populations in drier seasons and habitats [33–35]. In the same regions, the sibling species also differ in their aestivation strategies [36, 37], which affects their survival and/or migration strategy during the dry season. However, despite these differences, the sibling species co-exist in vast areas of West Africa and are characterized by similar levels of anthropophily and endophily.
Both sibling species mate in aggregations called swarms that are initiated at dusk, usually within villages. However, they tend to differ in the height at which they form swarms and in the type of ground markers they use. Such spatial segregation of swarms has been observed in Burkina Faso and Mali and may play a major role in assortative mating between the sibling species [39–41]. However, mixed swarms are sometimes found at low frequencies [39, 42], suggesting that short-range mate recognition mechanisms must also be involved in assortative mating [38, 43]. One such mechanism could be harmonic convergence of flight tones [44, 45], whilst the possible role of contact pheromones remains to be adequately explored [38, 46].
That hybridization results in gene flow between the two sibling species in spite of strong assortative mating has been supported by genome-wide population genetic studies based first on Short Tandem Repeat (STR) markers [47, 48] and then on Single Nucleotide Polymorphisms (SNPs). The former revealed a mosaic pattern of genetic differentiation, with most of the genome lacking differentiation because of ongoing gene flow and limited areas of the genome seemingly protected from recombination [47, 48]; the latter identified those regions more precisely through their higher marker density [49, 50]. These studies revealed three highly genetically differentiated regions located near the centromeres of chromosomes X, 2L and 3L, suggesting that sympatric speciation in these incipient species probably involved the divergence of such ‘islands of speciation’, possibly containing clusters of speciation genes and protected from gene flow through recombination suppression [49, 50]. To date, these three regions are the only ones that have been detected by high-density genome scans across the whole sympatric range of the two sibling species [23, 51–54]. That current hybridization plays a role in the speciation process is further supported by recent studies that used Divergent Island SNPs (DIS) genotyping to distinguish heterozygous F1 hybrids from the recombinant genotypes of F1 + n backcrosses in sympatric populations. Studies based on DIS confirmed the occurrence of varying levels of hybridization across sympatric populations of the sibling species, translating into varying degrees of introgression [53, 54]. As an example, a recent introgression event occurred in Burkina Faso and Mali in which the entire 2L speciation island passed from An. gambiae s.s. into An. coluzzii through hybridization and selection, resulting in the transfer of important pesticide resistance loci between the two species [51, 54, 56].
Interestingly, the DIS data also suggest that these populations are now regaining their specific X, 2L and 3L pericentric regions, highlighting the fact that selection does act against recombinant genotypes. Selection against F1 + n backcrosses would also explain how pericentromeric islands of speciation remain significantly associated in the hybrid zones of coastal West Africa despite high levels of gene flow [21–23].
Whilst the extent of genetic differentiation in other areas of the genome, and how much of it is due to selection for local adaptation, is currently debated, the X, 2L and 3L speciation islands seem to have played, and still appear to be playing, a crucial role in the genomic structure of speciation in these species. The close association between the X speciation island and pre-mating isolation genes was recently demonstrated experimentally by swapping the X-island of An. coluzzii with that of An. gambiae s.s. through multigenerational selective introgression. Females from the resulting recombinant strains, which differed only at their X-chromosome island, mated strictly with males that had the matching island type. Because assortative mating was female driven and occurred in small cages, this result highlighted a short-range mechanism, possibly involving male-expressed specific cues and female choice, that is independent of the process of swarm spatial segregation.
Given the importance of spatial swarm segregation for assortative mating in natural sympatric populations of the sibling species, one would expect genes for swarming site preference also to be associated with one of the pericentromeric islands of divergence. In this study, this association was formally tested for the first time using thousands of males and females of An. coluzzii and An. gambiae s.s. collected directly from natural swarms in sympatric populations from Burkina Faso. DIS was then used to genotype all individuals. Thus, in contrast to what is commonly done, and to avoid potential bias due to circularity, An. coluzzii and An. gambiae s.s. swarms were described using individual DIS genotypes rather than relying solely on a single species diagnostic based on an X-linked locus. Thereafter, the genotypes of non-hybrid individuals and of first-generation (F1) and later-generation (F1 + n) hybrid males and females at the X, 2L and 3L islands were tested for association with swarm spatial segregation. The results demonstrate the close association between the X-island and spatial swarm segregation initiated by males. In addition, there was a strong but significantly weaker association with the 3L island. The association between swarm type and the 2L island had been broken down by recent adaptive introgression from An. gambiae s.s. into An. coluzzii. These results lend further support to an island-of-speciation mode of sympatric speciation in which core assortative mating and divergent ecological adaptation genes are genetically linked and protected from gene flow through pericentric recombination suppression.
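To make the genotype categories used above concrete, the following is a minimal illustrative sketch, not the genotyping pipeline used in this study. It assumes, purely for illustration, that each island can be summarised as a single call built from a hypothetical allele code ('C' for An. coluzzii-type, 'G' for An. gambiae s.s.-type), and that males are hemizygous at the X island. The function name, the allele codes and the decision rule are all assumptions.

```python
from typing import Dict

# Hypothetical allele codes (an assumption, not from the study):
# 'C' = An. coluzzii-type island, 'G' = An. gambiae s.s.-type island.
PARENTAL = {"C": "An. coluzzii", "G": "An. gambiae s.s."}
ISLANDS = ("X", "2L", "3L")


def classify(islands: Dict[str, str], sex: str) -> str:
    """Assign a genotype class from per-island calls.

    islands maps 'X', '2L' and '3L' to 'CC', 'GG' or 'CG'
    (or to 'C' / 'G' for the hemizygous X of a male).
    sex is 'M' or 'F'.
    """
    calls = {}
    for name in ISLANDS:
        alleles = set(islands[name])
        # Two distinct alleles means the island is heterozygous.
        calls[name] = "het" if len(alleles) == 2 else alleles.pop()

    values = set(calls.values())
    # Non-hybrid: every island carries the same parental type.
    if len(values) == 1 and "het" not in values:
        return PARENTAL[values.pop()]

    # F1: both autosomal islands heterozygous; the X is heterozygous
    # in females and carries a single parental type in males.
    autosomes_het = calls["2L"] == "het" and calls["3L"] == "het"
    if sex == "F":
        x_consistent = calls["X"] == "het"
    else:
        x_consistent = calls["X"] in PARENTAL
    if autosomes_het and x_consistent:
        return "F1"

    # Any other mixture of island types is treated as recombinant.
    return "F1+n"
```

Under this simplified rule, an An. coluzzii individual carrying the introgressed An. gambiae s.s. 2L island would be labelled F1+n, so a practical scheme would need to handle that case explicitly.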