Population suppression of the malaria vector Anopheles gambiae by gene drive technology: A large-cage indoor study bridging the gap between laboratory and field testing

CRISPR-based gene drives are self-sustaining genetic elements that have recently been generated in the laboratory with the aim of developing potent genetic vector control measures targeting disease vectors, including Anopheles gambiae. We have shown that a gene drive directed against the gene doublesex (dsx) effectively suppressed the reproductive capability of mosquito populations reared in small laboratory cages. These experiments, though informative, do not recapitulate the complexity of mosquito behaviour in natural environments. Additional information is needed to bridge the gap between the laboratory and the field and to validate the vector control potential of the technology. We have investigated the suppressing activity of the dsx gene drive strain Ag(QFS)1 on age-structured populations of Anopheles gambiae in large indoor cages that provide a more challenging ecology by more closely mimicking natural conditions and stimulating complex mosquito behaviours. Under these conditions, the Ag(QFS)1 drive spreads rapidly from a single release into the indoor large-cage populations at low initial frequency, leading to full population suppression within one year and without inducing resistance to the gene drive. Initial stochastic simulations of the expected population dynamics, based on life history parameters estimated in small cages, did not fully capture the observed dynamics in the large cages. We therefore used approximate Bayesian computation to better estimate population dynamics in the more realistic ecological setting of the large cages, which allows the mosquitoes to show complex feeding and reproductive behaviour. Together, these results establish a new paradigm for generating data to bridge laboratory and field studies, and form an essential component in the stepwise and sound development of gene drive based vector control tools.

Introduction
CRISPR-based gene drives are selfish genetic elements that can be used to modify entire populations of the malaria mosquito for sustainable vector control. First proposed in 2003, these elements use a mechanism of cut and paste ('homing') in the germline to facilitate their autonomous spread from a very low initial release frequency (Burt, 2003; Windbichler et al. 2011).
One potentially powerful [...]

Initially, we assessed life history traits of both Ag(QFS)1 males and females, as well as of the wild-type strain G3 of Anopheles gambiae, and assessed their longevity under large cage conditions (4.7 m³) in order to emulate more natural population dynamics (Pollegioni et al. 2020) (see Fig. 1, Supplementary Material). Considering the initial Kaplan-Meier survival estimate of wild-type G3 adult mosquitoes in 4.7 m³ cages of 2 m x 1 m x 2.35 m size and the establishment of overlapping generations with bi-weekly introductions of 400 G3 pupae with a start-up population of 800 mosquitoes, we then analysed age-structured large cage (ASL) populations with an expected mean size of ~570 adult mosquitoes as 'receiving' populations for gene drive release experiments. To mimic field-like conditions absent in small cages, the climate chambers were maintained under near-natural environmental conditions including simulated dusk, dawn and daylight, and each cage was equipped with proven swarming stimuli and a resting shelter (Facchinelli et al. 2015) (Fig. 1). Under these conditions male swarming, an important component of successful mating behaviour, was regularly induced. To mimic a hypothetical field gene drive release, we seeded Ag(QFS)1 mosquitoes over a single week (two releases) into the established 'receiving' wild-type populations at two different starting frequencies, low (12.5% initial allele frequency) and medium (25% allele frequency), alongside control cages (0% gene drive release), all in duplicate (6 cages total). The ASL population dynamics and the potential selection of drive-resistant alleles were monitored in treated and control cages until wild-type populations were fully suppressed by the gene drive in the treatments. Finally, we constructed an individual-based stochastic simulation model of the experiment to better understand the observed dynamics of the gene drive frequency and population suppression.

Mosquito strains
Two Anopheles gambiae mosquito strains were used: the wild-type G3 strain (MRA-112) and the Female Sterile Gene Drive strain, Ag(QFS)1, previously known as dsxF CRISPRh (Kyrou et al. 2018). This strain contains a Cas9-based homing cassette within the coding sequence of the female-specific exon 5 of the dsx gene (Supp. Fig. 1). The cassette includes a human codon-optimised Streptococcus pyogenes [...] (BugDorm-4) as described in Valerio et al. (2016) at 28°C and 80% relative humidity (Suppl. Fig. 2). Larvae were maintained in trays (253 x 353 x 81 mm) at a density of 200 larvae per tray using 400 mL deionized water with sea salt at a concentration of 0.3 g/L and 5 mL of 2% w/v larval diet (Damiens et al. 2012) and screened for fluorescent markers en masse using a Complex Object Parametric Analyzer and Sorter (COPAS, Union Biometrica, Boston, USA).

Large cage environment
For experimental purposes, mosquitoes were housed in a large cage environment as described in Pollegioni et al. (2020). A single large climatic chamber was equipped with six 4.7 m³ cages of 2 m x 1 m x 2.35 m (length, width, height) (Fig. 1) and maintained at 28°C ± 0.5°C and 80% ± 5% relative humidity (Fig. 1, Suppl. Fig. 2). The climatic chamber was illuminated by three sets of three LEDs (3000K, 4000K and 6500K correlated colour temperatures) controlled by Winkratos software (ANGELANTONI Industries S.p.A, Massa Martana, Italy), allowing a gentle transition between light and dark sufficient to emulate dawn and dusk. For the purpose of the current study, full light conditions (800 lux) were simulated using all LEDs and adjusted to last 11 hours and 15 minutes. Cages were additionally equipped with ambient lighting (3000K) designed to stimulate swarming, as described previously in Facchinelli et al. (2015), and a terracotta resting shelter moistened with a soaked sponge. Mosquitoes were fed on 10% sucrose and 0.1% methylparaben solution and blood-fed bi-weekly using defibrinated and heparinized sterile cow blood via the Hemotek membrane feeder (Discovery Workshops, Accrington, UK). Oviposition sites consisted of a 12 cm diameter Petri dish with a wet filter paper strip introduced 2 days after the blood meal. Mosquito pupae, food, blood and water were introduced or removed through two openings, 12 cm in diameter, at the front of each cage with no operators entering the cage. No adult mosquitoes were removed from the large cages throughout the cage trials.

Measuring the life history parameters
To assess life history parameters of wild-type G3 and Ag(QFS)1 strains, standardized phenotypic assays were performed as described in Pollegioni et al. (2020). In brief, clutch size, hatching rate, larval, pupal and adult mortality rates, as well as the bias in transgenics among the offspring of heterozygous Ag(QFS)1, were measured in wild-type G3 and Ag(QFS)1 strains in triplicate in standard small laboratory cages (BugDorm-4). Ag(QFS)1 heterozygotes used in these assays had inherited the drive allele paternally and were therefore subject to paternal, but not maternal, effects of embryonic nuclease deposition that can lead to a mosaicism of somatic mutations at the doublesex locus and a resultant effect on fitness (Kyrou et al. 2018). 150 females and 150 males were mated to wild-type mosquitoes for 4 days, blood-fed, and their progeny counted as eggs using EggCounter v1.0 [...]

To test the suppressive potential of Ag(QFS)1, we first established stable ASL populations of An. gambiae (G3 strain) housed in a purpose-built climatic chamber. Each population was initiated and maintained at the maximum rearing capacity through bi-weekly introductions of 400 G3 pupae (200 males and 200 females) over a period of 21 days ('establishment'), estimated to sustain a mean adult population of 574 mosquitoes based on the initial Kaplan-Meier estimate (Suppl. Fig. 3a). After this initial period, only progeny of these populations were used to repopulate the cages twice weekly ('restocking') for a period of 53 days ('pre-release', 74 days total), supplemented with separately reared wild-type mosquitoes when progeny numbers were too low. Each ASL population was considered stabilised after retrieving a sufficiently large and stable number of eggs to restock the population over four consecutive weeks. In detail, the receiving populations in all six cages were stabilised to produce a similar number of eggs in the 31 days before Ag(QFS)1 release, with an average egg production per cage ranging from 2262 to 5334. Bi-weekly blood meals were initiated at dusk and extended for a period of 5 hours, and oviposition sites were illuminated with blue light for egg collection 2 days later. Eggs were removed from the cages, counted, and allowed to hatch in a single tray within the climatic test chamber. For re-stocking the cage populations with wild-type pupae, a maximum of 400 randomly selected pupae were collected at the peak of pupation, manually sexed and screened, and introduced to their respective cage twice per week.
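The link between a survival curve and the expected standing adult population under repeated introductions can be illustrated with a short calculation. The sketch below is only one plausible way to obtain such an estimate: the survival values are hypothetical placeholders (not the measured Kaplan-Meier estimate), introduced pupae are treated as adults from the day of introduction, twice-weekly introductions are approximated as days 0 and 3 of each week, and the function name is our own. It is not expected to reproduce the figure of 574 reported above.

```python
def expected_standing_population(survival, cohort_size, intro_days, horizon):
    """Expected number of live adults on day `horizon`.

    survival[a] is the probability that an individual is still alive `a` days
    after introduction; ages beyond the end of the curve count as dead.
    """
    total = 0.0
    for day in intro_days:
        age = horizon - day
        if 0 <= age < len(survival):
            total += cohort_size * survival[age]
    return total


# Hypothetical daily survival curve (NOT the measured Kaplan-Meier estimate).
survival = [1.0, 0.95, 0.88, 0.78, 0.66, 0.55, 0.45, 0.36, 0.28, 0.21,
            0.15, 0.10, 0.06, 0.03, 0.01]

# Twice-weekly introductions, approximated as days 0 and 3 of every week.
intro_days = [7 * week + offset for week in range(10) for offset in (0, 3)]

# Each introduction adds 400 pupae, as in the establishment phase.
standing = expected_standing_population(survival, 400, intro_days, horizon=63)
print(round(standing))
```

Each introduced cohort contributes its size multiplied by the probability of surviving to its current age, so the standing population is the sum over all cohorts still represented in the cage.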

Ag(QFS)1 release experiments in large cages
To assess invasion dynamics of the Ag(QFS)1 strain in ASL populations of Anopheles gambiae, we performed duplicate releases designed to randomly seed ASL populations at low ( [...] week (943 and 1085, respectively), equivalent to 25% or 50% of the estimated mean pre-release adult population (on average 574 mosquitoes were present in large cages). No further releases were carried out and indoor ASL populations were maintained through restocking of 400 pupae twice per week. From then on, the ASL populations were maintained in the same way as the receiving populations had been established, with the same constant re-stocking rate from offspring. No adult mosquitoes were removed from the cages. Duplicate control cages were similarly maintained, but without release of Ag(QFS)1.

While not statistically significant (Kruskal-Wallis test, P = 0.06, ns), there was some variation in reproductive output amongst the six cages due to random effects (cage 1: mean egg number = [...] release cages were distributed to cages 2 and 5 (12.5% allelic frequency) and cages 3 and 6 (25% allelic frequency) to mitigate against potential local environmental position effects (Fig. 1).

Key indicators of population fitness and drive invasion were monitored for the duration of the experiment, including total egg output, hatching rate, pupal mortality, and the frequency of transgenics amongst L1 offspring and the pupal cohorts used for restocking. Total larvae were counted and screened for RFP fluorescence linked to Ag(QFS)1 using the COPAS larval sorter, and 1000 were randomly selected for rearing at a density of 200 per tray. Pupae positive for the gene drive element could be identified by expression of the RFP marker gene that is contained within the genetic element. Triplicate samples of up to 400 L1 larvae were stored in absolute ethanol at -80°C for subsequent analysis.
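The correspondence between the release sizes (25% or 50% of the estimated pre-release adult population) and the stated allelic frequencies (12.5% and 25%) can be checked with simple arithmetic. The sketch below assumes that released individuals are heterozygous for the drive and that the frequency is expressed relative to the allele count of the resident population (two alleles per diploid adult). Both are our assumptions for illustration, and the function name is hypothetical.

```python
def release_allele_frequency(n_released_heterozygotes, n_resident):
    """Drive allele frequency relative to the resident population.

    Assumes every released individual carries exactly one drive allele and
    that the denominator is the resident allele count (2 per diploid adult).
    """
    return n_released_heterozygotes / (2 * n_resident)


resident = 574  # estimated mean pre-release adult population
for fraction in (0.25, 0.50):
    released = fraction * resident
    freq = release_allele_frequency(released, resident)
    print(f"release of {fraction:.0%} of residents -> {freq:.1%} allele frequency")
```

Under these assumptions a release equal to a quarter of the resident population gives 12.5%, and a release equal to half gives 25%.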
Modelling
A stochastic model was set up to replicate the experimental design with respect to twice-weekly egg-laying, the initiation phase, the transgene introductions, and the subsequent monitoring phase. A full model description is given in the Supplementary Methods. In brief, daily changes to the population result from egg laying, deaths, and matings, and are assumed to occur with probabilities that may be genotype specific. Adult longevity parameters were estimated from the large cage survival assays that were performed before the gene-drive release experiments began, and after the gene-drive dynamics had run their course. We compared the data to model simulations using a suite of summary statistics (Csilléry et al. 2010; Supplementary Methods) to infer three parameters representing female fertility costs associated with the drive allele. In addition, we inferred two parameters that determined the egg production of unaffected (wildtype) females, and one parameter that determined the rate of R2 allele creation. We obtained a posterior distribution for all six parameters by retaining the 200 best fitting parameter combinations from 200,000 parameter samples generated by a Monte-Carlo algorithm (Supp. Fig. 4, Table 1).
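The retain-the-best-fits procedure described above is a form of rejection-based approximate Bayesian computation. The sketch below illustrates the general idea on a toy problem only: the simulator (exponential waiting times), the flat prior, the two summary statistics and the Euclidean distance are stand-ins chosen for brevity, not the cage model or the statistics of the Supplementary Methods, and the sample sizes are reduced from 200,000 and 200 so that it runs quickly.

```python
import random
import statistics


def simulate(rate, n_obs, rng):
    """Toy stochastic simulator: n_obs exponential waiting times."""
    return [rng.expovariate(rate) for _ in range(n_obs)]


def summary(data):
    """Summary statistics used to compare simulated and observed data."""
    return (statistics.fmean(data), statistics.median(data))


def abc_rejection(observed, n_samples, n_keep, rng):
    """Keep the n_keep prior draws whose simulations lie closest to the data."""
    target = summary(observed)
    scored = []
    for _ in range(n_samples):
        rate = rng.uniform(0.01, 1.0)  # draw a candidate from a flat prior
        sim = summary(simulate(rate, len(observed), rng))
        distance = sum((s - t) ** 2 for s, t in zip(sim, target)) ** 0.5
        scored.append((distance, rate))
    scored.sort()  # smallest distance first
    return [rate for _, rate in scored[:n_keep]]


rng = random.Random(1)
observed = simulate(0.2, 200, rng)  # pretend data generated with rate 0.2
posterior = abc_rejection(observed, n_samples=5000, n_keep=50, rng=rng)
print(statistics.fmean(posterior))
```

The retained draws form an approximate posterior sample. Keeping a smaller fraction tightens the approximation at the cost of more simulations per retained draw.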

Pooled amplicon sequencing and analysis
We previously developed a strategy to detect and quantify target site resistance based upon targeted amplicon sequencing using pooled samples of larvae (Hammond et al. 2017), and found no evidence [...]

After stabilising the receiving wild-type populations in the large cages, we seeded the cages, in duplicate, with gene drive mosquitoes at 12.5% and 25% allelic frequencies of the estimated pre-release adult population size. We also kept two cages unseeded as controls. We were able to track the inheritance of the gene drive allele by virtue of the dominant RFP marker gene. We observed substantial variability in the rise in frequency of gene drive-positive mosquitoes, regardless of starting frequency (Fig. 2g, Fig. 2h, Supplementary Data 1). We also observed an apparent 'phasing' pattern of transgene frequency between consecutive re-stockings, persisting for up to 200 days post release, that could be related, though not exclusively, to the two phased releases within a single week. The spread of Ag(QFS)1 followed a sigmoidal pattern of invasion, increasing in frequency slowly for the first 100-150 days, followed by a rapid period of invasion, and finally slowing as the drive approached fixation between 220 and 276 days after introduction in the low frequency release cages (Fig. 2g) and between 224 and 241 days after introduction in the medium frequency release cages (Fig. 2h). No gene-drive positive individuals were detected in control cages, consistent with the cages being fully isolated from one another (Supplementary Data 1).

Increase in frequency of the gene drive allele causes elimination of ASL mosquito populations
As Ag(QFS)1 approached fixation there was a rapid decline in the fraction of fertile females as the growing proportion of gene drive homozygotes, lacking a functional copy of the female isoform of the doublesex gene, develop into sterile "intersex" adults (Fig. 2d and Fig. 2e).
As the formation of homozygotes is a requirement for population suppression, a strong and unambiguous reduction in egg output occurred only after the frequency of the gene drive allele rose above 90%, culminating in complete elimination 245-311 days after release of Ag(QFS)1 in the low frequency cages (Fig. 2a) and by days 266-276 in the medium frequency release cages (Fig. 2b). By comparison, the mosquito population in the control cages maintained a stable sex ratio (Fig. 2f) and an average of more than 10,000 eggs over the final month of the experiment (Fig. 2c), while cages seeded with Ag(QFS)1 collapsed.

Adult longevity increases over the course of the large cage release experiment
No significant differences in adult survival between Ag(QFS)1 and wild-type strains were detected in large cages (P = 1.0, Kruskal-Wallis test), with 50% median mortality at day 6 (95% CI = 5-6 days) and day 11 (95% CI wild-type = 9-13 days, 95% CI Ag(QFS)1 = 11-12 days) at the beginning and the end of the large cage release experiment, respectively (Supp. Fig. 3 and Supplementary Data 2). Overall, survival in large cages is substantially lower than in small cages maintained under similar environmental and rearing conditions, where 50% mean mortality occurred at 20 days. In agreement with Pollegioni et al. (2020), our data suggest that females survive longer than males when housed in large cages.

We observed an increased adult longevity in the large cages after the year-long experiment compared to before the release (median of 11 days and 6 days, respectively; P = 0.032, Kruskal-Wallis test), irrespective of genotype.
Individuals reared in the small cages and tested under the same conditions (after the year-long experiment) showed the same adult survival as those collected from the ASL populations (for both G3 wild type and Ag(QFS)1 transgenics), suggesting the difference is due to the micro-environmental conditions of the large cages and not due to strain adaptation or the genotypes.

Parameter inference reveals drive allele female fertility costs in age-structured mosquito populations
The ASL caged populations showed a similar trend of increasing egg output over time prior to the suppressive effect of the drive (Fig. 2a-c) that may be explained by a general increase in adult survival that was observed between the start and end of the population experiment (Supp. Fig. 3). To account for these changes in the stochastic model, we assumed a small increase in adult survival over time, irrespective of genotype, based on experimental data (Supp. Fig. 3). The posterior distribution of our stochastic model is summarised in Table 1. We were particularly interested in the drive allele fertility costs, because these are potentially important to drive allele dynamics in natural populations (Beaghton et al. 2019, North et al. 2020). Fertility costs may arise from paternal and maternal effects of Cas9 deposition into the sperm or egg, or from ectopic activity of Cas9 in the soma (Kyrou et al. 2018).

The full posterior distribution indicated the presence of fertility costs, yet did not allow the relative roles of deposition and ectopic activity to be disentangled; the posterior probabilities for each factor strongly covary (Supp. Fig. 4). We therefore determined posterior estimates of transformed parameters that summarise the fertility costs of transgenic females depending on whether they had a transgenic father, mother, or if both parents were transgenic (Supplementary Methods).
The posterior mean density for the fertility cost to transgenic females whose father was transgenic was 0.35 (indicating a 35% reduction in egg output relative to wildtype females), with a 95% credible interval of 0.18-0.56 (Fig. 2i). This increased slightly to 0.39 if instead the mother was transgenic, with a much wider credible interval (0.02-0.85), and reduced to 0.18 (0.01-0.36) if both parents were transgenic. The overlap in the parent-specific estimates means we cannot determine, on the basis of these data, whether the sex of the transgenic parent makes a difference to the fertility of transgenic female offspring.

The posterior densities indicated that females typically lay around 117 eggs per batch (54-219), and around 13% of mated females laid eggs at each twice-weekly opportunity (7-20%). The posterior mean density for the fraction of non-homed gametes produced by heterozygous individuals becoming non-functional resistance alleles was around one half (49%; 27-81%).
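To make the scale of these estimates concrete, the posterior means quoted above can be combined into an expected egg output per mated female per laying opportunity. The sketch below is illustrative arithmetic only. It assumes the fertility cost acts as a simple multiplicative reduction of expected egg output, which may differ from how the cost enters the model in the Supplementary Methods, and the function name is our own.

```python
def expected_eggs_per_opportunity(batch_size, laying_probability, fertility_cost=0.0):
    """Expected eggs per mated female at one laying opportunity.

    Assumes the fertility cost scales expected output multiplicatively.
    """
    return batch_size * laying_probability * (1.0 - fertility_cost)


# Posterior means quoted in the text: 117 eggs per batch, 13% laying per opportunity.
wild_type = expected_eggs_per_opportunity(117, 0.13)
# Transgenic female with a transgenic father: fertility cost of 0.35.
paternal = expected_eggs_per_opportunity(117, 0.13, fertility_cost=0.35)

print(f"wild type: {wild_type:.2f} eggs per opportunity")
print(f"transgenic father: {paternal:.2f} eggs per opportunity")
```

Because only point estimates are used, the result ignores the wide credible intervals reported for each parameter.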

Stochastic simulations capture dynamics of spread and suppression
Simulations of the cage dynamics using parameters drawn at random from the posterior distribution gave a close correspondence to the observed trends in the frequency of drive-carrying individuals (Fig. 2g-h). This is expected, since the posterior distribution was inferred from the data, yet it gives confidence that the model captures much of the biology of the cage population. The simulations performed less well in replicating the variability in egg laying in the control cages, suggesting the model does not incorporate all the sources of this variation (Fig. 2c). We ran 1000 simulations of the posterior-informed model to predict the range of potential cage dynamics. All simulations ended with complete population suppression within 560 days, and 95% of the simulations reached this state within 399 or 329 days for the low and high frequency releases, respectively (Supp. Fig. 5).

Drive-resistant alleles were not generated in large cage releases of Ag(QFS)1
To investigate whether drive-resistant alleles had been generated or selected as the gene drive allele increased in frequency in the populations, we performed pooled amplicon sequencing around the gRNA target site on samples of the larval progeny (150-1200/cage) collected at early and late timepoints after release (Fig. 3). Resistant alleles can take two forms: functional resistant alleles that restore a viable gene product, and non-functional resistant alleles that do not. Resistant alleles may be pre-existing in the population or generated by the gene drive itself as a result of error-prone end-joining. In spite of the intense selective pressure exerted by Ag(QFS)1, no mutant alleles were generated that could conceivably code for a functional DSX protein.

We identified three putative end-joining mutations present above the threshold frequency of 0.25% in any of the four release cages. All three alleles introduce a frameshift mutation that would disrupt the female isoform of doublesex, including a 5-bp insertion that was uniquely identified in this study and two deletions (1 bp and 11 bp in length) that were previously identified in small cage testing of Ag(QFS)1 (Kyrou et al. 2018). The failure of any of these alleles to spread above 1% frequency amongst non-drive alleles suggests that they are highly deleterious and undergo no positive selection as the gene drive allele increases in frequency.
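The two criteria used above, a minimum allele frequency of 0.25% and disruption of the reading frame, can be expressed compactly. In the sketch below the indel lengths are those reported in the text, whereas the read counts are hypothetical placeholders and the helper names are our own. An indel shifts the reading frame when its length is not a multiple of three.

```python
def is_frameshift(indel_length_bp):
    """An insertion or deletion shifts the frame unless its length is a multiple of 3."""
    return indel_length_bp % 3 != 0


def above_threshold(allele_reads, total_reads, threshold=0.0025):
    """True if the allele frequency exceeds the reporting threshold (0.25%)."""
    return allele_reads / total_reads > threshold


# Indel lengths from the text; read counts are hypothetical placeholders.
alleles = {
    "5-bp insertion": {"length": 5, "reads": 40},
    "1-bp deletion": {"length": 1, "reads": 55},
    "11-bp deletion": {"length": 11, "reads": 30},
}
total_reads = 10_000  # hypothetical number of non-drive reads in one pooled sample

for name, info in alleles.items():
    print(name,
          "frameshift" if is_frameshift(info["length"]) else "in-frame",
          "reported" if above_threshold(info["reads"], total_reads) else "below threshold")
```

All three reported lengths (5, 1 and 11 bp) are not multiples of three, consistent with the statement that each allele introduces a frameshift.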

Discussion
In this study we provide evidence that the dsx-targeting gene drive strain, Ag(QFS)1, is able to effectively suppress age-structured populations reared in an environment that recapitulates some parameters typical of natural conditions and induces some mosquito behaviours observed in the field. This gene drive has previously been demonstrated to spread effectively through populations of wild-type Anopheles gambiae mosquitoes maintained in small cages (0.0156 m³) with non-overlapping generations (Kyrou et al. 2018). We observed similar dynamics of spread in duplicate large cages (4.7 m³) initiated with low or medium frequency of the drive, leading to complete population