Monkeypox viruses circulate in distantly-related small mammal species in the Democratic Republic of the Congo

We determined near complete and complete monkeypox virus genomes in a shrew (Crocidura littoralis), two squirrels (Funisciurus anerythrus, Funisciurus bayonii), and produced shorter sequences from two rats (Stochomys longicaudatus, Cricetomys sp. 2) originating from the Democratic Republic of the Congo. This suggests that a number of rodents besides squirrels (families Muridae and Nesomyidae) and shrews (order Eulipotyphla) are potential monkeypox virus reservoirs.

retrieved two complete MPXV genomes from F. anerythrus and C. littoralis, and one partial genome (59%) from F. bayonii. Therefore, MPXV appears to naturally circulate in multiple, distantly-related rodent species from at least three different families (Sciuridae, Muridae and Nesomyidae), as well as in shrews (order Eulipotyphla). While some of these detection events may re ect dead-end infections, they are compatible with the hypothesis that MPXV has a broad host range and possibly a complex multi-host ecology 18 . The ability to infect multiple host species is also known for other OPXVs, including the betterstudied cowpox virus (CPXV) which infects at least three rodent species ( eld voles, bank voles and wood mice) in the UK, sometimes leading to spillovers into bovids 14 . Because OPXVs are relatively stable in the environment and able to infect susceptible hosts via multiple routes (respiratory, mucosal and parenteral), cross-species transmission of OPXVs is likely facilitated by indirect transmission and might happen frequently between species 18 .
Although MPXV is able to infect multiple hosts, it does not necessarily imply that all hosts contribute to viral persistence. In fact, it is often suggested that the majority of transmission events and pathogen persistence is associated to a key host species 19 . Consequently, much of the apparent complexity of such multi-host systems could be simpli ed by focussing on one or two key hosts within the community. The key position of these hosts may be the outcome of separate processes such as infection prevalence, population density and transmissibility 19 . For example, while CPXV can be maintained in wood mice populations in the absence of other mammal species (potentially due to their high population abundance), its transmission in bank voles populations largely depends on the presence of wood mice or eld voles in the wildlife community 20 . It is possible that the maintenance of MPXV also depends on one or two primary animal reservoirs and that other species serve as auxiliary hosts. The results reported here adds to the limited evidence pointing at Funisciurus squirrels as potential primary reservoirs for MPXV. Squirrels from this genus show the highest seroprevalence of all rodents and they have allowed for the isolation of the virus and now for its whole genome sequencing from multiple individuals 11,12,15 . Within this genus, F. anerythrus maybe sticks out as a particularly plausible primary host due to its social behaviour (more interaction with conspeci cs) and higher densities than any other species of African tree squirrel, both characteristics facilitating viral intra-species transmission 21 .
Nevertheless, the fact that MPXV was rst isolated from a Funisciurus squirrel might have biased our view on MPXV ecology, as most follow-up studies targeted squirrels and potentially neglected other animal orders, such as shrews or bats. Similarly, the multimammate mouse was for a long-time assigned to be the sole Lassa arenavirus (LASV) reservoir, as the virus was rst isolated from this rodent species which is highly abundant around houses in areas where the disease is endemic. However, recent studies also detected LASV in rodent communities even in the absence of the multimammate mouse 22 . This suggests that the concept of a 'primary reservoir' might often depend on the context. For example, it might be that MPXV transmission is maintained by Funisciurus squirrels in areas where these rodents are abundant, while other species are needed for viral persistence in areas where Funisciurus squirrels are less abundant, or absent. Additional eld studies targeting a larger variety of mammals and following populations over time are necessary to con rm the hypothesis that Funisciurus squirrels are indeed the primary MPXV reservoirs, or to support the hypothesis of 'combined maintenance' in the wildlife community.
Maximum likelihood (ML) and Bayesian phylogenetic analyses revealed that the MPXVs found in the red squirrel (F. anerythrus) and shrew are closely related to each other and fall within the diversity of MPVX strains found in humans in Central Africa (Fig 1), with the time to the most recent common ancestor (tMRCA) with the closest published genomes dated to 1959. Similarly, ML analyses performed on the partial MPXV genome found in the F. bayonii showed that it clusters with other MPXV-CB strains ( Supplementary Fig. 1). The presence of the new animal-derived sequences provided an opportunity to reassess the standing hypothesis that MPXV diversi cation essentially re ects ancient vicariance events caused by climate-induced forest-habitat contractions, such as the opening of the Dahomey gap (4,500BP) or the Last Glacial Maximum (21,000BP) 2 . We found, however, that such diversi cation was too recent to be compatible with these events, with the MPXV tree root being about 600 years-old in the best model. However, given the relatively modest divergence between MPXV-WA and MPXV-CB, we do not expect this would affect the estimated evolutionary time scale of MPXV to such an extent that it would become compatible with key climate factors (4,000-25,0000BP) as proposed in Nakazawa et al 2 . Rather, we consider that it is more likely that the geographical isolation of the two clades has occurred more recently (e.g. during the Medieval Warm Period; 700-1,100BP) 23 or that alternative evolutionary mechanisms were at play. The WA and CB MPXV viruses may for example circulate in distinct hosts communities, with MPXV-CB depending on hosts strictly endemic to Central Africa for its long-term persistence.
In conclusion, the presence of MPXV DNA in diverse forest dwelling rodents and shrews sheds a new light on the ecology of MPXVs. This nding calls for increased surveillance in a wider range of African mammals to reveal the host range and distribution of MPXV, and possibly to discern whether different animal groups host distinct MPXV strains with distinct virulence.

Methods
Small mammals were trapped at different locations in Ituri and Tshopo provinces of the DRC between 2010 and 2016 using Sherman live traps, pit falls and mist nets (2074 rodents, 992 insectivores and 525 bats). Captured animals were humanely euthanised (using iso urane) and necropsied in situ. Tissues samples (kidney, liver and spleen) were stored in 75% ethanol and shipped to the University of Antwerp (Belgium), where samples were stored in the dark at room temperate. DNA was extracted on a mixture of the different organs using the NucleoSpin tissue kit (Machery Nagel, Düren, Germany) according to the manufacturer's instructions. Species identi cation was done based on morphological measurements and con rmed with mitochondrial DNA cytochrome b sequencing using L7 and H6 primers and 16S rRNA for shrews. The sequences obtained were compared with known sequences from GenBank and the African Mammalia database (http://projects.biodiversity.be/africanmammalia).
For MPXV screening, a TaqMan Real-time PCR targeting the OPXV P4A gene of all orthopoxviruses was performed in duplo on a subset of samples (see supplementary data: excel 1) 17 . Positives, were con rmed using a conventional PCR targeting a 270bp fragment of the MPXV haemagglutinin (HA) gene 16 , Sanger sequenced and shipped to Robert-Koch Institute for whole genome sequencing as described in Patrono et al 10 . In short, Illumina-compatible libraries were built from fragmented DNA (250ng DNA in 50 µl EDTA TE buffer) using the NEBNext Ultra II kit and quanti ed using the KAPA HiFi library quanti cation kit. To increase the relative amount of MPXV DNA in the libraries with respect to background DNA, two rounds of 24h hybridization capture at 65°C were performed using 2-fold tiling 80mer RNA baits (MYBaits®) designed to target orthopoxviruses 10 . The enriched pool was diluted to 4nM and sequenced on a NextSeq platform (2x150 bp). Raw NGS reads were processed following the pipeline described in Patrono et al 10 with an additional step to merge paired reads using the ClipAndMerge tool from Eager 24 . Sequences KJ642616 and JX878411 were used for reference-based mapping to either a West African of Congo Basin clade MPXV genome, respectively. Since the two ends of MPXV genomes are inverted terminal repeats (ITRs) that contain many repetitive regions and indels that can lead to di cult or erroneous assembly, the right ITR was removed from both reference sequences (from position 190375 and 191567 in the KJ642616 and JX878411 sequence, respectively). Consensus sequences were called at positions covered by at least 20 reads (95% agreement) for MPXV from F. anerythrus and C. littoralis, whereas for F. bayonii calling was set to at least 2 reads (less on-target reads were obtained for this genome).For phylogenetic analyses, we downloaded all publicly available MPXV complete genomes from the National Center for Biotechnology Information (NCBI) database (n=61) and added the two complete genome sequences herein generated (accession numbers: MT724770 and MT724772). Genomes (n=63) were aligned using MAFFT v7 and conserved blocks were selected using GBlocks in Seaview v452, resulting in an alignment of 178,345 positions (see supplementary le: GBlocks). Upon removal of sites containing ambiguities (n=2618), the alignment was reduced to variable sites (n=2176) by stripping all identical sites (n=173,551) and collapsed to unique sequences (n=50) with the online tool FaBox v1.453. We ran recombination analyses with RDP4 25 and removed minor recombining fragments. We then ran exploratory phylogenetic analyses, to determine whether a temporal signal was likely to exist in this dataset. For this we used PhyML v.3 with smart model selection 26 (GTR) to obtain an unrooted maximum likelihood (ML) tree ( Supplementary Fig 1). A rst analysis of the resulting tree in TempEst 27 identi ed one outlier sequence (AF380338) that showed a suspiciously long branch; upon removal of this sequence a moderate temporal signal was detected (Rsq=0.33).
We went on by running formal molecular clock analyses in a Bayesian framework using BEAST v1.10.4 28 . We assumed a coalescent tree shape prior (constant population size), performed multiple Bayesian Markov chain Monte Carlo (BMCMC) runs and checked multiple run convergence as well as appropriate sampling of the posterior (effective sample size > 200) using Tracer v1.7 29 . We rst wanted to con rm the presence of a temporal signal. To do this, we applied a recently suggested approach whereby for the same dataset for which tip dates are available the ts of two models are compared, one of which is informed by these tip dates (heterochronous model) while the other is not (isochronous model) 30 . We estimated model marginal likelihoods using a generalized stepping stone procedure 31 and compared models using Bayes factors (BF); we considered support for one of the competing models was decisive where 2lnBF>10. We implemented the two most extreme forms of molecular clocks, i.e. a strict molecular clock where a single substitution rate is applied across the entire phylogeny and an uncorrelated relaxed clock where the substitution rate is allowed to vary from branch to branch and is drawn from a lognormal distribution. BF comparison showed the presence of a temporal signal under both models (Table S1). Maximum clade credibility (MCC) trees were generated using TreeAnnotator v.1.10.4, using combined tree les generated with LogCombiner v1.10.4 (both softwares are distributed with BEAST) 28 . ITol v5 was used for both tree visualization (https://itol.embl.de/). Branch robustness was assessed through posterior probabilities.
For the F. bayonii partial genome sequence, we ran ML analyses in PhyML as described above (Supplementary Fig 2). Chronogram derived from an alignment of monkeypox whole genome sequences and map of Africa to show sequence origin based on a molecular clock analysis. Sequences derived from specimens from the Congo Basin are in red, from West Africa in blue, the USA and Europe in black and wildlife derived sequences in bold italic. Human MPXV sequences are represented as dots on the map and animal sequences as triangles. The inner branch colours represent posterior probabilities (grey is <0.95; black is ≥0.95). Note: The designations employed and the presentation of the material on this map do not imply the expression of any opinion whatsoever on the part of Research Square concerning the legal status of any country, territory, city or area or of its authorities, or concerning the delimitation of its frontiers or boundaries. This map has been provided by the authors.

Supplementary Files
This is a list of supplementary les associated with this preprint. Click to download.