HHi-FiVe: A high-fidelity genetic engineering pipeline for construction of herpesvirus-based vaccines

Michael Jarvis (michael.jarvis@plymouth.ac.uk), University of Plymouth, https://orcid.org/0000-0002-0124-4061
Thekla Mauch, The Vaccine Group Ltd
Eleonore Ostermann, Heinrich Pette Institute
Yvonne Wezel, The Vaccine Group Ltd
Jenna Nichols, MRC-University of Glasgow Centre for Virus Research
Ana da Silva Filipe, MRC-University of Glasgow Centre for Virus Research
Matej Vucak, MRC-University of Glasgow Centre for Virus Research
Joseph Hughes, MRC-University of Glasgow Centre for Virus Research
Summer Henderson, The Vaccine Group Ltd
Kimberli Schmidt, UC Davis
Amitinder Kaur, University of California at Davis
Hester Nichols, The Vaccine Group Ltd
Robin Antrobus, University of Cambridge
Peter Barry, UC Davis
Andrew Davison, MRC-University of Glasgow Centre for Virus Research
Wolfram Brune, Heinrich Pette Institute


Introduction
Herpesvirus-based vectors show considerable promise for use as vaccines against infectious diseases. Several such vaccines have been approved for commercial use in agricultural animals, in which they are highly effective. These include a live vaccine based on bovine herpesvirus 1 (Bovilis® IBR Marker Live) to combat infectious bovine rhinotracheitis in cattle 1 and a live recombinant vaccine (VAXXITEK® HVT+IBD) based on turkey herpesvirus to counter infectious bursal disease virus in chickens 2 . Experimental herpesvirus-based vaccines have similarly shown an ability to produce substantial levels of immunity with protection against a range of targeted pathogens, including viruses such as simian immunodeficiency virus 3,4 and Ebola virus (EBOV) 5−7 , bacteria such as Mycobacterium tuberculosis 8,9 , and protozoa (Plasmodium knowlesi) 10 .
Herpesvirus-based vectors have several key features that have encouraged their development as vaccines 11 . These include an inherently low pathogenic potential, an ability to induce durable levels of antibody-based and T cell-mediated immunity, and a potential for administration via mucosal (i.e. oral and nasal) routes 11 . The vaccines are also amenable to reuse, as prior vector-specific immunity does not prevent reinfection 3,5,11,12 . These features, combined with high host species restriction and the ability to spread among individuals, have motivated the development of transmissible herpesvirus-based vaccines for targeting emerging zoonotic pathogens in the inaccessible wildlife animal populations from which they frequently arise 13,14 . Advances in bacterial artificial chromosome (BAC)-based genetic engineering have played a large part in the development of technology for manipulating the vectors 15 . Nonetheless, compared to other vaccine modalities, the large genome sizes of herpesviruses and the potential for off-site mutation during manipulation present significant challenges to the widespread use of herpesvirus-based vectors as vaccines, especially in emerging zoonotic disease scenarios, where it is critical to respond rapidly while ensuring the accuracy of vaccine construction.
We have established a robust approach for iterative, high-fidelity genetic engineering of herpesvirus-based vectors. This approach was named the HHi-FiVe (herpesvirus high-fidelity vector) pipeline and was used to restore a total of 13 mutated or missing open-reading frames (ORFs) in a BAC containing a cytomegalovirus (CMV) genome from rhesus CMV (RhCMV) bearing a transgene expressing an EBOV antigen. This BAC (RhCMV68-1/EBOV BAC) was chosen as a starting point because the vector derived from it by transfection has been shown to protect rhesus macaques that were vaccinated subcutaneously and then challenged with normally lethal EBOV doses 5 . As anticipated, the vector reconstituted from the repaired BAC exhibited a phenotype characterized by restored epithelial cell tropism and sustained expression of the transgene (EBOV-GP). This work generated a repaired vector suitable for future model studies of animal-to-animal transmission and demonstrated the practicality of the HHi-FiVe pipeline for producing herpesvirus-based vectors for potential use as vaccines.

RhCMV BACs and ORF nomenclature
Following isolation from the urine of a rhesus macaque in 1968, a parental virus (RhCMV strain 68-1; RhCMV68-1) was subjected to extensive and largely undocumented passage in cultured fibroblasts of human or rhesus macaque origin 16 . A stock of the resulting virus was used to construct a primary BAC (RhCMV68-1 BAC) 17 , from which all available RhCMV BACs are derived. A succession of studies has shown that RhCMV68-1 and RhCMV68-1 BAC are highly mutated 18-23 , with the detail having been revealed progressively by the genome sequences of RhCMV68-1, RhCMV68-1 BAC, derivatives of RhCMV68-1 BAC, viruses generated from RhCMV68-1-based BACs, and other RhCMV strains (Table 1). As a result, RhCMV68-1 and RhCMV68-1 BAC lack the functions of many genes required for cellular tropism and fitness in vivo. It was necessary to repair these mutations in order to create a candidate for testing as a transmissible vaccine. Achieving this involved making multiple small- and large-scale repairs to RhCMV68-1/EBOV BAC and carrying out Illumina-based complete genome sequencing at each stage to monitor fidelity.

22 , when the RhCMV genome annotation was improved further and orthologous ORFs in different CMVs were denoted by the same name. The principal names were those of HCMV ORFs, supplemented by those of ORFs specific to Old World monkey CMVs, which are prefixed by the letter O. This nomenclature is used below and in the genetic map of the final product of the HHi-FiVe pipeline (Figure 1). In addition, when available, the alternative names are provided in Table 2, and the 2012 names are specified below in parentheses after first use of an inclusive name. Nucleotide descriptions are given in relation to the genome sequence regardless of ORF orientation.
In order to ensure that the RhCMV component of the repaired RhCMV68-1/EBOV BAC was as close in sequence as possible to the original RhCMV68-1 genome as perceived to have existed prior to isolation and serial passage in cell culture 25 , it was necessary to identify mutations in RhCMV68-1 BAC (and hence in RhCMV68-1/EBOV BAC) that have resulted in inactivated ORFs. This involved detailed examination of an alignment of all available RhCMV genome sequences, which at the time did not include several reported since by Taher et al. (2020) 23 ; these recent sequences were incorporated at the end of the study and identified no additional mutations. This comparative exercise revealed a total of 13 putatively inactivated ORFs ( predicted amino acid substitutions. Given the error-prone nature of the RhCMV68-1 sequence, the reality of these differences was not certain, and they were not targeted for repair.

Pipeline for repairing mutated ORFs
Targeted genetic manipulation of herpesvirus genomes is achieved by BAC-based recombineering followed by reconstitution of virus by transfection of BACs into permissive cells 27 . Off-site mutations are a concern when manipulating such large DNA constructs and reconstituting viruses. In the past, this problem has been addressed by creating viruses from revertant BACs in order to demonstrate that the intended manipulations are genetically and phenotypically reversible. However, this approach is regarded as inadequate because it does not control for off-site mutations that arise during reconstitution of virus; in our experience, this is often when such mutations occur. It is also not practical for vaccine development because of the labor-intensiveness and limited scope of phenotypic assays. To cope with this inherent vulnerability, we coupled BAC-based recombineering with responsive Illumina-based whole genome sequencing to create the HHi-FiVe pipeline for generating and validating BACs and reconstituted viruses (Figure 2).
We set out to use this pipeline to repair the mutations in RhCMV68-1/EBOV BAC using BAC-based recombineering 4,28,29 . Recombinant BACs were screened initially by restriction fragment length polymorphism (RFLP) analysis for the expected changes in fragment mobility (Supplementary Figure 1). This was followed by whole genome sequencing of recombinant BACs at each stage.
Overall, the complete process was accomplished in nine steps (Table 2).
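
The RFLP screening step described above can be illustrated with a minimal in silico sketch: a circular BAC sequence is cut at every occurrence of a recognition site, and the predicted fragment lengths of a parental and a recombinant clone are compared. The function name, the toy sequences and the use of the EcoRI site (GAATTC) are assumptions made for illustration only; the enzymes and sequences actually used in the study are not stated in this section.

```python
def circular_digest(seq: str, site: str) -> list[int]:
    """Return sorted fragment lengths after cutting a circular sequence
    at every occurrence of `site` (forward strand only, which is enough
    for palindromic recognition sites)."""
    n = len(seq)
    # Append the start of the sequence so a site spanning the origin is found.
    wrapped = seq + seq[: len(site) - 1]
    cuts = [i for i in range(n) if wrapped.startswith(site, i)]
    if not cuts:
        return [n]  # no site: report the uncut circle by its full length
    # Distance from each cut to the next one, going around the circle.
    return sorted(
        (cuts[(k + 1) % len(cuts)] - cuts[k]) % n or n
        for k in range(len(cuts))
    )


# Hypothetical toy sequences, not derived from any RhCMV BAC.
EXAMPLE_SITE = "GAATTC"  # EcoRI recognition site, chosen only as an example
parental = "GAATTC" + "A" * 10 + "GAATTC" + "A" * 20
recombinant = "GAATTC" + "A" * 10 + "GAATTA" + "A" * 20  # second site lost

parental_fragments = circular_digest(parental, EXAMPLE_SITE)
recombinant_fragments = circular_digest(recombinant, EXAMPLE_SITE)
pattern_changed = parental_fragments != recombinant_fragments
```

A changed pattern only indicates that a clone is worth sequencing; as the text notes, whole genome sequencing is still needed to detect off-site mutations that leave the restriction pattern unaltered.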

Small-scale repairs
Six inactivated ORFs (RL11B, RL11D, RL11E, UL36, UL119 and US12E) required small-scale repair (Steps 1 and 3-7). Most mutations were addressed by restoring the perceived original sequence to reinstate the integrity of the ORF. However, an initial attempt at repairing RL11B at Step 7, which consisted of removing two C residues in a C11 homopolynucleotide tract to restore a C9 tract, resulted consistently in a C10 tract. Therefore, an alternative strategy was used that involved introducing synonymous substitutions within the tract. Repair of RL11G was also small-scale (see below).
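
The idea of interrupting a homopolynucleotide tract with synonymous substitutions can be sketched as follows. This is a hypothetical illustration, not the substitutions actually introduced into RL11B: it assumes the tract lies in frame on the coding strand, uses an invented toy sequence, and swaps only the four single-base codons for a synonymous alternative.

```python
# Synonymous replacements for codons made of a single repeated base.
# Each pair encodes the same amino acid in the standard genetic code.
SYNONYMOUS_SWAP = {
    "CCC": "CCA",  # Pro -> Pro
    "GGG": "GGA",  # Gly -> Gly
    "AAA": "AAG",  # Lys -> Lys
    "TTT": "TTC",  # Phe -> Phe
}


def longest_run(seq: str) -> int:
    """Length of the longest run of identical bases in `seq`."""
    best = run = 0
    previous = ""
    for base in seq:
        run = run + 1 if base == previous else 1
        previous = base
        best = max(best, run)
    return best


def interrupt_tracts(cds: str) -> str:
    """Replace every single-base codon with a synonymous codon so that
    no homopolymer run can extend across a whole codon."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = [cds[i : i + 3] for i in range(0, len(cds), 3)]
    return "".join(SYNONYMOUS_SWAP.get(codon, codon) for codon in codons)


# Invented example: start codon, three proline codons forming a C9 tract, stop codon.
toy_cds = "ATG" + "CCC" * 3 + "TAA"
recoded = interrupt_tracts(toy_cds)
```

In this toy case the C9 tract is reduced to runs of two, while the encoded peptide is unchanged. A real design would also need to account for ORF orientation and for any overlapping features.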

Large-scale repairs
A total of 12 ORFs in UL/b' had undergone extensive deletion or rearrangement during passage of RhCMV68-1, and the six ORFs that were completely or partially missing as a result were not amenable to small-scale repair. Instead, the whole region was replaced by a wild type version based on RhCMV strain 19936 (Table 1), using three synthetic DNA segments that together encompassed this region (Steps 2 and 8). The product of Step 8, which still contained two frameshift mutations in RL11G (see below), was denoted

Repair of RL11G
RL11G contained two separate mutations: a CT insertion in a (CT)2 tract, and further downstream, an A insertion in an A7 tract. Each mutation resulted in a frameshift, the first removing the transmembrane domain of the encoded protein and the second restoring the correct reading frame near the end of the ORF. The first mutation was predicted to have been sufficient to inactivate RL11G on its own.
RL11G is an orthologue of HCMV RL13 20 , which has been shown to mutate during viral growth in culture in all cell types tested 30,31 . Therefore, its repair was reserved for the final step (Step 9). This strategy was vindicated by the recent demonstration that a repaired version of RL11G in a BAC-derived version of RhCMV68-1 mutates in rhesus fibroblast cell culture 23

In contrast to the results obtained with the RhCMV68-1/EBOV/RL11G− clones, reconstitution of RhCMV68-1/EBOV/RL11G+ BAC clone 1 in RPE-1 cells generated a major mutation at passage 1 in these cells, in which a 12,778 bp sequence extending from within RL1 to close downstream from RL11H had been replaced by a 1,786 bp bacterial sequence (Sample P). The proportion of genomes in which RL11G had not been inactivated by this indel was close to 0 %. We conclude that virus reconstituted from the

Cellular tropism of RhCMV/EBOV/RL11G−
The purpose of repairing RhCMV68-1/EBOV BAC is eventually to examine its potential as a model transmissible vaccine platform for providing protective immunity against EBOV following animal-to-animal dissemination of the vaccine. The extent to which RL11G is required for dissemination remains to be determined, but the use of virus reconstituted from RhCMV68-1/EBOV/RL11G+ BAC was precluded because of the instability of RL11G in various cell types tested following reconstitution and passage (Figure 3; data not shown), which is consistent with previous findings for RhCMV 23 Figure 5B).

Discussion
This work is a proof-of-concept study aimed at establishing the HHi-FiVe pipeline for efficient, high-fidelity genetic manipulation of herpesvirus-based vectors as a means for providing a rapid turnaround platform for developing vaccines. Over the past decade, various recombineering methodologies have been invented that enable precise engineering of large DNA constructs on both the small and large scales. However, even following confirmation of the accuracy of the intended manipulations, off-target mutations remain a concern, especially when, as in this case, multiple iterative changes are made within a single BAC lineage. Added to this is the serious potential for mutation during reconstitution of virus from a BAC. To allay these concerns, we screened BACs initially by RFLP analysis and then assessed their full integrity by complete genome sequencing. We also sequenced various viruses reconstituted from the BACs generated in the final two steps (RhCMV68-1/EBOV/RL11G− BAC and RhCMV68-1/EBOV/RL11G+ BAC). Since the case was complex and the pipeline was untested, RFLP screening and genome sequencing were used extensively, often with multiple clones at each step.
In total, 974 BACs were subjected to RFLP analysis and 83 BACs and 16 reconstituted viruses were examined by whole genome sequencing. Although most of the repairs were achieved as intended, some were problematic, either because the intended change was not obtained (e.g., the initial attempt at repairing RL11B at Step 7) or because off-site substitutions were introduced. In addition, sequencing the reconstituted viruses provided critical information on genetic stability, leading to the conclusion that virus reconstituted from RhCMV68-1/EBOV/RL11G− BAC, unlike that reconstituted from RhCMV68-1/EBOV/RL11G+ BAC, was stable in epithelial (hTERT RPE-1) cells.
We chose to establish the HHi-FiVe pipeline by repairing RhCMV68-1/EBOV because of the potential of the reconstituted virus to protect rhesus macaques challenged with lethal EBOV infection by transmitted, rather than parenteral (subcutaneously administered), vaccination. This decision also had the advantage of involving a case in which small- and large-scale repairs were accomplished in several steps, each consisting of multiple manipulations. The success of the pipeline in this situation indicates that it is likely to be broadly applicable. The complexity of this case is inherent in the development of research on RhCMV68-1, which, because the virus is highly mutated, lacks key phenotypic properties, including the ability to infect non-fibroblast cells, and in the fact that all available BACs are derived from this strain. This has led to previous attempts to restore wild type properties by repairing RhCMV68-1 BAC. The resulting BACs include one in which UL36 and the region containing UL128, UL130 and UL131A were repaired 32 , and one containing a full-length genome that has been repaired more extensively 23 but retains the frameshifts in RL11B, RL11D and RL11E. These repaired BACs have formed an important prelude to experimentation on the immunobiology and pathology of RhCMV in its natural host. Our repair of RhCMV68-1/EBOV was, by contrast, vaccine-oriented and corrected all inactivated (i.e., prematurely terminated, frameshifted, deleted, or rearranged) ORFs.
In each of the instances described above, repairs were made by identifying mutated and nonmutated sequences from genome alignments of RhCMV68-1 and other RhCMV strains. Since all the strains had been isolated in cell culture and were themselves potentially mutated, this involved a degree of interpretation, making it difficult to be sure that all mutations had been identified. A more straightforward approach would be to construct a BAC from a strain that has been sequenced directly from the host and passaged minimally, and then to repair the BAC accordingly, as has been done with HCMV 31 . However, this approach carries the inherent risk that any new BAC may represent a virus with phenotypic differences from those of RhCMV68-1, the immunology of which has been characterized extensively during its development as a vaccine platform (further details are below). The RhCMV case also raises the

(GenBank accession no. AF086833.2) with its 3' end extended to encode a 14 amino acid residue V5 epitope tag, followed by downstream noncoding sequences. Two unintended but inconsequential differences were noted in the RhCMV68-1/EBOV clone used for repair 5 : a non-synonymous substitution in EBOV-GP that results in an A to T amino acid substitution at codon 474 (a T residue is encoded at this position in some EBOV strains), and a noncoding substitution in the viral sequence very close to the right end of the transgene.

was removed using a Kan marker and then repaired in three stages by en passant mutagenesis using three synthetic sequences comprising a wild type version of this region based on strain 19936 (KX689268.1).

RFLP analysis
Correct lambda Red recombination and en passant mutagenesis were confirmed using RFLP. Mutated and repaired RhCMV68-1/EBOV-

Table 1, and consisted of a minor population of full-length genomes present in a stock of RhCMV strain 180.92, which consisted mainly of genomes bearing a large deletion in UL/b' 41 . In this case, DNA was isolated from virus generated by transfecting a historical stock of purified virion DNA into Telo-RF cells.

Cellular tropism analysis
Multistep growth curves were conducted in triplicate in hTERT RPE-1 cells and Telo-RF cells in 6-well plates (5x10^4

Each dataset was quality-filtered using Trim Galore, sorted into sense and antisense reads using Samtools v. 1.13 (http://www.htslib.org) and mapped to the individual RhCMV68-1/EBOV/RL11G− ORFs using Bowtie 2 with the 'local' option. These ORFs included that of mutated RL11G and were supplemented by the sequence encoding one long noncoding RNA (RNA4.9). The number of reads mapping to each coding region was determined by visualising the alignment using Tablet and expressed as the number of reads per kbp per million sense or antisense reads mapping to all coding regions. The proportion of each RNA relative to the total was then calculated as a percentage for each dataset and expressed as an average.
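
The normalisation described above (reads per kbp per million mapped reads, converted to a percentage of the total and averaged over datasets) can be written out as a short sketch. The function names, ORF lengths and read counts below are invented for illustration and are not data from the study; only the arithmetic follows the description in the text.

```python
def reads_per_kb_per_million(counts: dict, lengths_bp: dict) -> dict:
    """Normalise read counts by region length (kbp) and by the total
    number of reads mapping to all regions (millions)."""
    total_reads = sum(counts.values())
    if total_reads == 0:
        raise ValueError("no mapped reads")
    return {
        region: n / (lengths_bp[region] / 1_000) / (total_reads / 1_000_000)
        for region, n in counts.items()
    }


def percent_of_total(values: dict) -> dict:
    """Express each normalised value as a percentage of their sum."""
    total = sum(values.values())
    return {region: 100.0 * v / total for region, v in values.items()}


def mean_percent(datasets: list, lengths_bp: dict) -> dict:
    """Average the per-dataset percentages for each region."""
    per_dataset = [
        percent_of_total(reads_per_kb_per_million(counts, lengths_bp))
        for counts in datasets
    ]
    return {
        region: sum(d[region] for d in per_dataset) / len(per_dataset)
        for region in lengths_bp
    }


# Hypothetical lengths (bp) and sense-read counts for two datasets.
example_lengths = {"ORF_A": 1000, "ORF_B": 2000}
example_datasets = [
    {"ORF_A": 500, "ORF_B": 500},
    {"ORF_A": 100, "ORF_B": 200},
]
example_result = mean_percent(example_datasets, example_lengths)
```

Sense and antisense reads would be processed separately in this scheme, each with its own total, as the text specifies.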
Supplementary Table 2. Primers used for repair of ORFs in RhCMV68-1/EBOV

Figure 1
Genetic map of RhCMV68-1/EBOV/RL11G+ BAC. The circular sequence is depicted in linear form, starting at the left end of the viral genome with one copy of the terminal direct repeat (TR), proceeding through the unique region (U), and ending with two more copies of TR. The copies of TR are shown in a thicker format than U. Protein-coding ORFs are indicated by coloured open arrows grouped according to the key at the foot to indicate gene families, other non-core genes that are not conserved among herpesviruses and core genes that are conserved among herpesviruses. Introns connecting protein-coding regions are shown as narrow white bars. UL72 is both a core gene and member of the DURP gene family and is shown as the latter. The BAC vector is shown by the grey-shaded region between US1 and US2. The EBOV-GP ORF encoding a V5-tagged EBOV spike glycoprotein is also grey-shaded and replaced US83B.
The locations of small-scale and large-scale repairs (see Table 2) are marked above the genome by yellow squares and bars, respectively.