Construction of expression cassettes and helper-dependent adenovirus (HDAd)
We previously reported construction of the plasmid pBshuttle-4XETE-gApoAI-oPRE and the corresponding HDAd vector HDAd-4XETE-gApoAI-oPRE.(22) Both constructs contain the rabbit APOAI gene driven by a modified murine Edn1 promoter (termed 4XETE; includes the Edn1 promoter and 3 added tandem repeats of the Edn1 enhancer) along with the optimized posttranscriptional regulatory element (oPRE) of the woodchuck hepatitis virus(21) (Supplementary Fig. S1). We noted that the Edn1 promoter contains a 5´-GAGACC-3´ sequence in the “noncoding” DNA strand. This sequence, reported to be a shear stress response element (SSRE), was originally identified in the platelet-derived growth factor B chain promoter(27) and was shown to function in vascular endothelial cells when placed in the coding strand upstream of synthetic promoters.(28) To enable testing of an SSRE in the coding strand, we first mutated the endogenous 5´-GAGACC-3´ sequence in the noncoding strand (QuikChange Lightning Site-Directed Mutagenesis Kit; Agilent Technologies, Santa Clara, CA) to 5´-ATGTCA-3´, yielding pBshuttle-4XETE-gApoAI-oPRE-mSSRE (Supplementary Fig. S1). As described in Results, this plasmid was not useful, and all remaining plasmids were constructed with the noncoding 5´-GAGACC-3´ sequence intact.
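The strand orientation discussed above can be illustrated with a short sketch. The promoter fragment below is a made-up toy sequence, not the Edn1 promoter, and the function names are hypothetical; only the motif 5´-GAGACC-3´ and its replacement 5´-ATGTCA-3´ are taken from the text. Positions are reported in coding-strand coordinates.

```python
# Toy illustration of motif orientation; the sequence is NOT the real Edn1 promoter.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

SSRE = "GAGACC"    # motif as reported in the text (5'->3')
MUTANT = "ATGTCA"  # replacement used to create the mSSRE plasmid (5'->3')


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence, read 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def find_motif(coding, motif):
    """Return 0-based start positions (coding-strand coordinates) of a motif
    read 5'->3' on the coding strand and on the noncoding strand."""
    rc = reverse_complement(motif)
    last_start = len(coding) - len(motif)
    return {
        "coding": [i for i in range(last_start + 1) if coding.startswith(motif, i)],
        "noncoding": [i for i in range(last_start + 1) if coding.startswith(rc, i)],
    }


# Hypothetical coding-strand fragment. It contains GGTCTC, so the
# noncoding strand reads 5'-GAGACC-3' at that position.
toy_promoter = "TTACGGTCTCAAGT"
print(find_motif(toy_promoter, SSRE))

# Mutating the noncoding-strand motif corresponds to writing the reverse
# complement of the replacement into the coding-strand representation.
mutated = toy_promoter.replace(reverse_complement(SSRE), reverse_complement(MUTANT))
print(find_motif(mutated, SSRE))
```

In this toy case the motif is found only on the noncoding strand before mutation and on neither strand afterwards.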
We used site-directed mutagenesis to insert the 5´-GAGACC-3´ sequence into the coding strand of pBshuttle-4XETE-gApoAI-oPRE, just upstream of the 4XETE sequence. The new plasmid was named pBshuttle-SSRE-4XETE-gApoAI-oPRE (Supplementary Fig. S1). Homologous recombination was used, as described,(22) to transfer the SSRE-4XETE-gApoAI-oPRE sequence from pBshuttle-SSRE-4XETE-gApoAI-oPRE to the HDAd backbone plasmid pC4HSU (Microbix Biosystems, Toronto, Ontario, Canada).(29) The resulting plasmid was used, along with 293Cre4 cells and H14 helper virus,(30) to generate HDAd-SSRE-4XETE-gApoAI-oPRE. Concentrated HDAd vector preparations, characterized as described,(22, 31) were 1.4 to 2.8 × 10¹² viral particles/mL with helper virus contamination <1% and E1A-containing genomes <1 in 10⁶ viral genomes.
To construct additional expression cassettes in which DNA sequences were tested for their ability to increase APOAI expression above levels obtained with the 4XETE-gApoAI-oPRE cassette, we first used site-directed mutagenesis to insert MluI and BglII restriction sites upstream of the 4XETE sequence in pBshuttle-4XETE-gApoAI-oPRE, generating pBshuttle-MB-4XETE-gApoAI-oPRE (Supplementary Fig. S1). We then constructed an oligomer (Integrated DNA Technologies, Coralville, IA) containing a 44-bp murine Mef2c enhancer,(32) flanked by MluI and BglII restriction sites on one end and a BamHI restriction site on the other end. We ligated 1 copy of this oligomer into MluI-BglII-digested pBshuttle-MB-4XETE-gApoAI-oPRE, generating pBshuttle-1XMEF2C-4XETE-gApoAI-oPRE (Supplementary Fig. S1). We used the controlled and ordered oligonucleotide ligation procedure(33) to generate pBshuttle-2XMEF2C-4XETE-gApoAI-oPRE (Supplementary Fig. S1), as well as pBshuttle-3XMEF2C-4XETE-gApoAI-oPRE, pBshuttle-4XMEF2C-4XETE-gApoAI-oPRE, and pBshuttle-5XMEF2C-4XETE-gApoAI-oPRE (not shown).
To construct expression cassettes in which putative DNA enhancer sequences were tested for their ability to increase APOAI expression from a cassette containing only 1 copy of the Edn1 enhancer (i.e., 1XETE-gApoAI-oPRE; Supplementary Fig. S1), we used site-directed mutagenesis to remove the 3 added copies of the Edn1 enhancer (total 160 bp) from pBshuttle-MB-4XETE-gApoAI-oPRE, leaving the MluI and BglII sites intact. The product of this reaction, pBshuttle-MB-1XETE-gApoAI-oPRE (Supplementary Fig. S1), was used, along with the Mef2c oligomers mentioned above and the controlled and ordered oligonucleotide ligation procedure, to generate pBshuttle-1XMEF2C-1XETE-gApoAI-oPRE, pBshuttle-2XMEF2C-1XETE-gApoAI-oPRE, etc. (Supplementary Fig. S1; the 3X, 4X, and 5XMEF2C constructs are not shown). We also used site-directed mutagenesis to delete the MluI and BglII restriction sites from pBshuttle-MB-1XETE-gApoAI-oPRE, generating a control plasmid, pBshuttle-1XETE-gApoAI-oPRE (Supplementary Fig. S1).
We used a similar approach to construct expression cassettes in which single copies of endothelial cell (EC) cis-regulatory modules (CRM; identified as described below) were inserted upstream of the 4XETE and 1XETE sequences in pBshuttle-MB-4XETE-gApoAI-oPRE and pBshuttle-MB-1XETE-gApoAI-oPRE. Briefly, we used PCR with human DNA from 293Cre4 cells(30) as a template to generate amplicons containing each of the 11 CRM (all with introduced 5´ MluI sites and 3´ BglII sites), digested the amplicons with MluI and BglII, and ligated the products into the MluI/BglII-digested plasmids. These expression cassettes were termed CRM(1–11)-1XETE-gApoAI-oPRE and CRM(1–11)-4XETE-gApoAI-oPRE (Supplementary Fig. S1).
We also constructed 4 expression cassettes in which APOAI expression was driven by large (2 035–8 880 bp) genomic segments of each of 4 genes that are highly and relatively specifically expressed in EC: VWF, THBS1, EFEMP1, and CDH5. To identify these 4 genes, we began by using publicly available search engines and PubMed, along with search terms that included “endothelium genes,” “endothelium-specific genes,” and “endothelial cell specific.” These searches identified 30 candidate genes (Supplementary Table S1). We consulted the relevant publications and confirmed that each publication reported expression of these genes in EC. We then consulted the Gene Expression Atlas (http://www.ebi.ac.uk/gxa/home), which includes quantitative transcription data generated from the ENCODE project.(34) For each of the 30 genes, we interrogated ENCODE-derived data in the Gene Expression Atlas using: endothelial cell-derived cell line, the gene name, ENCODE – long polyA RNA, and whole cell. This interrogation yielded a table that reports relative expression levels of the 30 genes in 18 cell lines, including cultured human umbilical vein EC (HUVEC; Supplementary Table S1). EFEMP1 and THBS1 were by far the most highly expressed genes in HUVEC and had relatively EC-specific expression. VWF and CDH5 were the next most highly expressed genes in HUVEC and were expressed only in HUVEC.
To determine which genomic sequences to incorporate in the 4 expression cassettes, we consulted papers that identified cis-acting positive regulators of transcription located near the promoters of the 4 genes.(12, 35-44) We were also careful to include the genomic regions in which the CRM were located (see below). Into these genomic sequences, we ligated a cassette that contains elements of the rabbit APOAI gene (beginning with the start codon, and including all 3 coding exons, both introns, and 51 bp of 3´ untranslated region), as well as the oPRE and SV40 polyadenylation signal. This APOAI cassette was inserted at the translational start sites of each of these 4 genomic sequences (Supplementary Fig. S2). The VWF genomic sequence extends from 843 bp upstream of the transcription start site to the translation start site, including the promoter, first exon, and first intron (total 2 322 bp). We used 2 segments of the EFEMP1 gene, with one segment placed upstream and one segment downstream of the APOAI gene. The upstream segment extends from 243 bp upstream of the EFEMP1 transcription start site to immediately upstream of the EFEMP1 start codon, including the promoter, first exon, first intron, and part of the second exon (total 1 942 bp). The 3´ segment begins immediately downstream of the EFEMP1 start codon and extends 5 676 bp downstream of the EFEMP1 translation start site (including the 2nd, 3rd, and 4th exons, the 2nd and 3rd introns, and part of the 4th intron). We used 2 separate segments of the EFEMP1 gene because CRM were identified both 5´ and 3´ of the EFEMP1 translation start site, and because leaving exon 2 (containing the EFEMP1 translation start site) intact would potentially result in transcription and translation of a chimeric mRNA that included both EFEMP1 and APOAI sequences.
The THBS1 genomic sequence extends from 1 270 bp upstream of the transcription start site to the translation start site, including the promoter, first exon, first intron, and part of the second exon (total 2 035 bp). The CDH5 genomic sequence includes 2 segments. The first segment extends from 6 721 bp upstream of the transcription start site through part of the first intron (total 8 821 bp). The second segment includes a sequence from the 3´ end of the first intron, including the splice acceptor site, and part of the second exon (total 59 bp). A plasmid containing the second segment for CDH5 and part of the 3´ end of the first segment was constructed by Integrated DNA Technologies.
We cloned these genomic regions by PCR amplification (Q5 High-Fidelity DNA Polymerase, New England Biolabs, Ipswich, MA) of human genomic DNA (Promega, Madison, WI) or of the human CDH5 gene-containing plasmid constructed by Integrated DNA Technologies. Primers were designed based on the human GRCh38/hg38 genome assembly. We constructed plasmids containing the VWF, EFEMP1, and CDH5 genomic sequences by ligation of PCR-amplified human genomic DNA to the rabbit APOAI gene and pBshuttle, using Gibson Assembly kits (New England Biolabs; Quantabio, Beverly, MA; GenScript, Piscataway, NJ). pBshuttle-THBS1-gApoAI-oPRE was constructed by GenScript. Plasmid identities were confirmed by restriction digestion (all plasmids) and by sequencing either across all of the Gibson Assembly junctions or across the entire insert (pBshuttle-THBS1-gApoAI-oPRE).
Identification of cis-regulatory modules (CRM)
To identify potential EC-specific CRM, we updated a computational approach that was used to identify liver-, cardiac-, and skeletal muscle-specific CRM.(24-26) As described above, we identified 4 target genes that are expressed highly and relatively specifically in EC (VWF, EFEMP1, CDH5, and THBS1). In the original approach to CRM identification, transcription factors were linked to target genes using transcription factor binding site (TFBS) predictions based on libraries of positional weight matrices; however, this approach is known to result in numerous false positives.(45)
Here we updated this approach to incorporate information contained in the ReMap 2015 ChIP-Seq database of experimentally validated TFBS for 237 transcription factors (http://tagc.univ-mrs.fr/remap/). This database, created by Griffon et al,(46) uses systematic integration of public non-ENCODE and ENCODE experimental data(47) to construct matrices that link TFBS to target genes, relying on the concept of regulatory potential (i.e., the likelihood that a gene is regulated by a transcription factor). Tang et al(48) modeled the influence of each TFBS on gene regulation as a function that decreases monotonically with increasing distance from the transcription start site of the gene. We used these tools to calculate, for each of the 237 transcription factors contained in the ReMap 2015 ChIP-Seq data sets, a regulatory potential on each of the 25 635 genes annotated in the human hg19 genome assembly. For each transcription factor, a p-value was assigned to a given regulatory potential score as the fraction of scores in that factor's distribution that exceed it. From these data, we constructed a general regulatory potential transcription factor – target gene matrix, in which genes are in rows and transcription factors are in columns. Each cell contains 1 if the gene is potentially regulated by the transcription factor at the specified p-value cutoff and 0 if it is not. A different general transcription factor – target gene matrix is constructed for every specified p-value cutoff. Each such matrix can be used directly in the distance difference matrix approach, replacing the original transcription factor – target gene matrices that were based on positional weight matrix predictions.
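The construction of the binary transcription factor – target gene matrix can be sketched as follows. This is a minimal illustration under stated assumptions, not the published implementation: the exponential decay and its 10-kb length scale are hypothetical stand-ins for the monotonically decreasing function, the p-value counts scores at least as large as the query (so that the top score is not assigned p = 0), and all gene names, factor names, and peak coordinates are invented.

```python
import math

DECAY_BP = 10_000  # hypothetical decay length in bp; not taken from the study


def regulatory_potential(tss, peak_positions):
    """Sum of per-peak weights that decrease monotonically with distance from the TSS."""
    return sum(math.exp(-abs(p - tss) / DECAY_BP) for p in peak_positions)


def empirical_p(score, all_scores):
    """Fraction of a factor's scores that are at least as large as `score`."""
    return sum(s >= score for s in all_scores) / len(all_scores)


def binary_matrix(scores, cutoff):
    """Convert scores[tf][gene] into matrix[gene][tf], with 1 where p <= cutoff."""
    matrix = {}
    for tf, by_gene in scores.items():
        distribution = list(by_gene.values())
        for gene, score in by_gene.items():
            p = empirical_p(score, distribution)
            matrix.setdefault(gene, {})[tf] = int(p <= cutoff)
    return matrix


# Invented toy input: TSS coordinates and ChIP-Seq peak positions per gene.
toy_tss = {"g1": 0, "g2": 0, "g3": 0, "g4": 0}
toy_peaks = {"TF1": {"g1": [500, -2000], "g2": [50_000], "g3": [], "g4": [20_000, 30_000]}}

toy_scores = {
    tf: {gene: regulatory_potential(toy_tss[gene], peaks) for gene, peaks in by_gene.items()}
    for tf, by_gene in toy_peaks.items()
}
toy_matrix = binary_matrix(toy_scores, cutoff=0.25)
print(toy_matrix)
```

In the toy data only g1, which has two peaks close to its transcription start site, is flagged as potentially regulated by TF1 at a cutoff of 0.25. In the analysis described above, the score distribution for each factor would span all annotated genes rather than four.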
We used the set of 4 genes that are expressed highly and relatively specifically in EC (VWF, EFEMP1, CDH5, and THBS1) and 1 000 background sets of 8 randomly selected genes as input for the distance difference matrix method.(49) Because the background sets are generated randomly, we avoided sampling or selection bias by repeating the experiment 7 times. To identify transcription factors that are consistently found among the top regulatory factors in replicate experiments, we used a rank-product meta-analysis algorithm.(50) This yielded 8 top-regulatory transcription factors using a q-value cutoff of 0.05 (AR, E2F7, ESR2, ETS1, GATA3, PRDM1, SNAPC1, and TAF7). Clusters of TFBS associated with these factors were then used to identify putative CRM in the vicinity of the loci of the highly expressed, EC-specific genes.
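The idea behind combining replicate experiments by rank product can be sketched as below. This is a simplified illustration, assuming that each replicate yields one score per transcription factor and that higher scores rank first. It computes only the geometric mean of ranks and omits the permutation-based significance estimate that yields q-values in the cited algorithm. Factor names and scores are invented.

```python
import math


def rank_descending(scores):
    """Rank items by score, with 1 = highest; ties are broken by name for determinism."""
    ordered = sorted(scores, key=lambda name: (-scores[name], name))
    return {name: position + 1 for position, name in enumerate(ordered)}


def rank_product(replicates):
    """Geometric mean of each item's rank across replicate experiments."""
    ranks = [rank_descending(r) for r in replicates]
    k = len(ranks)
    return {name: math.prod(r[name] for r in ranks) ** (1 / k) for name in ranks[0]}


# Invented scores for 4 hypothetical factors in 3 replicate runs
# (the study used 7 replicates).
toy_replicates = [
    {"TF_A": 0.9, "TF_B": 0.5, "TF_C": 0.7, "TF_D": 0.1},
    {"TF_A": 0.8, "TF_B": 0.6, "TF_C": 0.3, "TF_D": 0.2},
    {"TF_A": 0.4, "TF_B": 0.9, "TF_C": 0.5, "TF_D": 0.1},
]

toy_rp = rank_product(toy_replicates)
for name in sorted(toy_rp, key=toy_rp.get):
    print(name, round(toy_rp[name], 3))
```

A factor that ranks near the top in every replicate receives a small rank product, whereas a factor that ranks highly in only one replicate is penalized by its poorer ranks elsewhere.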
In view of the limited number of target genes, putative CRM were identified visually with the UCSC Genome Browser and the GRCh37/hg19 assembly. For this purpose, a custom browser guiding track containing the binding regions for the 8 top regulators was generated and uploaded. We used additional tracks to help identify putative CRM including the layered HUVEC H3K4Me1 track (regulatory elements), the layered HUVEC H3K4Me3 track (promoters), the layered HUVEC H3K27Ac track (active regulatory elements), the DNase I Hypersensitivity tracks (open chromatin; regulatory regions and promoters), and the Vertebrate Multiz Alignment & Conservation (100 species) track (sequences maintained by natural selection). Using these tools, we identified putative CRM consisting of clusters of top-regulatory TFBS and coinciding with the strongest epigenetic modifications favoring transcription in HUVEC, indicators of open chromatin regions, and highly conserved sequence elements. This approach identified 11 potential EC-specific CRM (Supplementary Table S2). The source code is available at https://github.ugent.be/pdbleser/Tfdiff_REMAP.
Testing expression cassettes in vitro
Bovine aortic EC (BAEC; Cell Applications Inc., San Diego, CA) were grown in Dulbecco’s Modified Eagle’s Medium (DMEM; Gibco, Grand Island, NY) supplemented with 10% fetal bovine serum (FBS) and 1% penicillin/streptomycin and were used at passages 5-8. Human aortic endothelial cells (HAEC; Lonza, Walkersville, MD) from a single donor (a 37-year-old female) were grown in EBM-2 Basal Medium supplemented with EGM-2 SingleQuots Supplements (Lonza) and used at passages 4-7. Pooled human umbilical vein endothelial cells (HUVEC; Lonza, Walkersville, MD) were grown in EBM-2 Basal Medium supplemented with EGM-2 SingleQuots Supplements (Lonza) and used at passages 1-6.
To test the mSSRE-containing expression cassette, BAEC were grown in 35-mm dishes to 80% confluence and transfected with plasmid DNA. These and all other transfections were done with jetPRIME transfection reagent (Polyplus, New York, NY), using the manufacturer’s protocol. Six hours after transfection, the cells were washed with DMEM and their medium was replaced. Fifty-one hours after transfection, the cells were harvested for RNA and DNA quantitation. To test the expression cassette containing an exogenous SSRE, BAEC were grown to 80% confluence in 6-well plates and transfected with plasmid DNA. Five hours after transfection, the cells were washed with DMEM and their medium was replaced. Twenty-four hours after transfection, the cells were harvested for RNA and DNA quantitation. In all cases, cells were harvested and lysed either by the addition of Buffer RLT Plus (Qiagen) + 2-mercaptoethanol (for RNA and/or DNA quantitation), the addition of Buffer RLT (Qiagen) + 2-mercaptoethanol (for RNA quantitation), or the addition of TRI Reagent (Zymo Research, Irvine, CA).
To test expression cassettes containing Mef2c oligomers, CRM, or genomic constructs, BAEC were grown to 80% confluence and transfected in 6-well plates. Five hours after transfection, the cells were washed twice with DMEM and 600 µL of DMEM was added to the wells. Twenty-four hours after transfection, the conditioned medium was removed, and the cells harvested for RNA and DNA quantitation. The CRM-containing expression cassettes and the genomic constructs were also transfected into HAEC and HUVEC, in 12-well plates. Four hours after transfection, the cells were washed with PBS, and 1 mL of fresh medium was added to the wells. Twenty-four hours after transfection, the cells were harvested for RNA and DNA quantitation.
We purified EC RNA and DNA with the AllPrep DNA/RNA Mini Kit (Qiagen, Germantown, MD). Alternatively, RNA and DNA were purified separately with the Direct-zol RNA Miniprep Kit (Zymo Research) and the DNeasy Blood and Tissue Kit (Qiagen). RNA was digested with DNase I (Thermo Fisher Scientific, Waltham, MA) to remove contaminating genomic DNA; RNA samples from HAEC and HUVEC were digested with both DNase I (New England Biolabs) and PvuII (New England Biolabs) to remove contaminating genomic DNA. Use of PvuII was required to ensure digestion of human APOAI genomic DNA. RNA was reverse-transcribed and amplified with the Luna Universal Probe One-Step RT-qPCR Kit (New England Biolabs). Alternatively, RNA was reverse-transcribed with the qScript Flex cDNA Kit (Quantabio) and the APOAI cDNA was measured by quantitative real-time PCR amplification using PerfeCTa FastMix II (Quantabio). APOAI mRNA levels were normalized to GAPDH mRNA levels, measured in cells from the same well. Normalized APOAI expression levels were further normalized for transfection efficiency, assessed by measuring plasmid DNA in extracts of cells in the same well (for BAEC) or of cells in a separate well transfected in parallel (for HAEC). Primers and probes for qPCR (Supplementary Table S3) were ordered from Integrated DNA Technologies. When Ct values for GAPDH or plasmid DNA suggested a failed PCR or failed transfection in a well, results from that well were omitted. These criteria, which resulted in exclusion of results from a small number of wells, were applied uniformly to all plasmids.
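The two-step normalization and the well-exclusion rule can be sketched as follows. The text does not state the quantification formula or a numeric exclusion criterion, so this sketch assumes relative quantification as 2 raised to the negative Ct difference (i.e., roughly 100% amplification efficiency), a hypothetical calibrator well for the plasmid DNA signal, and a hypothetical Ct cutoff of 35. All Ct values are invented.

```python
from typing import Optional

MAX_VALID_CT = 35.0  # hypothetical cutoff; the text gives no numeric criterion


def normalized_apoai(ct_apoai: float, ct_gapdh: float,
                     ct_plasmid: float, ct_plasmid_calibrator: float) -> Optional[float]:
    """APOAI mRNA normalized to GAPDH mRNA, then to plasmid DNA.

    Returns None when the GAPDH or plasmid Ct suggests a failed PCR or
    failed transfection, so that the well can be omitted.
    """
    if ct_gapdh > MAX_VALID_CT or ct_plasmid > MAX_VALID_CT:
        return None
    # Step 1: APOAI relative to GAPDH measured in cells from the same well.
    mrna_ratio = 2.0 ** -(ct_apoai - ct_gapdh)
    # Step 2: plasmid DNA relative to a calibrator well (transfection efficiency).
    plasmid_ratio = 2.0 ** -(ct_plasmid - ct_plasmid_calibrator)
    return mrna_ratio / plasmid_ratio


# Invented wells: same mRNA signal, but well_b took up twice as much plasmid.
well_a = normalized_apoai(ct_apoai=24.0, ct_gapdh=20.0, ct_plasmid=18.0, ct_plasmid_calibrator=18.0)
well_b = normalized_apoai(ct_apoai=24.0, ct_gapdh=20.0, ct_plasmid=17.0, ct_plasmid_calibrator=18.0)
well_failed = normalized_apoai(ct_apoai=24.0, ct_gapdh=38.0, ct_plasmid=18.0, ct_plasmid_calibrator=18.0)
print(well_a, well_b, well_failed)
```

Dividing by the plasmid signal means that a well with more plasmid but the same APOAI mRNA level is scored as having lower expression per plasmid copy, which is the purpose of the transfection-efficiency correction.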
In vivo gene transfer to rabbit carotid arteries
Rabbits were fed a standard laboratory diet (16% rabbit PLT, Albers Animal Feed, Bellevue, WA). After 1 week of acclimation to the animal facility, 16 male New Zealand White rabbits (3.0–3.5 kg, Western Oregon Rabbit Co., Philomath, OR) underwent surgical isolation of both common carotid arteries, followed by temporary occlusion, and local luminal infusion of either HDAd-SSRE-4XETE-gApoAI-oPRE or HDAd-4XETE-gApoAI-oPRE (each at 2 × 10¹¹ viral particles/mL diluted in DMEM) as previously described.(51) Carotid arteries were harvested either 3 or 28 days after gene transfer. Three spaced segments (proximal, mid, and distal) of each transduced artery were snap-frozen in liquid nitrogen for RNA analysis. Total RNA was extracted from these segments (RNeasy Mini Kit, Qiagen) and was quantified by Nanodrop spectrophotometer (Thermo Scientific, Wilmington, DE). APOAI mRNA was measured by quantitative reverse transcriptase-mediated PCR.(52)
Statistics
For comparing two groups, we used Student’s t-test if conditions of normality and equal variance were met. If not, we used a Mann-Whitney rank-sum test. For experiments that compared multiple groups (i.e., number of Mef2c enhancer copies, different CRM, or the 4 genomic knock-in constructs), we used the Kruskal-Wallis one-way ANOVA on ranks with Dunn’s method for post-hoc corrections for multiple comparisons.
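The test-selection logic can be sketched with SciPy as below. The text does not say how normality and equal variance were assessed or which software was used, so the Shapiro-Wilk and Levene tests and the 0.05 threshold are assumptions made for illustration. Dunn's post-hoc test is not part of SciPy and is omitted. All data are invented.

```python
from scipy import stats

ALPHA = 0.05  # hypothetical threshold for the assumption checks


def compare_two_groups(a, b):
    """Return (test name, p-value) for a two-group comparison.

    Uses Student's t-test when both groups pass a Shapiro-Wilk normality
    check and a Levene equal-variance check (both checks are assumptions
    of this sketch); otherwise uses the Mann-Whitney rank-sum test.
    """
    _, p_norm_a = stats.shapiro(a)
    _, p_norm_b = stats.shapiro(b)
    _, p_var = stats.levene(a, b)
    if min(p_norm_a, p_norm_b) > ALPHA and p_var > ALPHA:
        _, p = stats.ttest_ind(a, b, equal_var=True)
        return "t-test", p
    _, p = stats.mannwhitneyu(a, b, alternative="two-sided")
    return "Mann-Whitney", p


def compare_multiple_groups(*groups):
    """Kruskal-Wallis one-way ANOVA on ranks; returns the overall p-value."""
    _, p = stats.kruskal(*groups)
    return p


# Invented measurements.
control = [1.00, 1.10, 0.90, 1.05, 0.95, 1.02]
treated = [2.00, 2.13, 1.88, 2.05, 1.96, 2.01]
test_name, p_two = compare_two_groups(control, treated)
print(test_name, p_two)

p_multi = compare_multiple_groups([1, 2, 3, 4, 5], [6, 7, 8, 9, 10], [11, 12, 13, 14, 15])
print(p_multi)
```

A significant Kruskal-Wallis result indicates only that at least one group differs; pairwise comparisons would still require a post-hoc procedure such as Dunn's method.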