Collection of GWAS SNPs of cardiovascular diseases
We have collected the significant SNPs from the NHGRI-EBI GWAS catalog32 for the following search terms including all the available child traits: Coronary artery disease (EFO_0000378), Aortic stenosis (EFO_0000266), Cardiac arrhythmia (EFO_0004269), Cardiomyopathy (EFO_0000407), Myocardial infarction (EFO0000612) and Myocardial ischemia (EFO0005672). All GWAS were downloaded on 10/26/2020 (see also Table S5).
For each set of GWAS SNPs, we have obtained correlated SNPs that are in linkage disequilibrium (LD) with any of the original SNPs. We used the LDProxy Tool33 and extracted the proxy SNPs via their API functionality. Proxy SNPs with an R2 > = 0.75 and within a window of +/- 500 000 bp centered around the original SNP were added to the GWAS SNPs. The combined set of proxy and lead SNPs was used as input set to SNEEP.
Detection of regulatory SNPs
To detect regulatory SNPs, we applied the SNEEP pipeline (https://github.com/SchulzLab/SNEEP) separately for each of the 6 cardiac GWAS. SNEEP (SNP exploration and analysis using epigenomics data) is a computational pipeline which identifies rSNPs along with the affected TFs and further links the rSNPs to putative target genes.
To compute whether or not a SNP alters the binding behavior of a TF, a differential binding score is determined, which is the log-odds ratio between the binding affinity of the wildtype sequence (containing the wild type allele) and the mutated sequence (containing the alternative allele).20 For the log-odds ratio a differential binding p-value is computed. The approximation of the p-value depends on the characteristics like length, CG content etc. of the used TF PWM-motifs. Therefore, one needs to estimate a scale value per motif using the script estimateScalePerMotif.sh from the SNEEP pipeline. To estimate the scale parameter values, we used 200 000 sampled SNPs from the dbSNP database34, and removed flanking bases of the PWMs with an entropy higher than 1.9, resulting in the following command:
bash estimateScalePerMotif.sh 200000 < pathToMotifs > < outputDir > < motifNames > 1.9
To run SNEEP, 632 human TF PWM-motifs in transfac format were gathered from the JASPAR database (version 2020)35 and over 2.4 million regulatory elements linked to their putative target genes were downloaded from the EpiRegio database31 (https://doi.org/10.5281/zenodo.3758494,file: REMAnnotationModelScore_1.csv.gz).
Next, we applied the main SNEEP pipeline per GWAS with a differential binding p-value cutoff of 0.001:
./differentialBindingAffinity_multipleSNPs -o < SneepOutputDirectory> -n 10 -p 0.5 -c 0.001 -r < EpiRegio_REMs> -g < mappingEnsemblIdToGeneNameForREMs> -j 100 -l 123 -i < pathToSneepDirectory> -s < estimatedScalesFromPreviousStep > < pathToJasparMotifs > < InputSnpsPerCardiacGWAS > < pathTohg38.fa>
The SNEEP result is provided in Table S6.
Identification of disease associated genes using rSNPs
As part of the analysis of SNEEP, all rSNPs that overlap regulatory elements from Epiregio constitute a candidate disease gene. From the SNEEP output file protein-coding and non-coding genes were extracted that have overlapping rSNPs in their regulatory elements. Non-coding genes were understood to be all genes not labeled with the biotype ‘protein coding’ or ‘TEC’ (primary assembly annotation, version 39, downloaded from GENCODE).36
To label which protein-coding genes are already associated with the studied diseases (Fig. 2B, black dots), we used the disease2gene functionality of the R package of DisGeNET:
disease2gene(disease = < studiedDisease>, database = "ALL", score = c( 0,1))
The studiedDisease parameter needs to be provided as UMLS CUI identifiers. We used C1956346 (Coronary artery disease, CAD), C0003811 (Cardiac arrhythmia), C1449563 (Cardiomyopathy, Familial Idiopathic), C0151744 (Myocardial ischemia), C0027051 (Myocardial infarction) and C0340375 (Subaortic stenosis) as studiedDisease (see also Table S2).
Identification of co-expressed genes for non-coding genes and disease enrichment
We conducted a co-expression analysis using the gene expression profiles of 9,662 GTEx RNA-seq samples.37 We compared the expression of protein-coding genes with the non-coding genes using the Spearman correlation coefficient as the similarity metric (Table S7). This allowed us to obtain a ranked list of protein-coding genes most similar to the expression of a selected non-coding gene in the GTEx data.
For each non-coding gene the top 10 co-expressed protein coding genes were extracted, varying this number did not change the further results. The joint set of all protein-coding genes that are co-expressed to any of the non-coding genes found for the same disease via the GWAS analysis, were considered to perform a disease enrichment analysis. The analysis was done separately for the resulting co-expressed protein-coding gene sets derived from Cardiac arrhythmia, CAD and Cardiomyopathy using the function disease_enrichment from the R package of DisGeNET.38
disease_enrichment( entities = < coExpressedProteinCodingGenesPerGWAS>, vocabulary = "HGNC", database = "ALL" )
From the resulting list of enriched diseases (Table S3), as part of the enrichment computation, cardiac phenotypes were selected and visualized as a dot plot using ggplot2 (Fig. 3C). For the GWAS Aortic stenosis, Myocardial infarction and Myocardial ischemia we did not apply the disease enrichment analysis because of the low number of associated non-coding genes (less than 30).
Preparation and maintenance of human iPSC-CMs
Human induced pluripotent stem cells (hiPSCs) were purchased from Cellular Dynamics International (CMC-100-010-001) and cultured according to the manufacture’s protocol. The human iPSC-derived cardiomyocytes (hiPSC-CMs) were reprogrammed using the STEMdiff™ Cardiomyocyte Differentiation Kit (STEMCELL Technologies) as recommended by the manufacturer. Briefly, human iPSCs were plated at cell density of 3.5×105 cells/well on Matrigel coated 12-well-plates using mTeSR™ medium supplemented with 5µM ROCK inhibitor (Y-27632, STEMCELL Technologies) for 24 h. After 1 day (-1), the medium was replaced with fresh TeSR™ medium. To induce cardiac differentiation, the TeSR™ medium was replaced with Medium A (STEMdiff™ Cardiomyocyte Differentiation Basal Medium containing Supplement A) at day 0, Medium B (STEMdiff™ Cardiomyocyte Differentiation Basal Medium containing Supplement B) at day 2, Medium C (STEMdiff™ Cardiomyocyte Differentiation Basal Medium containing Supplement C) at day 4 and day 6. On day 8, medium was switched to STEMdiff™ Cardiomyocyte Maintenance Medium with full medium changes every 2 days, to promote further differentiation into mature cardiomyocyte cells. All experiments were performed in the hiPSC-CMs at day 40. Hypoxic condition was achieved by using the Hypoxia chamber and the hiPSC-CMs were cultured at either 3% or 1% O2 for 2 days.
Monoculture cardiac organoid formation Technique
Monoculture cardiac organoids (MCOs) were created by hiPSC-CMs. Aggrewell™ 800 microwell culture plates were used to create the MCOs in STEMdiff™ Cardiomyocyte Support Medium (STEMCELL Technologies). At day 18, hiPSC-CMs were distributed into Aggrewell™ 800 microwell culture plates at a density of 900,000 hiPSC-CMs/well. After 2 days of culture, medium was switched to STEMdiff™ Cardiomyocyte Maintenance Medium for long term culture. Hypoxic condition was achieved by using the Hypoxia chamber and the MCOs were cultured at either 3% or 1% O2 for 3 days.
Self-organized Cardiac Organoids formation Technique
Human iPSCs were plated at cell density of 1.5×105 cells/well on Aggrewell™ 800 microwell culture plates to form embryoid bodies (EBs). At day 18 and day 20, 50 nM VEGF and 25 nM FGF were added into the Maintenance Medium. On day 22, medium was switched to Maintenance Medium with EGM-2 with full medium changes every 2 days. All experiments were performed in the self-organized cardiac organoids (SCOs) at day 40. Hypoxic condition was achieved by using the Hypoxia chamber and the SCOs were cultured at either 3% or 1% O2 for 3 days.
RNA isolation, reverse transcription and qRT-PCR
Samples were harvested in QIAzol Lysis Reagent (QIAGEN), and total mRNAs were isolated with RNeasy Kit (QIAGEN) according to the manufacturer’s protocol. 200ng total RNAs were reverse transcript into cDNA using QuantiTect Reverse Transcription Kit (QIAGEN) according to the manufacturer’s protocol. The Applied Biosystems StepOnePlus Real-Time PCR system (Applied Biosystems, CA, USA) with Fast SYBR Green Master Mix (Thermo Fisher Scientific) were used for analysis. Gene expression levels were normalized against the housekeeping gene HPRT1. The qRT-PCR primers are listed in Table 1.
Antisense LNA GapmeRs
Antisense LNA GapmeRs were purchased from Qiagen. Four different antisense LNA GapmeRs were designed for each target (Table 2). The GapmeR negative control was used as the control.
Contractility measurement
Every single MCO was transferred into 96-well-plate and treated with either GapmeR control, GapmeR RP11-98F14.11, GapmeR RPL23AP92 or GapmeR IGBP1P1. Hypoxic condition was achieved by using the Hypoxia chamber and cardiac organoids were cultured at 3% O2 for 3 days, then the contractility will be analyzed by IonOptix system. Units/pixels were determined by calibrating the system with a micrometer.
Calcium Transient measurements
The MCOs were cultured in the 96-welll-plate. 2µM Cal-520 AM (AAT bioquest) with 0.04% Pluronic® F-127 (AAT bioquest) working solution was added into the plate, and then the plate was incubated in the incubator at 37°C for 90 min. After washing the MCOs with PBS for 3 times, and with medium for 2 times, the MCOs will be transferred to the 384 well U-bottom plate and incubated at 37°C for 1 h. The fluorescence will be measured by the fluorescence plate reader.
Immunofluorescence staining
Immunofluorescence staining was performed as described previously. After fixation with 4% paraformaldehyde (PFA)/PBS, the hiPSC-CMs or MCOs were permeabilized and incubated overnight at 4ºC with primary antibodies against sarcomeric α-actinin (Sigma Aldrich), diluted in 2% (v/v) HS/PBS. After 3 washes with PBS for 5 min, cells were incubated with 4',6-diamidino-2-phenylindole (DAPI Thermo Fisher Scientific) and AlexaFluor 555 anti-mouse (Thermo Fisher Scientific) secondary antibody for 1 h at room temperature. Dishes were mounted onto glass slides (Fisher Scientific) with a drop of ProLong™ Gold Antifade (Thermo Fisher Scientific). Fluorescent images were acquired with the SP5 confocal microscopy (Leica) using a 40x magnification. Cell size was quantified blindly using the software Image J.
Statistical Analysis
Data are represented as mean and error bars indicate the standard error of the mean (SEM). Two-tailed unpaired Student’s t-tests (Excel) or one-way ANOVA analyses followed by either a Dunnett’s multiple comparison post-test (multiple comparisons to a single control) or Bonferroni correction (multiple comparisons between different groups) were used as indicated in the respective figure legends. ns = not significant; *P < 0.05; **P < 0.01; %P < 0.05.