iROP GWAS Multi-ethnic Cohort Characteristics:
The iROP cohort consists of 2187 preterm infants born at gestational ages less than 32 weeks and birth weights smaller than 1250 grams, placing them at risk to develop ROP. The iROP database was mined for infants with both biologic and phenotypic data for subsequent genotype-phenotype analysis. To identify genetic susceptibility regions for ROP, we conducted a genome-wide association study (GWAS) using the Illumina Infinium Global Screening Array with information from these infants. As published previously, phenotype was rigorously determined by a team of ophthalmologists and image graders with clinical expertise in ROP16,25,26. This subset of iROP patients represents multiple ethnic and racial groups, including 44.5% Hispanic, 35.4% White, 12.1% African American and 8% unidentified race individuals as noted in Table 1. Further, all ROP phenotypes are represented, including 197 with ROP Stages 3, 4, or 5. (Table 1). The distribution of ROP disease, ethnicity and race was similar between the infants with and without genetic data (Supplementary Table 2).
A number of SNPs demonstrate significant associations with severe ROP.
GWAS analysis was performed on DNA extracted from all 920 cohort infants to identify genetic susceptibility regions for ROP severity, defined as ROP stage 3 or greater. As depicted in Figure 1, which visualizes all p values of SNP-associations for the 22 autosomal chromosomes plotted relative to chromosome location and significance, a number of SNP associations were identified. The Y axis represents the -log transformed p value (red line representing GWAS significance level) and X axis is in genomic ordered by chromosomes and positions. SNPs reaching genome-wide significance, defined as p≤ 5x10-8, are visualized above the red line.
GWAS identifies ROP-associated SNPs with genome-wide significance or suggestive significance correlating with both novel and established ROP risk genes
As noted in Table 2, we identified 10 loci which demonstrate genome-wide significance (p≤5×10-8) or suggestive significance (p≤5×10-6) of association with ROP severity, defined by stage 3 disease or greater, both with and without correction for birth weight (BW) and gestational age (GA). The position, effect allele and frequency, odds ratio, p value and nearest gene is delineated in Table 2. SNPs demonstrating the lowest p value in each locus/region were selected and the bioinformatically determined nearest gene was used. Under the additive genetic model, the ROP risk corresponds to the number of copies of the effect allele when the odds ratio is greater than 1. As noted, SNPs rs2058019 (GLI3 gene) and rs11563856 (CLDN12 gene) both reach genome wide significance, although only rs2058019 remains significant at this level when controlling for the potentially confounding effects of BW and GA. In two of the loci meeting suggestive significance, the nearest gene has a prior association with ROP disease in either humans or animal models as noted by bolded text. Finally, we performed an in silico analysis to determine if each listed SNP had a described expression quantitative trait loci (eQTL) association in the EyeGEx27 database. We identified a number of established regulatory associations for our top SNPs as detailed in Table 2 within the retina, further supporting relevance to the ocular microenvironment.
rs2058019 contributes to genetic susceptibility for ROP in diverse populations
Given established differences in clinical ROP risk based on race and ethnicity22,23, we analyzed the association of the lead GLI3 SNP, rs2058019, stratified by these variables. As seen in Table 3, the association between rs2058019 and ROP disease is most significant when considering all patients, although odds ratios and allele frequencies between cases and controls demonstrate that the degree and direction of association remain consistent for Caucasian and Hispanic subjects. Interestingly, the African American subjects do not appear to be contributing to this association. This latter finding does not reach significance in this sample and thus requires validation within a larger population. As noted in Supplemental Table 3, we did not find suggestive evidence for significant differences is MAF or direction of effect within racial or ethnic populations for other top SNPs.
Multiple significant SNPs are identified regionally related to GLI3
To determine the regional significance for the GLI3 pan-locus, we plotted all SNP-associations within 100 Kb of the GLI3 gene for all patients as well as for racial and ethnic subsets. As pictured in Figure 2, a regional plot with the Y axis depicting the -log transformed p value and the X axis showing chromosomal position, we illustrate the GLI3 gene regional association for all ethnic and racial groups with ROP severity. The lead SNP rs2058019 was identified as a top-associated SNP within the pan-locus for the Caucasian and Hispanic groups and both the best and sentinel SNP when all patients were considered together. Significant regional SNPs for each group are delineated fully in Supplemental Table 4. Also pictured, we determined the linkage disequilibrium (LD) for each SNP, defined by the r2 using the included color scheme. LD was determined between SNPs and genes involved/close to SNPs identified.
GWAS identifies ROP-associated SNPs with genome-wide significance or suggestive significance correlating with both novel and established ROP risk genes
As noted in Table 2, we identified 10 loci which demonstrate genome-wide significance (p≤5×10-8) or suggestive significance (p≤5×10-6) of association with ROP severity, defined by stage 3 disease or greater, both with and without correction for birth weight (BW) and gestational age (GA). The position, effect allele and frequency, odds ratio, p value and nearest gene is delineated in Table 2. SNPs demonstrating the lowest p value in each locus/region were selected and the bioinformatically determined nearest gene was used. Under the additive genetic model, the ROP risk corresponds to the number of copies of the effect allele when the odds ratio is greater than 1. As noted, SNPs rs2058019 (GLI3 gene) and rs11563856 (CLDN12 gene) both reach genome wide significance, although only rs2058019 remains significant at this level when controlling for the potentially confounding effects of BW and GA. In two of the loci meeting suggestive significance, the nearest gene has a prior association with ROP disease in either humans or animal models as noted by bolded text. Finally, we performed an in silico analysis to determine if each listed SNP had a described expression quantitative trait loci (eQTL) association in the EyeGEx27 database. We identified a number of established regulatory associations for our top SNPs as detailed in Table 2 within the retina, further supporting relevance to the ocular microenvironment.
rs2058019 contributes to genetic susceptibility for ROP in diverse populations
Given established differences in clinical ROP risk based on race and ethnicity22,23, we analyzed the association of the lead GLI3 SNP, rs2058019, stratified by these variables. As seen in Table 3, the association between rs2058019 and ROP disease is most significant when considering all patients, although odds ratios and allele frequencies between cases and controls demonstrate that the degree and direction of association remain consistent for Caucasian and Hispanic subjects. Interestingly, the African American subjects do not appear to be contributing to this association. This latter finding does not reach significance in this sample and thus requires validation within a larger population. As noted in Supplemental Table 3, we did not find suggestive evidence for significant differences is MAF or direction of effect within racial or ethnic populations for other top SNPs.
Multiple significant SNPs are identified regionally related to GLI3
To determine the regional significance for the GLI3 pan-locus, we plotted all SNP-associations within 100 Kb of the GLI3 gene for all patients as well as for racial and ethnic subsets. As pictured in Figure 2, a regional plot with the Y axis depicting the -log transformed p value and the X axis showing chromosomal position, we illustrate the GLI3 gene regional association for all ethnic and racial groups with ROP severity. The lead SNP rs2058019 was identified as a top-associated SNP within the pan-locus for the Caucasian and Hispanic groups and both the best and sentinel SNP when all patients were considered together. Significant regional SNPs for each group are delineated fully in Supplemental Table 4. Also pictured, we determined the linkage disequilibrium (LD) for each SNP, defined by the r2 using the included color scheme. LD was determined between SNPs and genes involved/close to SNPs identified.
Top-identified SNP associations vary by race and ethnicity
To further determine race and ethnicity-based differences in SNP associations with ROP severity, we performed our GWAS analysis separately for each racial and ethnic group. As pictured in Supplemental Figure 2, which visually represents SNP associations for each population, racial and ethnic differences are present. As seen in Table 4, we find several SNP associations which are racially or ethnically specific and reach genome-wide or suggestive level significance. Notably, rs9978278 (CLDN14 gene) and rs74048122 reach genome-wide significance, which remains significant after correction for BW and GA. Although the number of observations is decreased when analyzing by race and ethnicity, these findings suggest possible differences in genetic risk for ROP by population ancestry.
SNPs with suggestive or greater association with ROP severity in the full iROP cohort demonstrate cross-significance within the Hispanic Diabetic Retinopathy GOLDR cohort
Our study design, which identifies SNPs associated with ROP disease stage 3 or greater, enriches our findings for genetic variation associated with pre-retinal neovascular pathobiology. Diabetic retinopathy (DR) is another form of retinal disease characterized by pre-retinal neovascularization. Therefore, to validate the significance of GLI3 genetic variation to pre-retinal neovascular disease, we performed an extension analysis of top-associated iROP SNPs, those with p ≤5×10-6 or greater significance levels, and SNPs regionally related for DR association within the Genetics of Latino Diabetic Retinopathy (GOLDR) GWAS dataset. Importantly, the GOLDR population consists of Hispanic patients with diabetic retinopathy (DR)28,29. As noted in Table 5, we found cross-significance for a number of SNPs with DR in GOLDR and with ROP severity in our iROP GWAS cohort. This included SNPs within GLI3, DCLK1, SP4, PTPRD,RPL30 and RIDA.
ROP Genetic Risk Score (GRS) demonstrates increased significance over single SNP associations
Joint estimation of SNP effects has been suggested as a superior approach to improve SNP-based disease prediction and risk stratification, including in eye disease by our group30–35. We calculated GRSs (Score A, Score B) incorporating SNPs based on the SNP selection by LASSO method to determine if combined SNP profiles were more highly associated with ROP severity36. ROP severity was represented as an ordinal categorial variable denoting stage including none, stage 1, stage 2, etc for this analysis. Table 6 compares the association results between different GRSs. Score A consists of the top GLI3 SNPs as well as 45 other top ROP-associated SNPs and Score B consists of 255 top-ROP associated SNPs. As depicted in Table 6, each GRS was significantly associated with ROP severity. To determine the proportion of variation within the phenotype which can be explained using each risk score, and therefore the breadth of ROP phenotype that is associated with each GRS, we analyzed adjusted R-squared. As noted, score A accounted for a larger adjusted R-squared than the top SNP alone (0.63 versus 0.34), suggesting that the additional SNPs in this risk score together account for a more variance in ROP phenotype/severity than the GLI3 SNP alone. The drop in adjusted R-squared between Score A (0.63) and Score B (0.45) is likely due to the addition of noise in the model resulting from adding SNPs in this score and over-fitting. Taken together, the GRS demonstrate stronger associations with all ROP phenotypes.
Significant retinal vascular SNPs demonstrate relevance to ROP risk
To determine the relevance of our dataset to vascular pathology more broadly, we analyzed the association of significant retinal vascular SNPs, identified in the central retinal venule equivalent (CRVE) and the central retinal arteriole equivalent (CRAE) dataset37, within our iROP dataset. Using the summary statistics from the exome chip meta-analysis for CRVE and CRAE, we tested all CRVE/CRAE loci with p ≤ 0.01 for ROP association within our iROP dataset. A total of 583 SNPs from the CRVE/CRAE dataset were available for analysis in our iROP dataset. After Bonferroni correction for multiple testing, we identified significant associations for SNPS rs13079478 (FYCO1 gene), rs33910087 (FYCO1 gene) and rs12357206 (ANK3 gene) with ROP severity as noted in Table 7.
Extension of associations for severe ROP SNPs identified in a candidate gene study replicate significance of the RELN gene
To independently replicate previously identified ROP-associated SNPs, we evaluated candidate SNPs found to be significantly associated with severe ROP by Harnett et al.17 within our iROP GWAS dataset. As noted in Table 8, rs10251365 in the gene RELN is significantly associated with ROP severity in both cohorts (p=0.009). The p values of SNP-associations with ROP severity based on stage using our data are included for comparisons and confirmation.
GLI3 and other top associated genes are expressed in human retinal and retinal pigment epithelial tissues (RPE).
To determine the relevance of GLI3 genetic variation to retinal pathophysiology, we sought to determine the expression patterns of GLI3 within human retinal and RPE tissues. Human donor macular or peripheral retinal and RPE tissues from control eyes (n=10), average age 72 years, were analyzed using RNA-sequencing. Expression was analyzed using DESeq238 as we have previously published39. Analyses were done for genes relative to our top-associated ROP SNPs as listed in Table 2. All top genes except PRPF4B were expressed in human donor macular tissues; the tissue type with greatest expression and fold difference between tissue types (neurosensory retina or RPE) is listed in Table 9. As noted, GLI3 was expressed in both the neurosensory retina and RPE; macular RPE expression was significantly greater by 2.54-fold (p=3.3 x 10 -15, after correction for multiple testing using Benjamini-Hockberg) than neurosensory retinal expression (Table 9). We also analyzed differential GLI3 expression between macular and peripheral RPE tissues to determine the region with greatest expression. Interestingly, we found that GLI3 expression was 1.72 fold higher in the peripheral RPE as compared to the macular RPE (p<0.01) which is notable given the presence of ROP most commonly within the peripheral retina.