Analysis of Genetically Determined Gene Expression Suggests Role of Inflammatory Processes in Etiology of Exfoliation Syndrome

Exfoliation syndrome (XFS) is an age-related systemic disorder characterized by excessive production and progressive accumulation of abnormal extracellular material, with pathognomonic ocular manifestations. It is the most common cause of secondary glaucoma, resulting in widespread global blindness. We performed Transcriptomic Wide Association Studies using PrediXcan models trained in 48 GTEx tissues to identify genetically-determined gene expression changes associated with XFS risk, leveraging on results from a global GWAS that included 123,457 individuals from 24 countries. Twenty-eight genes in the chr15q22-25 region showed statistically significant associations in both single-tissue and multi-tissue analysis. Reduced models for these genes that excluded variants in LD with the known LOXL1 signal showed genome-wide significant association signals in nine genes, independently of LOXL1 . Out of these ten genes, mRNA transcript levels for ARID3B , CD276 , LOXL1 , NEO1 , SCAMP2 , and UBL7 were significantly decreased in iris tissues from XFS patients compared to control samples. Genes with genetically determined expression changes in XFS were significantly enriched for genes associated with inflammatory conditions. We further explored the health consequences of high susceptibility to XFS using a large electronic health record and observed a higher incidence of XFS comorbidity with inflammatory and connective tissue diseases. Our results implicate a role for connective tissues and inflammation in the etiology of XFS. Targeting the inflammatory pathway may be a potential therapeutic option to reduce progression in XFS.


Introduction
Exfoliation syndrome (XFS) is an age-related systemic disorder characterized by excessive production and progressive accumulation of abnormal extracellular material, with pathognomonic ocular manifestations. 1,2 It is the most common cause of secondary glaucoma, resulting in widespread global blindness. 3 In addition to ocular manifestations, exfoliation syndrome deposits have been observed in visceral organs, such as the lung, kidney, liver and gallbladder. 2,4 In addition to elastic tissue disorders, XFS has also been associated with increased risk of vascular diseases. [5][6][7] Associations of XFS to several systemic biomarkers of inflammation, including complement components and homocysteine, have also been reported. 3,8,9 Genetic mechanisms have substantial influence on XFS etiology as evidenced in family and twin studies. 10,11 There have been eight genome-wide association studies (GWAS) of XFS, 7,12-18 three of which include meta-analysis, 7,12,13 that have cumulatively identified >60 associated genetic variants. The largest meta-analysis of XFS involved >123,000 individuals (13,620 XFS cases, 109,837 controls) from 24 countries across six continents and identified seven loci with the strongest association signal in chromosome 15 near the lysyl oxidase-like 1 gene (LOXL1). The signal on chr15 involved 54 potential causal variants. Overall, (i) two missense variants in LOXL1, rs1048661 (encoding LOXL1 p.Leu141Arg) and rs3825942 (p.Gly153Asp), are likely to confer risk of developing XFS, with very high heterogeneity across populations because the alleles show an effect reversal, 12,13,16,[19][20][21][22] (ii) the associated variants in the locus showed population-specific frequency and linkage disequilibrium (LD) patterns, 12,16,21 (iii) haplotypes that carry the risk alleles depending on the population are correlated with reduced LOXL1 expression levels, however, (iv) no clear functional effects for the haplotypes that represent the two variants have been shown. 7,13,23,24 The non-coding variants associated with 4 XFS at this chr15 locus could confer regulatory effects. Some of these non-coding variants regulate expressions of the sentinel LOXL1 and the neighboring STRA6 gene. 7,25,26 After considering all the reports on genetic architecture of XFS to date, we hypothesize that analysis of the contribution of the genetically-determined component of gene expression to XFS risk can provide a powerful method to elucidate genes involved in XFS. We used a genebased method, PrediXcan 27 , implemented on GWAS summary statistics (Summary PrediXcan; S-PrediXcan) 28 to identify genetically determined gene expression traits associated with disease risk. Models were trained on 48 GTEx tissues to estimate the correlation between geneticallydetermined gene expression and XFS risk, leveraging on GWAS summary statistics from a previously reported multi-ethnic study. 13 We also analyzed individual-level GWAS data from four additional European ancestry populations, two German, one Italian 7 and one American. 27 We performed extensive statistical analysis, followed by functional validation in human iris tissues of the prioritized top gene-level associations. Finally, to gain clinical insights into our findings, we explored the health consequences to individuals carrying high XFS genetic risk in a large biobank with links to electronic health records.

PrediXcan Analysis
We performed single-tissue PrediXcan analysis of the global multi-ethnic GWAS (13,620 XFS cases and 109,837 controls) summary data, identifying 23 genes (defined as signals with  Table   S2).
Seven additional genes located on chromosomes 1 (LGR6 p=2.20E-06; SDHB p=8.07E-08), 6 (PRRT1 p=9.10E-07), 8 (PRSS55 p=4.18E-13), 10 (CDH23 p=1.86E-07; PITRM1 p=8.45E-12) and 19 (CALM3 p=2.60E-07), were significantly associated in the multi-tissues analysis (Figure 1a, Suppl. Table S3). All seven signals mapped to genomic regions harboring GWAS SNP variants showing subgenome-wide significance with XFS risk, except for PRRT1, which corresponds to the AGPAT1 locus 13 The data indicates that combining information across variants in genes and then across tissue expression improves the power to identify additional XFS-associated loci. 6 To ensure that the association observed at the 23 genes from the larger multi-ethnic dataset was not an artefact of population structure, we confirmed the signals in a subset of European ancestry individuals (Materials and Methods). Twelve out of the 23 genes in chr15q22-25 region (ten in the single-tissue analysis and two in the multi-tissue analysis) remained genome-wide significant in this European ancestry analysis, whilst the remaining 11 genes remain nominally associated (p<0.05) (Suppl. Figure S3, Suppl. Table S4, S5, S6).

Correlated Expression Among Significant Genes
To determine whether the 23 observed association gene signals were artefacts of LD contamination, we performed extensive additional analyses. We calculated the pair-wise correlation in measured expression among the significant genes, using the reference GTEx panel.
We checked the relationship between expression correlation for each of the chr15p22-25 genes with LOXL1 and STOML1 and the PrediXcan associations for the two genes in each tissue. We made two important observations from this analysis. First, there was a significant correlation between the correlation of measured gene expression of the other genes in chr15p22-25 with LOXL1 or STOML1 and the gene-level associations with XFS in most tissues (Suppl. Table S7a, Suppl. Figure S4a). Secondly, there is substantial correlation between STOML1 and LOXL1 (r 2 = 0.67, p=0.009) (Suppl. Table S7a, Suppl. Figure S4a). These results indicate that the associations by one of the genes might be due to LD contamination or the presence of shared variants in the prediction models of the two genes ( Table 2).
To dissect the potential source of LD contamination in the PrediXcan analysis, we looked into the effect of the two GWAS missense variants implicated in XFS that have mostly been linked to LOXL1 and shown to play regulatory roles 23 , followed by the effect of LOXL1 and 7 STOML1 signals on chr15q22-25 region observed associations in each tissue. We also determined the effect of shared variants between prediction models for the genes in the region.
We modified our prediction models by excluding: i) rs3825942 missense variant, ii) rs4886776 intronic variant, which is at near perfect LD (pair-wise, r 2 =0.982) with the rs1048661 missense variant, and iii) all the variants in our gene models that were in LD with the two variants (r 2 > 0.1). The two missense variants had wide ranging effect on the genetically predicted expression of many chr15q22-25 region genes, with the largest effect on LOXL1. The strength of the association signals diminished in six of the nine tissues for which we had the gene's predicted expression. Association signals in three of these tissues fell below genome-wide threshold in the global dataset (Suppl. Table S7b). In addition, association signals for seven additional genes in the region besides STOML1 lost genome-wide significance; CD276 (2 tissues), COX5A, CYP1A1, LMAN1L, MPI, SCAMP2 and TBC1D21.
Interestingly, association signals for eight genes were strengthened, four of which  Table S7g). 29 Our results indicate that the missense variants have enhancing or diminishing effects on the PrediXcan association signals, in chr15q22-25, with XFS, consistent with allele reversal reported for the GWAS variants. 13 8 To check whether the association signals in the chr15q22-25 region for each tissue were independent of the 'sentinel' LOXL1 signal, we excluded, from the prediction models, variants that were in LD (r 2 >0.1) with any variants in LOXL1 or STOML1 tissue models. We also excluded variants that were shared between two or more genes in their original prediction models. Seven of the genes that were associated with XFS at genome-wide threshold in their original models showed diminished signals, including four below significance levels: UBL7, ISLR, LMAN1L and COX5A in reduced models (Suppl. Table S7e). Association signals for CYP1A1 and CYP1A2 were slightly diminished in reduced models relative to the original models, but remained at significant genome-wide thresholds (Suppl. Table S7e). However, association signals for six genes strengthened, four of which attained genome-wide association significance levels in the reduced models: INSYN1, CLK3, CYP1A1 and NEO1 (Suppl. Table   S7e). These shifts in association strength seem to be due to few variants that are either in LD with variants in LOXL1 and STOML1 models or are shared with other genes' models (Suppl. Tables S7e, S7f). However, these variants causing the shifts in association signals upon exclusion from the models, were not in LD with the missense rs3825942 variant (Suppl. Table   S7g). This indicates that there are signals of allele reversal independent of the known missense variants in LOXL1.
Excluding variants that are in LD with SNPs in the LOXL1/STOML1 models did not have any effect on the association signals for six genes that were associated with XFS at genome-wide threshold in the original models: CSK, STRA6, CD276, ARID3B, MPI & TBC1D21, with the latter three in testis, for which we had no models for both LOXL1 and STOML1 (Suppl. Table   S7e). The results indicate that some of the observed signals were artefacts of LD contamination from LOXL1 and STOML1 (ISLR, LMAN1L and COX5A), while some of the signals were 9 masked in the original models (INSYN1, CLK3, CYP1A1 and NEO1). There was inconsistent result for UBL7, where there was no effect in its association signal in a tissue, enhanced effect in another tissue, and diminished signal in two other tissues, one of which went below the genomewide threshold, albeit the reduced model had only a single variant in the prediction (Suppl. Table S7e).

Conditional Analysis
Conditional analysis was performed in tissues with any genome-wide significant chr15p22-25 region gene signals against the predicted gene expression for the strongest observed signals in the European subset. As in the global dataset, the strongest signals in the European dataset were at LOXL1 (Table 1, Suppl. Figure S3). In all nine of the 48 tissues with LOXL1 predicted expression, only the STOML1 gene showed a significant association signal (in addition to LOXL1) ( Table 1, Suppl. Figure S3). After conditioning on LOXL1 in these tissues, the STOML1 signal disappeared, but association signals at SCAMP2 and INSYN1 were observed in artery-aorta and lung tissues, respectively (Table 2, Figure 2a). This indicated that the association of STOML1 with XFS is an artefact of a strong LOXL1 signal, consistent with LOXL1 being the true signal and STOML1 a proxy signal. In addition, association signals for SCAMP2 and INSYN1 were masked by the LOXL1 signal.
In 17 tissues with STOML1 predicted gene expression, we observed significant association signals for 8 other genes (in addition to STOML1) ( Table 2 Figure S5).
However, INSYN1 has enhanced expression in brain tissues. 31,32 Collectively, these results suggest that some of the identified gene-level association signals between XFS and genetically imputed expression were driven by correlation to the strong LOXL1 and its "proxy" STOML1 signal.

Enrichment and pathway analysis
Genes at genome-wide significance (p<2.02e-7) and nominal significance (p<0.05) were evaluated for enrichment of known pathways, using Enrichr. 33,34 Genes at genome-wide significance were enriched for genes reported for, or mapped to, GWAS variants implicated in several caffeine-related (coffee and caffeine consumption, and caffeine metabolism 35-37 ) and blood pressure 38 traits. The enrichment for coffee consumption is replicated for the larger gene set that is associated with XFS at nominal significance. 39 Some of these genes, CYP1A1 and CYP1A2, 40 are involved in fatty acid oxidation and estrogen receptor pathways. In addition, these two genes are also observed in the Reactome enrichment of protectin synthesis (Table 3, Suppl.

Table S8).
Our gene set is also enriched for genes associated with carcinoma and three inflammatory conditions: rheumatoid arthritis, Type 1 diabetes, vitiligo in Jensen Diseases, a database that integrates evidence on disease-gene associations from automatic text mining, manually curated literature, cancer mutation data, and GWAS (https://diseases.jensenlab.org/). We further analyzed our gene list against compounds in Drug Signatures Database (DSigDB, http://tanlab.ucdenver.edu/DSigDB), a gene set resource that relates drugs/compounds and their target genes. Our gene set is enriched for genes that are targets of cyclosporin A (p=9.66E-11), and genes that are targets for compounds that are either 1) carcinogenic: Aflatoxin B1, potassium chromate, methyl methanesulfonate and copper sulfate, 2) neuroactive: valproic acid and methamphetamine, 3) neuroprotective: quercetin and epigallocatechin gallate, or 4) analgesic: acetaminophen ( Table 3, Suppl. Table S8). Cyclosporin A is an immunosuppressant taken to treat rheumatoid arthritis and other autoimmune conditions, while quercetin and acetaminophen have been shown to have anti-inflammatory effects. 41,42 Analysis in Gene Set Enrichment Analysis (GSEA) using a ranked association gene list based on effect sizes confirmed some of the enrichment observations using Enrichr. GSEA besides replicating enrichment for acetaminophen, showed enrichment for: 1) six synthetic estrogens, 2) estrogen regulators (Clomifene), 3) antiarrhythmic (quinidine), and 4) an antifungal (ketaconazole) ( Table 3, Suppl. Table S8). The gene set was also enriched for genes that were associated with the collagen fibril crosslinking (FDR=0.0313) Reactome pathway.
Analysis of the gene list in relation to the latest Reactome library (https://reactome.org/) returned significant enrichment for the endosomal-vacuolar pathway (p=8.14E-11), an enrichment that was replicated in gene sets that were predicted to be downregulated (p=3.24E-8). Our results broadly recapitulated results above, even after excluding genes in HLA and chr17 inversion regions from the enrichment analysis of the gene set (p<0.05) ( Table 3, Suppl. Table S8)

Quantitative Expression Validation Analysis
Expression levels of ARID3B, CD276, INSYN1, LOXL1, NEO1, SCAMP2, STOML1 and UBL7 were measured in XFS and control eye tissues. All transcript levels were found to be decreased in iris tissues obtained from XFS patients compared to control samples, with significant differences for ARID3B, CD276, LOXL1, NEO1, SCAMP2 and UBL7 (p<0.05) ( Figure 3). INSYN1 and STOML1 were not significantly downregulated in diseased eyes relative to normal eyes in validation analysis. STOML1 is the closest gene to and potentially proxy for LOXL1 among those that show association in our PrediXcan results within the chr15q22-25 region. We included it as a negative control in the validation analysis, while LOXL1 was a positive control considering that it had already been shown to exhibit pattern of downregulation in gene expression in diseased relative to normal tissues. 13 CD276 was selected for functional validation in eye tissue despite no significant association with XFS in single-tissue analysis in the European ancestry data because it was significantly associated with XFS in multi-tissue analysis in European data. In addition, it was one of the gene association signals which were not affected by excluding variants that were in LD with LOXL1/STOML1 model SNPs in the multiethnic global dataset. Overall, our validation results replicate the associations found using the genetically determined gene expression.

Comorbidity/pleiotropy analysis
To gain further biological insights into the gene associations we observed in our PrediXcan analysis, we performed logistic regression analysis of both XFS ICD9/10 diagnosis and Polygenic Risk Score generated from the multi-ethnic summary data across the BioVU individuals, Vanderbilt University's electronic health records database linked to genetic 13 information, as the target dataset (Materials and Methods). XFS diagnosis was associated with an increased risk of 96 phenotypes in BioVU, including 12 musculoskeletal phenotypes, 4 infectious diseases, and 1 cardiovascular phenotype. These results are consistent with higher comorbidity of diseases affecting inflammation, connective tissue, and the circulatory system in individuals with XFS (Suppl. Table S9a, Suppl. Figure S6a).
XFS polygenic score was not significantly associated with any phenotypes in the analysis (Suppl. Figure S6b). This potentially indicated that PRS generated from the global multi-ethnic GWAS summary might not be powered to detect association with traits in the EHR, and we might require scores from a more homogeneous and a much larger sample size. However, among the top PRS associations, we found several inflammatory diseases (Suppl. Table S9b), consistent with the enrichment results reported above.

Discussion
We performed gene-based association analysis using GWAS summary statistics and conducted extensive experimental validation of genes associated with XFS. From our PrediXcan analysis, we identified 35 associated genes with XFS, 23 in single-tissue analysis and the rest in multi-tissue analysis. To eliminate the possibility of false-positive results due to LD contamination, we performed extensive additional analyses. First, we performed PrediXcan analysis in reduced models removing variants in LD with the two LOXL1 missense variants associated with XFS, and variants in LOXL1/STOML1 models in both global multiethnic and a subset of European ancestry individuals. Secondly, we conducted conditional analysis of the significant signals in European ancestry individuals. Thirdly, we then filtered signals based on correlated gene expression, LD and shared eQTLs and confirmed thirteen genes to be associated with XFS. Finally, expression analysis in human iris tissues further confirmed six of these seven signals, which were significantly downregulated in diseased XFS relative to normal eye tissues; ARID3B, CD276, LOXL1, NEO1, SCAMP2 and UBL7.
Our results suggest potentially substantial roles of inflammation and environment in the etiology of XFS. All of the six genes prioritized here by prediction and extensive validation analyses have inflammatory roles. ARID3B, CD276, LOXL1 and NEO1 are immunoregulatory molecules involved in the interaction between different tumors and the immune system. [43][44][45][46] SCAMP2 is important in granule exocytosis, a process crucial in membrane fusion in normal cellular functions in diverse systems including the immune system's inflammatory response. [47][48][49] CD276 is involved in regulation of Ag-specific T cell-mediated immune responses and participates in the innate immunity-associated inflammatory response. 50,51 LOXL1 has also been implicated in fibrosis in response to inflammation in human breast cancer, 52 in liver and lungs in 15 model animals. [53][54][55] UBL7 encodes a member of the ubiquitin protein family, that is crucial in immune response and regulation of inflammatory response. [56][57][58] Genes that show significant association of predicted expression with XFS at nominal significance are enriched for genes associated with three inflammatory conditions: rheumatoid arthritis, Type 1 diabetes and vitiligo in the Jensen Diseases database, with genes associated with the former two conditions enriched even with HLA region excluded. This is also consistent with the enrichment we find in DSigDB and DrugBank for cyclosporin A, acetaminophen and quercetin, which are compounds that have anti-inflammatory effects. 59 Enrichment of predicted genes in this study in the polyunsaturated fatty acid (PUFA) and steroid derivatives: protectin (Reactome), omega fatty acid and estrogen (WikiPathways) are also consistent with the potential role of inflammation in XFS. Protectin, a derivative of PUFA including Omega-3 that are major components of fish oil, has an anti-inflammatory, antiamyloidogenic, and anti-apoptotic activities in human neural cells. [60][61][62] Omega fatty acid has been suggested as an IOP reducing supplements 63-65 because of its anti-inflammatory effects. 66 Association of steroid derivative, estrogen with glaucoma has been previously explored with higher levels of estrogen in reduction in IOP and conferring a possible reduced risk of glaucoma. 67 The synthetic form of estrogen, estradiol, has been shown in a rat glaucoma model to inhibit optic nerve axonal degeneration by inducing a protein that is crucial in protecting RGC from oxidative damage. 68,69 The association with inflammation is consistent with studies in limited numbers of XFS patients that found elevated inflammatory markers relative to controls, including cytokines, and markers such as interleukin-6 (IL-6) and IL-8, 70,71 tumor necrosis factor-α (TNF-α) and YKL-40. 72,73 However, there are conflicting results for high sensitivity C-reactive protein. 74,75 In addition, the XFS gene sets are enriched for genes that map to variants implicated in coffee and caffeine intake. Effects of caffeine consumption in the etiology of XFS have been studied, on the premise that coffee consumption increases plasma homocysteine levels that are speculated to enhance XFS material formation by contributing to vascular damage, oxidative stress, and extracellular matrix alterations. [76][77][78][79] Consumption of coffee has been reported to have both pro-and anti-inflammatory effects. 80 However, review of fifteen studies on the effect of coffee and caffeine on inflammation inferred the former had anti-inflammatory action, while the latter had complex effects on the inflammatory response with both proinflammatory and antiinflammatory responses reported. 81 Caffeine might have a neuroprotective role by regulating pathways that produce inflammatory molecules via adenosine receptors in brain cells. 82,83 Posttranscriptional regulation of LOXL1 gene expression has been also shown to be modulated by caffeine. 84 Globally, our results of the six novel functionally validated genes also confirm the role of Cylosporin_A regulate lysyl oxidase expression and collagen metabolism probably by inhibiting an isomerase involved in protein folding. [88][89][90] Other anti-inflammatory compounds identified from our enrichment analysis in the current study, epigallocatechin gallate, valproic acid, quercetin, ketoconazole and acetaminophen have also been shown to suppress collagen and/or are anti-fibrotic in variety of tissues by yet to be elucidated mechanism. [91][92][93][94][95] Moreover, coffee and caffeine inhibit collagen expression and deposition, and have anti-fibrotic effects by blocking expressions and/or by modulating effects of profibrotic factors. [96][97][98][99] Our results that show enrichment in crosslinking of collagen fibrils, a crucial constituent

Limitations of the study
This study has two main limitations. First, despite the fact that GTEx data for the 48 tissues represent the most comprehensive eQTL data set of human tissues, it does not constitute a complete representation of all human tissues and may fail to identify the real causal genes in the unsampled ocular tissue. However, we have confirmed from an ocular tissue database that novel signals identified in this study are robustly expressed in XFS relevant eye tissues. Moreover, recent analysis shows that the majority of the human body tissues exhibit higher degrees of tissue similarities. 110 In addition, it has been shown that most complex conditions, including XFS, might actually manifest in many diverse tissues in the body. 110 Second, only a third of the signals identified in the larger data were robustly confirmed in a European dataset at genome-wide significance. This raised the possibility that most of the initial signals identified an artefact of local LD leakage or shared eQTLs with the sentinel LOXL1/STOML1 signal. Using statistical validation with reduced models including no SNPs in LD with sentinel variants, we confirmed associations independent of LOXL1 for at least ten genes including seven that were experimentally validated. In addition, results from a recent study are consistent with two other association gene signals confirmed using multi-tissue analysis of European dataset and PrediXcan of reduced models in multi-ethnic global data, ISLR2 and STRA6. 25  shown to control critical checkpoints in inflammation and to promote an inflammatory environment. [111][112][113] In summary, our analysis of predicted gene expression and extensive functional analysis in eye tissue prioritized six genes in association with XFS. Our results further confirmed the role of connective tissues and highlighted the importance of inflammation in the etiology of XFS.
Thus, molecular elements that underlie the interaction of connective tissue biosynthesis and inflammatory pathways may play a central role in the etiology of XFS.

Materials and Methods
We used an extension of PrediXcan 28 that uses GWAS summary statistics, S-PrediXcan, 11 to analyze GWAS summary statistical data from a multi-ethnic GWAS study on XFS. 13 This dataset consisted of 13,620 XFS cases and 109,837 controls. We also performed PrediXcan on individual-level genetic data from two independent datasets comprising 4127 cases and 9075 controls. The first dataset comprised case and control samples from three cohorts of

Statistical Analysis
We used the gene-based method, PrediXcan, that provides a framework for correlating imputed gene expression with phenotype. 9 Gene expression prediction models for 48 different human tissues were trained using GTEx v7 data, subsampled to use only the European ancestry samples. Models with non-zero weights that met a set significance criterion (r > 0.10, q < 0.05) were retained. 27 Given the lack of eye tissue in the GTEx data, we performed PrediXcan analysis in all available tissues to leverage the shared regulatory architecture of gene expression across tissues. 114 We referred to the association analysis in each tissue between predicted expression and XFS as "single-tissue analysis." Because XFS is considered a systemic disorder, we also aggregated evidence across the different tissues to improve our ability to prioritize genes relative to a single unrelated tissue. We determined the joint effects of gene expression variation predicted across all 48 tissues using the Multi-Tissue PrediXcan (MultiXcan), a multivariate regression method that integrates evidence across multiple tissues taking into account the correlation between the tissues. 115,116 We refer to this association analysis as "multi-tissue analysis." We used S-PrediXcan 28 to analyze all GWAS summary statistic data from the multiethnic study of Aung, et al. 13 . Since the summary-based method has been shown to be conservative and tends to underestimate significance in cases where there is some linkage disequilibrium-structure mismatch between reference and study cohorts, 28 we retained and reported S-PrediXcan results that had a univariate S-PrediXcan P<0.0001. We used Bonferroni adjustment for multiple hypothesis testing. Genome-wide significance for a gene-level association in single-tissue and multi-tissue PrediXcan analysis were defined as p<2.02e-7 and p<3.02e-6, respectively.

Conditional Analysis and Linkage Disequilibrium Evaluation
To determine whether multiple association signals within the same locus are due to independent causal genes or statistical artefacts of correlation in measured expression and predicted gene expression for adjacent genes, 117 we examined the correlation in the gene expression among genome-wide significant genes in the reference GTEx data. We assumed that there is concordance in correlation in measured and predicted gene expressions, but depending on the quality of our predictions, correlation in predicted expression for a pair of genes may be 22 missed. We verified the extent of LD in the 1000 genomes database 118,119 between variants in the prediction models for significantly associated genes in each tissue.
To measure potential regulatory effects of the two classical LOXL1 missense variants in our PrediXcan analysis, we excluded them and all the variants in our gene models that were in LD with them (defined as pairwise r 2 > 0.1) to generate "reduced models". We predicted gene expression and performed association analysis using reduced models in both the global multiethnic and the European subset for the genes in chromosome 15 region. To assess whether the association signals in the chr15q22-25 region for each tissue are independent of the 'classical' LOXL1 signal, we excluded variants in the prediction models of genes in the region that were in LD (pairwise r 2 > 0.1) with any variant in the LOXL1 model. In tissues without a LOXL1 model (i.e., r 2 > 0.10, q < 0.05), we excluded variants for chr15q22-25 region genes that were in LD (r2>0.1) with variants in the STOML1 models. In addition, we excluded variants that were shared between prediction models for genes in the region. In each case, we performed association analysis using the reduced models and compared the results with the original models.
To determine whether additional genes within the region were significantly associated with XFS, independently of the most highly associated genes (LOXL1 and STOML1) identified in the primary analysis, we performed conditional analysis using the actual individual-level genotype data that included our BioVU cohort and a subset of Aung, et al. consisting of three European ancestry cohorts. For each tissue with a significant association, the conditional analysis was performed on the gene that was the most statistically significant as identified from the initial PrediXcan analysis. We generated genetically determined expression for each individual in the dataset and then performed association analysis using Genetic Association Analysis Under Complex Survey Sampling (SUGEN: version 8.8) 120 on the individual imputed gene expression data, including age, sex, first 5 principal components and relatedness in the regression model. A new logistic regression model was then fit to the case-control data by sequentially adjusting for the expression data of the top significant signals as a covariate. We then performed a metaanalysis for the PrediXcan summary statistics from the four datasets. We repeated this procedure until no genes in the region attained our threshold for statistical significance in the tissues tested (<0.05/total # of e-genes x # of tissues tested for each top round of tests).

Enrichment and pathway analysis
Genes that were predicted to be associated with XFS at genome-wide significance in both single-tissue and multi-tissue analysis, and at nominal significance (p<0.05) in single-tissue analysis were checked for enrichment of particular categories in several databases using the webbased enrichment tools, Enrichr. 33,34 This was done by using the strongest signal at nominal significance across the 48 tissues for each of the genes analyzed in PrediXcan. Enrichr implements Fisher's exact test and uses over 100 gene set libraries to compute enrichment. 33  including an on-column DNaseI digestion step using the RNase-free DNase Set (Qiagen). Firststrand cDNA synthesis and PCR reaction was performed as previously described. 128 Exonspanning primers (Eurofins Genomics, Ebersberg, Germany), designed with Primer 3 software (http://bioinfo.ut.ee/primer3/), are summarized in Suppl. Table S1. Quantitative real-time PCR was performed using the CFX Connect thermal cycler and software (Bio-Rad Laboratories, München, Germany). Probes were run in parallel and analysed with the ∆∆Ct method. Averaged data represent at least three biological replicates. Unique binding was determined with UCSC BLAST search (https://genome.ucsc.edu/) and amplification specificity was checked using melt curve, agarose gel and sequence analyses with the Prism 3100 DNA-sequencer (Applied Biosystems, Foster City, CA). For normalization of gene expression levels, mRNA ratios relative to the house-keeping gene GAPDH were calculated.
Group comparisons were performed using a Mann-Whitney U test using SPSS v.20 software (IBM, Ehningen, Germany). P < 0.05 was considered statistically significant.

Testing for Comorbidity/Pleiotropy
To determine the comprehensive health consequences of high genetic risk to XFS, we performed a phenome-wide association study (PheWAS). 129  BioVU with about 18k ICD-9 /ICD-10 codes, accounting for age, gender, and the first 5 principal components.

DISCLOSURES:
ERG receives an honorarium from the journal Circulation Research of the American Heart Association, as a member of the Editorial Board.
The remaining authors declare no competing interests.

ACKNOWLEDGMENTS:
We are grateful to Maria Niarchou and Tyne Fleming for comments on earlier version of the manuscript.
JH was jointly supported through grant to MAB (T32 grant 5T32EY021453), ERG and NJC.
ERG is grateful to the President and Fellows of Clare Hall, University of Cambridge for the fellowship support. ERG is also supported by a NIH Genomic Innovator Award (R35HG010718). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.   On the X axis is plot of variant/gene associations along the chromosomes, while Y axis represent the significance levels for the associations. The legend for PrediXcan analysis on the 48 GTEx tissues, a color for each tissue, is on the right. For both plots the blue dotted line is the "suggestive" genome-wide significant threshold (p<1e-4), while the red line is the genome-wide significant threshold. On the lower plot, the gene labels are for genes reported/mapped to genome-wide significant signals in GWAS result, while in the upper plot is for genes that are associated at "suggestive" significant threshold. b) genes in the region in chromosome 15 that show significant association. P values for the plot in Suppl. Table S2. C15orf59 is INSYN1. conditioned on STOML1 predicted gene expressions c) correlation in gene expression in for genes in chr15q22-25 in lung tissue for i) reference GTEx data ii) predicted gene expression in BioVU cohort. In each case on the X axis is plot of variant/gene associations along the chromosomes, while Y axis represent the significance levels for the associations. The legend for PrediXcan analysis on the GTEx tissues, a color for each tissue, is on the right. For both plots the blue dotted line is the "suggestive" genome-wide significant threshold (p<1e-4), while the red line is the genome-wide significant threshold. On the lower plot, the gene labels are for genes reported/mapped to genome-wide significant signals in GWAS result, while in the upper plot is for genes that are associated at "suggestive" significant threshold. P values for the plot in Suppl. Table S2. d) linkage disequilibrium between variants in prediction models for LOXL1 and neighboring genes in lung tissue based on pairwise r 2 . C15orf59 is INSYN1. Figure 3: Expression of NEO1, CD276, INSYN1, LOXL1, STOML1, UBL7, ARID3B and SCAMP2 mRNA in iris tissues derived from normal human donors (control) (n=19) and donors with XFS syndrome (n=12) using real-time PCR technology. Expression levels were reduced in XFS specimens compared to control specimens, with significant differences for NEO1, CD276, LOXL1, UBL7, ARID3B and SCAMP2. The relative expression levels were normalized relative to GAPDH and are represented as mean values ± SD (*p<0.05; **p<0.01, ***p<0.001). Table 2: Prioritization of genes that are associated with XFS: Each gene pair was checked if they had correlated predicted and expressed gene expressions, and LD checked for variants included in prediction models   C15orf59  TBC1D21  LOXL1  STOML1  PML  ISLR2  ISLR  STRA6  CCDC33  CYP11A1  SEMA7A  UBL7  ARID3B  CLK3  EDC3  CYP1A1  CYP1A2  CSK  LMAN1L  CPLX3  ULK3  SCAMP2  MPI  FAM219B  COX5A