2.1 GWAS summary data collection and IV selection
We searched PubMed for GWASs of LTL. From these, we selected two: the GWAS most frequently cited in Mendelian randomization studies (up to December 2020) and the most recent GWAS, which also had the largest sample size. The IVs identified from the former are referred to as IV-1 and those from the latter as IV-2.
IV-1 was derived from a meta-analysis of 6 studies including 9190 European individuals (aged 18-95) [11]. Telomere length was measured as a continuous variable by Southern blot analysis of the terminal restriction fragment, which is the current gold standard for LTL measurement [12, 13]. This is also the most recent and largest GWAS to adopt this method. The meta-analysis adjusted for age, sex, and smoking, and the mean telomere length in this study was 6.83 ± 0.65 kb (mean ± SD).
The IVs were selected following the approach most commonly used in Mendelian randomization studies of LTL, the details and quality of which have been described by Haycock et al. and repeatedly validated in large-scale MR analyses [3, 4, 14]. Briefly, they were single nucleotide polymorphisms (SNPs) associated with LTL at the genome-wide significance level, whose effect sizes and standard errors on LTL were compiled by Mangino et al. [11]. Sixteen SNPs across 10 loci were included after excluding loci with obvious between-study heterogeneity; together they could explain 9.4% of the genetic variation in LTL. The F statistic was 18-28 for each SNP [11] (Supplementary Table 1). An F statistic greater than 10 is considered to indicate a strong instrument and the absence of weak-instrument bias [15].
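As an illustration of the instrument-strength check described above, the following minimal sketch computes a per-SNP F statistic and flags weak instruments. It assumes the common approximation F = (beta / SE)^2, which is not necessarily the formula used in the source GWAS; the function name and the SNP values are hypothetical.

```python
def f_statistic(beta_exposure: float, se_exposure: float) -> float:
    """Approximate per-SNP F statistic as (beta / se)^2 (an assumed formula)."""
    if se_exposure <= 0:
        raise ValueError("standard error must be positive")
    return (beta_exposure / se_exposure) ** 2


# Hypothetical SNP-exposure estimates (illustrative values, not study data).
snp_estimates = {
    "rs_example_1": (0.10, 0.02),
    "rs_example_2": (0.05, 0.02),
}

# F statistic per SNP, and the SNPs that fail the conventional F > 10 rule.
strength = {name: f_statistic(b, se) for name, (b, se) in snp_estimates.items()}
weak_instruments = [name for name, f in strength.items() if f <= 10]
```

Under this approximation, an SNP is treated as a strong instrument only when its F statistic exceeds 10, matching the threshold cited above.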
We obtained IV-2 from the most recent and largest GWAS of LTL, which included up to 78,592 European participants from the EPIC-InterAct, EPIC-CVD and ENGAGE Consortium. Mean LTL was measured as a continuous variable by quantitative PCR, which expresses LTL as the ratio of the telomere repeat number (T) to a single-copy gene (S) [16]. Participants were aged 18 to 106, and the GWAS adjusted for age and sex. Twenty SNPs were reported to be associated with LTL at P < 5 × 10⁻⁸ after FDR correction; together they could explain 1%-2% of the genetic variation in LTL. The F statistic of each SNP ranged from 27 to 205 (Supplementary Table 1).
To obtain outcome data from a population of the same ancestry, we used summary data from the most recently published ALS GWAS, which genotyped and imputed more than 10 million SNPs in up to 20,806 ALS cases and 59,804 controls. All patients had symptom onset after 18 years of age and were diagnosed with probable or definite ALS according to the El Escorial criteria [17].
We selected independent SNPs as IVs, requiring r² < 0.001 and minor allele frequency (MAF) > 0.05. SNPs that could not be found in the ALS GWAS summary data were replaced with proxy SNPs in strong linkage disequilibrium (LD) (r² > 0.9), identified using the SNiPA website (http://snipa.helmholtz-muenchen.de/snipa3/). If no proxy SNP was available, the SNP was excluded from downstream MR analysis. This yielded 10 independent SNPs in IV-1 and 15 SNPs in IV-2. The procedure is summarized in the flowchart (Supplementary Fig. 1).
2.2 Two-sample MR
For the SNPs in IV-1 and IV-2, we extracted the effect allele, the other allele, effect size, standard error, and P value from the ALS summary data. We then harmonized the SNP effects on LTL and ALS so that both referred to the same effect allele.
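The harmonization step can be sketched as follows. This is a simplified illustration, not the procedure actually used in the analysis: the function name is hypothetical, and the sketch does not handle strand flips or palindromic SNPs, which a full harmonization routine would need to address.

```python
from typing import Optional


def harmonize_outcome_beta(
    exp_effect: str,
    exp_other: str,
    out_effect: str,
    out_other: str,
    out_beta: float,
) -> Optional[float]:
    """Return the outcome effect aligned to the exposure effect allele.

    Returns None when the allele pairs are incompatible, so the SNP can be
    dropped. Strand flips and palindromic SNPs are NOT handled here.
    """
    exposure = (exp_effect.upper(), exp_other.upper())
    outcome = (out_effect.upper(), out_other.upper())
    if exposure == outcome:
        return out_beta        # same effect allele: keep the sign
    if exposure == outcome[::-1]:
        return -out_beta       # alleles swapped: flip the sign
    return None                # alleles do not match: exclude the SNP
```

For example, if the LTL GWAS reports effect allele A (other allele G) and the ALS GWAS reports effect allele G (other allele A), the ALS effect is multiplied by -1 before the two estimates are combined.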
MR analysis rests on the following 3 assumptions: assumption 1, the selected genetic variants are significantly associated with the exposure; assumption 2, the selected genetic variants are not associated with confounders of the exposure-outcome relationship; and assumption 3, the selected genetic variants affect the risk of the outcome only through the exposure [18].
For MR, we used the fixed-effects inverse variance weighted (IVW) method as the main approach to estimate the overall causal relationship between the exposure and ALS from the effects of the SNPs on LTL and on ALS [19]. To validate the IVW results, we applied the weighted median method, simple median method [20], MR Egger method [21] and MR-PRESSO method as sensitivity analyses. To test for potential pleiotropy, we used the MR Egger method, which indicates pleiotropy when its intercept deviates significantly from zero, and MR-PRESSO analysis, which detects the influence of outliers [22]. Heterogeneity among the SNPs used in the IVW estimates was assessed with Cochran's Q test, where a P value below the significance threshold suggests heterogeneity. Leave-one-out analysis and single SNP analysis were used to evaluate the robustness of significant results and the possibility that they were driven by a single SNP. We also calculated F statistics for the IVs to assess their strength. We applied the MR Steiger method to explore a potential reverse causal effect of ALS on the exposure [23]. Statistical power was calculated with a publicly available online tool (https://shiny.cnsgenomics.com/mRnd/). All analyses were performed in R software version 3.6.3 [24]. After Bonferroni correction, P < 0.025 (0.05/2) was considered statistically significant.
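To make the main estimator concrete, the sketch below computes a fixed-effects IVW estimate and Cochran's Q from summary statistics. It is written in Python for illustration only (the analyses themselves were run in R), it assumes the standard first-order weights beta_x² / se_y², and all names are hypothetical. The P value for Q (a chi-square test on n - 1 degrees of freedom) is omitted to keep the example dependency-free.

```python
import math
from typing import NamedTuple, Sequence


class IVWResult(NamedTuple):
    beta: float     # pooled causal estimate
    se: float       # fixed-effects standard error
    p_value: float  # two-sided P value from a normal approximation
    q: float        # Cochran's Q statistic
    q_df: int       # degrees of freedom for Q (number of SNPs - 1)


def ivw_fixed(
    beta_x: Sequence[float], beta_y: Sequence[float], se_y: Sequence[float]
) -> IVWResult:
    """Fixed-effects IVW estimate; assumes every beta_x is non-zero."""
    n = len(beta_x)
    if not (n == len(beta_y) == len(se_y)) or n < 2:
        raise ValueError("need at least two SNPs with matching inputs")
    # Per-SNP Wald ratios and their inverse-variance weights.
    ratios = [by / bx for bx, by in zip(beta_x, beta_y)]
    weights = [(bx / sy) ** 2 for bx, sy in zip(beta_x, se_y)]
    total = sum(weights)
    beta = sum(w * r for w, r in zip(weights, ratios)) / total
    se = math.sqrt(1.0 / total)
    p_value = math.erfc(abs(beta / se) / math.sqrt(2.0))
    # Cochran's Q: weighted squared deviations of the ratios from the pool.
    q = sum(w * (r - beta) ** 2 for w, r in zip(weights, ratios))
    return IVWResult(beta, se, p_value, q, n - 1)


# Bonferroni-corrected threshold for two instrument sets (IV-1 and IV-2).
BONFERRONI_ALPHA = 0.05 / 2

# Hypothetical summary statistics (illustrative values, not study data).
result = ivw_fixed([0.1, 0.2, 0.3], [0.2, 0.4, 0.6], [0.05, 0.05, 0.05])
significant = result.p_value < BONFERRONI_ALPHA
```

In this toy input every SNP has the same Wald ratio, so the pooled estimate equals that ratio and Q is essentially zero, which is the pattern expected in the absence of heterogeneity.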