Selection of studies
We searched MEDLINE (via PubMed), Google Scholar and Science Direct for association studies published up to August 17, 2019. The search terms were “Lysyl oxidase”, “LOX”, “polymorphism” and “cancer”, used both as medical subject headings and as text words. References cited in the retrieved articles were also screened manually to identify additional eligible studies. Inclusion criteria were: (1) case-control studies evaluating the association between LOX polymorphisms and cancer risk; and (2) sufficient genotype frequency data to calculate odds ratios (ORs) and 95% confidence intervals (CIs). Exclusion criteria were: (1) reviews; (2) studies whose control frequencies deviated from the Hardy-Weinberg Equilibrium (HWE); (3) studies that were not case-control in design; and (4) unusable genotype data.
Data extraction
Two investigators (RM and NP) independently extracted data. A third investigator (PT) adjudicated disagreements until consensus was reached. The following information was obtained from each publication: first author’s name, publication year, country of origin, ethnicity, cancer type, study design, whether controls were matched to cases and, if so, the matching criteria used, sample sizes and genotype frequencies.
Methodological quality of the studies
We used the Clark-Baudouin (CB) scale to evaluate the methodological quality of the included studies [8]. The CB criteria include P-values, statistical power, corrections for multiplicity, comparative sample sizes between cases and controls, genotyping methods and the HWE. On this scale, scores of < 5, 5–6 and ≥ 7 indicate low, moderate and high quality, respectively.
Data distribution and power calculations
Data distribution was assessed with the Shapiro-Wilk (SW) test using SPSS 20.0 (IBM Corp., Armonk, NY, USA). Normally distributed data (P > 0.05) were expressed as mean ± standard deviation (SD); otherwise, the median (with interquartile range) was used. We evaluated statistical power with the G*Power program [9], because adequate power bolsters the level of associative evidence. Assuming a genotypic risk (OR) of 1.5 at α = 0.05 (two-sided), power was considered adequate at ≥ 80%.
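The power criterion can be illustrated with a minimal sketch. It uses a normal approximation for comparing two proportions, which is not necessarily the routine that G*Power applies, and the function names and the control carrier frequency `p_control` are hypothetical inputs introduced here for illustration only.

```python
import math
from statistics import NormalDist

def approx_power(n_cases, n_controls, p_control, odds_ratio=1.5, alpha=0.05):
    """Approximate two-sided power to detect `odds_ratio` (normal approximation).

    p_control is the carrier frequency among controls; it is a hypothetical
    input, not a value reported in the text.
    """
    # Carrier frequency among cases implied by the assumed OR.
    odds_case = odds_ratio * p_control / (1 - p_control)
    p_case = odds_case / (1 + odds_case)

    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    # Standard errors of the difference in proportions under H0 and H1.
    p_bar = (n_cases * p_case + n_controls * p_control) / (n_cases + n_controls)
    se_null = math.sqrt(p_bar * (1 - p_bar) * (1 / n_cases + 1 / n_controls))
    se_alt = math.sqrt(p_case * (1 - p_case) / n_cases
                       + p_control * (1 - p_control) / n_controls)
    # Probability of rejecting H0 in the direction of the true effect.
    diff = abs(p_case - p_control)
    return std_normal.cdf((diff - z_alpha * se_null) / se_alt)

def is_adequate(power, threshold=0.80):
    """Apply the >= 80% adequacy criterion stated in the text."""
    return power >= threshold
```

Under these assumptions, a study with 1,000 cases and 1,000 controls and a control carrier frequency of 0.3 would be classed as adequately powered, whereas one with 100 per group would not.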
HWE
Using the application at https://ihg.gsf.de/cgi-bin/hw/hwa1.pl, we assessed the HWE in the controls and reported the P-value from Pearson's goodness-of-fit χ2 test. A P-value of < 0.05 indicated deviation from the HWE. Deviations were found in six studies [10–15], which were therefore excluded from the analysis (Table S1). Table S2 includes a column showing the non-significant HWE P-values (P > 0.05) of the HWE-compliant studies.
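For readers who want to reproduce the HWE check without the web application, a minimal sketch of Pearson's goodness-of-fit test on control genotype counts follows. The function name is hypothetical, the test uses 1 degree of freedom for a biallelic locus, and the code assumes both alleles are observed among the controls, since an expected count of zero would otherwise cause a division by zero.

```python
import math

def hwe_chi2(n_aa, n_ab, n_bb):
    """Pearson goodness-of-fit test for HWE from genotype counts (1 df).

    Returns (chi2, p_value). Assumes both alleles are present in the sample.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1 - p                         # frequency of allele B
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

def deviates_from_hwe(n_aa, n_ab, n_bb, alpha=0.05):
    """Apply the P < 0.05 exclusion rule stated in the text."""
    return hwe_chi2(n_aa, n_ab, n_bb)[1] < alpha
```

Controls with counts of 25, 50 and 25 match the expected proportions exactly and are retained, whereas counts of 50, 0 and 50 (no heterozygotes) are flagged as deviating.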
Data synthesis
Cancer risks (ORs and 95% CIs) were estimated for each study using the following genetic models: (i) homozygous [H], (ii) recessive [R], (iii) dominant [D] and (iv) codominant [C]. Comparing the effects on the same baseline, we calculated pooled ORs and 95% CIs. In addition to the overall analysis, we examined subgroups by ethnicity (Asians: 3,834 cases/4,061 controls) and by cancer type. The latter was stratified into digestive (1,453 cases/1,546 controls) and breast (935 cases/923 controls) cancers. Strength of evidence was assessed using three indicators. First, the magnitude of effect is greater or smaller when the pooled OR is farther from or closer to the null value of 1.0, respectively [16]. Second, the P-value was interpreted in terms of the Bayes Factor (BF), which weighs evidence for both the null and alternate hypotheses, whereas the P-value by itself addresses the null hypothesis only [17]. P-values of 0.05 and 0.001 correspond to minimum BFs of about 0.15 and 0.005, indicating moderate and strong (to very strong) evidence, respectively [18]. The BF rests on the likelihood paradigm [19], in which the strength of the hypotheses is judged from the data [17]. Accordingly, the likelihoods of the absence (null hypothesis) and presence (alternate hypothesis) of an association of LOX with cancer are compared. Third, homogeneity is preferred to heterogeneity, although heterogeneity is unavoidable [20], because conclusions drawn from homogeneous data have greater evidential strength than those drawn from heterogeneous data. The presence of heterogeneity between studies was therefore assessed with the χ2-based Q test [21], with the threshold of significance set at Pb < 0.10. Heterogeneity was quantified with the I2 statistic, which measures variability between studies [22]. I2 values of > 50% indicate more variability than those ≤ 50%, with 0% indicating zero heterogeneity (homogeneity).
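Two of the calculations in this paragraph can be sketched briefly. The genotype contrasts below are one common way of defining the four genetic models and are an assumption, since the text does not spell them out; in particular, the codominant model is taken here as the allele contrast. The CI uses the standard log-OR (Woolf) approximation, and the minimum BF uses the formula exp(−z²/2), assumed here to underlie the thresholds quoted from [18]. All function names are hypothetical.

```python
import math
from statistics import NormalDist

def odds_ratio(a, b, c, d):
    """OR and 95% CI from a 2x2 table.

    a, b: cases with / without the risk category
    c, d: controls with / without the risk category
    """
    or_ = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of log(OR)
    lower = math.exp(math.log(or_) - 1.96 * se)
    upper = math.exp(math.log(or_) + 1.96 * se)
    return or_, lower, upper

def model_tables(cases, controls):
    """2x2 tables (a, b, c, d) per genetic model (assumed definitions).

    cases, controls: (wild-type homozygote, heterozygote, variant homozygote).
    """
    c0, c1, c2 = cases
    k0, k1, k2 = controls
    return {
        "homozygous": (c2, c0, k2, k0),                  # var/var vs wt/wt
        "recessive": (c2, c0 + c1, k2, k0 + k1),         # var/var vs rest
        "dominant": (c1 + c2, c0, k1 + k2, k0),          # carriers vs wt/wt
        "codominant": (2 * c2 + c1, 2 * c0 + c1,
                       2 * k2 + k1, 2 * k0 + k1),        # allele contrast
    }

def minimum_bayes_factor(p_value):
    """Minimum BF exp(-z^2 / 2) for a two-sided P-value."""
    z = NormalDist().inv_cdf(1 - p_value / 2)
    return math.exp(-z * z / 2)
```

For hypothetical counts of (50, 30, 20) in cases and (60, 30, 10) in controls, the dominant table is (50, 50, 40, 60), which gives an OR of 1.5.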
Evidence of functional similarities in the population features of the studies warranted use of the fixed-effects (Fe) model [23]; otherwise, the random-effects (Re) model [24] was used. Sensitivity analysis, which involves omitting one study at a time and recalculating the pooled OR, was used to test the robustness of the summary effects. We did not assess publication bias because none of the comparisons had ≥ 10 studies; with fewer studies, publication bias tests have low sensitivity [25]. Except for heterogeneity estimation [21], two-sided P-values of ≤ 0.05 were considered significant. All associative outcomes were Bonferroni-corrected. Data for the meta-analysis were analyzed using Review Manager 5.3 (Cochrane Collaboration, Oxford, England), SIGMASTAT 2.03 and SIGMAPLOT 11.0 (Systat Software, San Jose, CA).
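The pooling, heterogeneity and sensitivity steps described in this section can be illustrated with a minimal sketch. It assumes inverse-variance fixed-effect pooling of log ORs, which may differ from the default weighting in Review Manager, and it does not cover the random-effects model. The function names are hypothetical.

```python
import math

def pool_fixed(log_ors, ses):
    """Inverse-variance fixed-effect pooling with Cochran's Q and I2.

    log_ors: per-study log odds ratios; ses: their standard errors.
    """
    weights = [1 / se ** 2 for se in ses]
    total_w = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, log_ors)) / total_w
    se_pooled = math.sqrt(1 / total_w)
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, log_ors))
    df = len(log_ors) - 1
    # I2 as a percentage, truncated at 0 when Q does not exceed its df.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return {
        "or": math.exp(pooled),
        "ci": (math.exp(pooled - 1.96 * se_pooled),
               math.exp(pooled + 1.96 * se_pooled)),
        "q": q,
        "df": df,
        "i2": i2,
    }

def leave_one_out(log_ors, ses):
    """Pooled OR recomputed with each study omitted in turn."""
    return [
        pool_fixed(log_ors[:i] + log_ors[i + 1:], ses[:i] + ses[i + 1:])["or"]
        for i in range(len(log_ors))
    ]
```

A summary effect would be regarded as robust when the leave-one-out ORs stay on the same side of 1.0 and remain significant, which is the sense of the sensitivity analysis described above.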