In oncology, the number of publications using Mendelian randomization (MR) methods has increased steadily over the years. The earliest English-language publication appeared in 2003, and a new phase of rapid growth began in 2019. Our literature search shows that in 2018 the British Medical Journal (BMJ) published an article titled "Reading Mendelian randomisation studies: a guide, glossary, and checklist for clinicians." The article aims to help clinicians and practitioners understand and interpret the core concepts and latest advances in MR methods, and it offers guidance on reporting standards and the interpretation of results [10]. The appearance of this authoritative guidance may have contributed to the rapid growth of English-language publications and to the start of Chinese research. The large number of MR studies published in recent years indicates that more and more scholars are mastering this method and applying it to causal inference in oncology. This work could lead to groundbreaking research and provide a theoretical foundation for tracing the causes of cancer and for its prevention and treatment.
Author and institution statistics show that the most prominent collaboration network is the one formed by Gunter, Marc J; Le Marchand, Loic; Chang-Claude, Jenny; and Hopper, John L. The high-frequency authors and their affiliated institutions are mostly from the UK, Germany, the USA, and France. The top 10 authors by publication volume have each published more than 25 articles, and these authors use MR methods to explore co-occurrence in studies of cancer etiology. Among the publishing institutions, the betweenness centrality of each node is relatively low, meaning that few institutions act as bridges between otherwise separate groups: collaboration is concentrated within institutions, and collaboration between institutions is limited. Future research could strengthen inter-institutional collaboration to produce more cross-institutional results.
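To make the betweenness centrality argument concrete, the following minimal sketch computes normalized betweenness centrality for a toy collaboration network using Brandes' algorithm. The six institution labels and their links are invented for illustration and are not taken from the analyzed literature; the sketch also assumes an undirected, unweighted network, which may differ from the settings of the bibliometric software used here.

```python
from collections import deque


def betweenness_centrality(adjacency):
    """Normalized betweenness centrality for an undirected, unweighted graph.

    `adjacency` maps every node to the list of its neighbours.
    Implements Brandes' algorithm with one breadth-first search per node.
    """
    nodes = list(adjacency)
    score = {v: 0.0 for v in nodes}
    for source in nodes:
        # Forward pass: count shortest paths from `source` to every node.
        order = []
        predecessors = {v: [] for v in nodes}
        n_paths = {v: 0 for v in nodes}
        distance = {v: -1 for v in nodes}
        n_paths[source] = 1
        distance[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if distance[w] < 0:
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:
                    n_paths[w] += n_paths[v]
                    predecessors[w].append(v)
        # Backward pass: accumulate pair dependencies, farthest nodes first.
        dependency = {v: 0.0 for v in nodes}
        for w in reversed(order):
            for v in predecessors[w]:
                dependency[v] += n_paths[v] / n_paths[w] * (1.0 + dependency[w])
            if w != source:
                score[w] += dependency[w]
    n = len(nodes)
    # Every unordered pair was counted twice, so dividing by (n-1)(n-2)
    # rescales the scores to the range [0, 1].
    scale = 1.0 / ((n - 1) * (n - 2)) if n > 2 else 1.0
    return {v: score[v] * scale for v in nodes}


# Hypothetical network: two tightly linked groups joined by a single C-D link.
toy_network = {
    "A": ["B", "C"],
    "B": ["A", "C"],
    "C": ["A", "B", "D"],
    "D": ["C", "E", "F"],
    "E": ["D", "F"],
    "F": ["D", "E"],
}

centrality = betweenness_centrality(toy_network)
for institution, value in sorted(centrality.items()):
    print(institution, round(value, 3))
```

In this toy network only the two bridging nodes, C and D, obtain a non-zero score. A network in which every node scores low therefore has few such bridges, which is the pattern described above for the publishing institutions.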
The keyword analysis supports the following conclusions:
(1) Summary of the Nature and Methods of MR Research
High-frequency keywords such as "risk," "association," and "instruments" reflect the essence of MR research: exploring factors that may be causally related to disease. As the keyword "genome wide association" indicates, the method relies on data aggregated from genome-wide association studies (GWAS) [11].
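As an illustration of how aggregated GWAS data feed an MR analysis, the sketch below computes per-variant Wald ratios and combines them with a fixed-effect inverse-variance weighted (IVW) estimate. IVW is one commonly used MR estimator and is shown here as an assumption, not as the method of any particular cited study. The function names and all effect sizes are invented for illustration, and the sketch assumes independent variants that are valid instruments.

```python
import math


def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Causal estimate from a single variant, with a first-order standard error.

    beta_exposure: variant-exposure association from an exposure GWAS
    beta_outcome:  variant-outcome association from an outcome GWAS
    se_outcome:    standard error of beta_outcome
    """
    estimate = beta_outcome / beta_exposure
    standard_error = se_outcome / abs(beta_exposure)
    return estimate, standard_error


def ivw_estimate(variants):
    """Fixed-effect inverse-variance weighted combination of Wald ratios.

    `variants` is an iterable of (beta_exposure, beta_outcome, se_outcome).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for beta_exposure, beta_outcome, se_outcome in variants:
        estimate, standard_error = wald_ratio(beta_exposure, beta_outcome, se_outcome)
        weight = 1.0 / standard_error ** 2
        weighted_sum += weight * estimate
        weight_total += weight
    return weighted_sum / weight_total, math.sqrt(1.0 / weight_total)


# Invented summary statistics for three variants (not real GWAS results).
summary_statistics = [
    (0.10, 0.050, 0.01),
    (0.20, 0.100, 0.02),
    (0.05, 0.025, 0.01),
]

ivw_beta, ivw_se = ivw_estimate(summary_statistics)
print("IVW estimate:", round(ivw_beta, 4), "SE:", round(ivw_se, 4))
```

Each variant contributes in proportion to the precision of its ratio estimate, so variants with strong exposure effects and precisely measured outcome effects dominate the combined result.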
(2) Analysis of Key Cancer Types
Keyword statistics and cluster analysis of the literature identified the key cancer types as breast cancer, colorectal cancer, lung cancer, prostate cancer, gastric cancer, and renal cell carcinoma. According to epidemiological statistics published by GLOBOCAN 2022, lung cancer was the most common cancer in 2022, with approximately 2.5 million new cases, accounting for 12.4% of all new cancer cases. It was followed by female breast cancer (11.6%), colorectal cancer (9.6%), prostate cancer (7.3%), and gastric cancer (4.9%). Lung cancer also ranked first in the number of malignant tumor deaths among both men and women, and colorectal cancer, breast cancer, and gastric cancer ranked high in mortality as well [2]. For cancers with high incidence and mortality, inferring possible causes is therefore all the more urgent.
According to the keyword emergence map and cluster timeline map of the English literature, key cancer types such as breast cancer and colorectal cancer appeared in the early years. Over time, MR research has gradually expanded to cancer types that affect relatively smaller populations according to epidemiological statistics, such as ovarian cancer, endometrial cancer, and renal cell carcinoma.
(3) Discussion on Risk Factors
Through Mendelian randomization studies, especially those based on large-scale GWAS, researchers have systematically summarized and analyzed the relationships between previously reported risk factors and tumors, providing evidence for revising the risk factor profiles of various cancer types. For example, a study of modifiable risk factors for colorectal cancer summarized 39 potentially modifiable factors related to diet and lifestyle, fatty acid characteristics and metabolism, obesity, and inflammatory factors. Using genetic variants as proxies for the presumed risk factors, the study provided suggestive evidence for associations of increased body fat percentage, BMI, waist circumference, basal metabolic rate, adult height, serum vitamin B12 concentration, serum iron concentration, low-density lipoprotein cholesterol, and total cholesterol with increased colorectal cancer risk [12]. An MR study of breast cancer risk evaluated the potential causal relationships of 23 known and suspected risk factors and biomarkers with overall breast cancer risk and with molecular subtypes. It identified 15 significantly associated factors, including age at menarche, age at menopause, body mass index, waist-to-hip ratio, height, physical activity, smoking, sleep duration, morning preference, and six blood biomarkers [13].
Horizontally, the risk factors explored range from macro indicators such as diet, lifestyle, and BMI, through blood-based micro indicators such as sex hormones, to symptoms such as back pain, other diseases, and even socioeconomic status [12, 14–19]. The keyword and cluster analysis of cancer types shows that sex hormones and hormone-sensitive cancers, such as breast, prostate, ovarian, and endometrial cancer, are key research areas. Epidemiological data also show marked sex differences in the incidence of various cancers of non-reproductive organs, yet studies on the role of sex hormones in these cancers are relatively scarce. In the era of personalized medicine, understanding the molecular, cellular, and biological differences between males and females is necessary for developing treatment interventions better suited to each sex [20].
Vertically, more specific subgroup MR studies have also been conducted. In terms of cancer molecular subtypes, for example, MR research on breast cancer has found that the results for age at menopause and circulating sex hormone-binding globulin (SHBG) in triple-negative breast cancer are inconsistent with those in luminal A and B breast cancer, suggesting that the etiology of triple-negative breast cancer may differ from that of other breast cancer subtypes [13]. In terms of risk factors, BMI, for example, has been explored not only for its impact on specific cancers [21], but also in combination with different pathological subtypes [22], fat distribution [23], gender and age [24, 25], and race [26–28].
(4) Combined Analysis with Other Research Methods
MR research also has limitations. The instrumental variable (IV) assumptions imply that no adjustment for confounding is needed, but in practice a strong association between genetic variants and disease does not by itself establish a causal relationship. Combining MR with meta-analysis can make better use of the aggregated information from GWAS, strengthen causal inference, and provide more reliable evidence [21, 29, 30].
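One way such a combination can work is to pool MR estimates from several independent studies while quantifying how much they disagree. The sketch below uses a DerSimonian-Laird random-effects model with Cochran's Q and the I² statistic. It is an illustrative assumption about how pooling might be done, not the procedure of the cited studies, and the function name and input values are invented.

```python
import math


def random_effects_meta(estimates, standard_errors):
    """DerSimonian-Laird random-effects pooling of study-level estimates.

    Returns the pooled estimate, its standard error, Cochran's Q,
    the between-study variance tau^2, and the I^2 heterogeneity share.
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    fixed = sum(w * e for w, e in zip(weights, estimates)) / total_weight

    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(w * (e - fixed) ** 2 for w, e in zip(weights, estimates))
    df = len(estimates) - 1

    # Method-of-moments estimate of the between-study variance.
    c = total_weight - sum(w ** 2 for w in weights) / total_weight
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Re-weight each study by its within- plus between-study variance.
    re_weights = [1.0 / (se ** 2 + tau2) for se in standard_errors]
    pooled = sum(w * e for w, e in zip(re_weights, estimates)) / sum(re_weights)
    pooled_se = math.sqrt(1.0 / sum(re_weights))
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    return {"pooled": pooled, "se": pooled_se, "q": q, "tau2": tau2, "i2": i2}


# Invented MR estimates (log odds ratios) from three hypothetical studies.
result = random_effects_meta([0.20, 0.20, 0.20], [0.10, 0.10, 0.20])
print(result)

# Two strongly conflicting hypothetical studies, to show heterogeneity.
conflicting = random_effects_meta([0.0, 1.0], [0.1, 0.1])
print(conflicting)
```

When the studies agree, the between-study variance is zero and the result reduces to a fixed-effect pooled estimate. When they conflict, the pooled standard error widens, which guards against overstating the strength of the combined evidence.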