In this study, we developed and evaluated three machine learning models to predict the risk of endometrial lesions in premenopausal breast cancer patients undergoing TAM therapy. The best predictive performance was achieved by LASSO regression combined with logistic regression, which reached an accuracy of 0.853 and a precision of 0.917 using four easily accessible patient features. This model also showed high discriminative performance, with an AUC of 0.891 (95% CI: 0.777-1.000). The findings indicate that ultrasonographic features, duration of TAM administration, endometrial thickness, and colporrhagia symptoms are predictors of endometrial lesions.
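For readers less familiar with the two metrics quoted above, the following minimal sketch shows how accuracy and precision are computed from a confusion matrix. The function name and the counts are hypothetical: the counts were chosen only so that the arithmetic rounds to values like those reported, and they are not the study's actual validation results.

```python
def accuracy_and_precision(tp, fp, tn, fn):
    """Return (accuracy, precision) from confusion-matrix counts."""
    total = tp + fp + tn + fn
    if total == 0 or tp + fp == 0:
        raise ValueError("need at least one case and one positive prediction")
    accuracy = (tp + tn) / total   # share of all cases classified correctly
    precision = tp / (tp + fp)     # share of predicted lesions that are real
    return accuracy, precision


# Hypothetical counts (NOT taken from the study)
acc, prec = accuracy_and_precision(tp=11, fp=1, tn=18, fn=4)
print(round(acc, 3), round(prec, 3))  # prints: 0.853 0.917
```

With small validation sets, a single reclassified patient shifts both values noticeably, which is one reason the confidence interval around the AUC is wide.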
A national retrospective study of 102 breast cancer patients treated with TAM postoperatively found that the duration of TAM use and symptoms of abnormal colporrhagia were significant risk factors for developing endometrial lesions, consistent with our study. Additionally, a large body of epidemiologic evidence suggests that TAM is associated with an increased risk of endometrial lesions, with the risk of developing endometrial carcinoma (EC) being 1.5–6.9 times higher in a dose- and time-dependent manner10. The ATLAS study found that patients using TAM for 10 years had a higher cumulative risk of endometrial cancer than those using it for 5 years4. However, only 10% of the patients in the ATLAS study were premenopausal, which may limit the generalizability of these findings to premenopausal women.
Our study showed that the duration of TAM therapy was an independent risk factor for developing endometrial lesions, aligning with previous studies11. However, the cumulative dose of the drug was not clarified. Choi et al. demonstrated that benign endometrial disease incidence was highest in subjects under 40 years of age treated with TAM, significantly increasing the risk of endometrial cancer12. Similarly, Liu et al. found that the standardized incidence of endometrial cancer was elevated in breast cancer patients diagnosed after the age of 4013. Younger patients treated with TAM have a higher risk of subsequent endometrial cancer, particularly those aged 40-4914. Bergman’s study also indicated that endometrial cancers caused by TAM were more malignant and aggressive11. However, some studies have shown no correlation between TAM and endometrial lesions. For instance, Takashima15 found no significant association between shorter TAM therapy duration and endometrial lesions. Chiofalo and Chu also reported no correlation between TAM and endometrial cancer development14,16,17.
In our study, ultrasound characteristics were the most important factor in predicting endometrial lesions, consistent with previous studies. Ultrasound is the preferred monitoring tool, and space-occupying lesions or heterogeneous endometrial echogenicity on ultrasound increase the likelihood of endometrial lesions and of requiring endometrial biopsy. Previous NSABP studies, which included mainly postmenopausal women, suggested no additional monitoring for asymptomatic women to avoid unnecessary invasive procedures. However, this may underestimate the risk in premenopausal patients18,19. Young breast cancer patients undergoing prolonged TAM therapy require more attention. Endometrial screening and evaluation should be performed before TAM treatment, followed by regular transvaginal ultrasound monitoring to detect and manage endometrial lesions early.
Endometrial thickness was also a significant factor in endometrial lesion occurrence, with an optimal diagnostic threshold of 0.825 cm (8.25 mm), similar to previous findings by Zhouqi and Burkart2,20. Because TAM stimulates endometrial gland hypertrophy and thereby causes pharmacological thickening, establishing a TAM-related endometrial thickness threshold in young breast cancer patients is challenging.
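One common way to derive an "optimal" diagnostic cutoff for a continuous marker such as endometrial thickness is to maximize Youden's J (sensitivity + specificity - 1) over candidate cutoffs. Whether the threshold above was obtained this way is an assumption; the sketch below only illustrates the technique, and the function name and thickness values are hypothetical.

```python
def youden_threshold(values, labels):
    """Return (cutoff, J) maximizing sensitivity + specificity - 1.

    A case is called positive when its value is >= cutoff.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(values)):
        tp = sum(1 for v, y in zip(values, labels) if y == 1 and v >= cutoff)
        tn = sum(1 for v, y in zip(values, labels) if y == 0 and v < cutoff)
        j = tp / positives + tn / negatives - 1.0
        if j > best_j:  # keep the first cutoff that reaches the maximum
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Hypothetical thickness values in cm (1 = lesion, 0 = normal endometrium)
thickness = [0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.2, 1.4, 0.6]
lesion = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
cutoff, j = youden_threshold(thickness, lesion)
print(cutoff, round(j, 2))
```

Because the cutoff is tuned on the same patients it is evaluated on, a threshold chosen this way tends to look better in the derivation sample than it will in new patients.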
Colporrhagia was identified as an important risk factor. Patients with colporrhagia are more likely to develop endometrial lesions, and this symptom often prompts earlier hospital visits, which improves detection rates. However, Maria et al. found no difference in abnormal colporrhagia between the case group and patients with normal endometrium, highlighting the need for further research 21.
This study demonstrates that machine learning approaches can achieve high accuracy in predicting endometrial lesions. Most current clinical prediction models rely on linear relationships between variables, which can limit their predictive ability. Machine learning applications in medicine are becoming widespread, offering effective tools for clinical diagnosis and prediction. Our study visualized and predicted endometrial lesion incidence using machine learning, providing valuable insights for gynecologists evaluating premenopausal breast cancer patients during endocrine therapy. LASSO regression combined with multifactorial logistic regression helped limit overfitting, and validation showed a mean absolute error of 0.014 between predicted and actual values, suggesting that the model may be clinically useful for predicting endometrial lesions.
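The two-step idea of LASSO selection followed by logistic regression can be sketched as follows. This is an illustration under stated assumptions, not the study's pipeline: the software, the choice of penalty strength (typically made by cross-validation), and the data are not described in this section, so the toy data, the penalty `lam=0.05`, and all function names here are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lam=0.0, lr=1.0, iters=2000):
    """Logistic regression with an optional L1 (LASSO) penalty `lam`.

    Fitted by proximal gradient descent; the intercept is not penalized.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(iters):
        resid = [sigmoid(b + sum(wj * xj for wj, xj in zip(w, row))) - yi
                 for row, yi in zip(X, y)]
        b -= lr * sum(resid) / n
        for j in range(d):
            grad = sum(r * row[j] for r, row in zip(resid, X)) / n
            step = w[j] - lr * grad
            # Soft-thresholding: shrink toward zero; small values become exactly zero
            w[j] = math.copysign(max(abs(step) - lr * lam, 0.0), step)
    return w, b


# Toy data (hypothetical): feature 0 is associated with the outcome,
# feature 1 is balanced across outcomes and carries no information.
rows, labels = [], []
for x1, outcomes in ((1.0, (1, 1, 1, 0)), (-1.0, (0, 0, 0, 1))):
    for outcome in outcomes:
        for x2 in (1.0, -1.0):
            rows.append([x1, x2])
            labels.append(outcome)

# Step 1: the L1 penalty drives the uninformative coefficient to zero
w_l1, _ = fit_logistic(rows, labels, lam=0.05)
selected = [j for j, wj in enumerate(w_l1) if wj != 0.0]

# Step 2: refit an unpenalized logistic model on the selected features only
X_sel = [[row[j] for j in selected] for row in rows]
w_refit, b_refit = fit_logistic(X_sel, labels, lam=0.0)
print(selected, [round(wj, 3) for wj in w_refit])
```

The refit in step 2 removes the shrinkage that the penalty applies to the retained coefficients, so the final odds ratios are interpretable in the usual way. Selecting and refitting on the same data can still be optimistic, which is why external validation matters.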
However, this study has limitations. It was a single-center retrospective study with limited data collection and a small sample size, which could introduce selection and recall biases. More objective indicators and larger sample sizes are needed to clarify endometrial thickness criteria. The model also lacks external validation. Future studies should incorporate more comprehensive factors and explore joint neural network prediction models to provide a basis for individualized endocrine therapy in premenopausal breast cancer patients.
In conclusion, our study identified ultrasound characteristics, TAM duration, colporrhagia, and endometrial thickness as independent risk factors for endometrial lesions in premenopausal breast cancer patients. We developed a predictive risk model using machine learning, with LASSO regression combined with multifactorial logistic regression showing the best performance. Regular monitoring of these factors can aid in early detection and reduction of endometrial lesions, providing a basis for evaluating endocrine therapy, endometrial monitoring during treatment, and individualized therapeutic strategies for breast cancer patients.