Background: Lymph node involvement is an integral part of breast cancer prognosis and survival. This study aimed to explore the factors influencing the number of involved axillary lymph nodes in women diagnosed with primary breast cancer, choosing an efficient model to account for the excess zeros and over-dispersion present in the study population.
Methods: The study is based on a retrospective analysis of the hospital records of 5,196 female breast cancer patients in Pakistan. Zero-inflated Poisson and zero-inflated negative binomial models were used to assess the association between the factors under study and the number of involved lymph nodes.
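To illustrate the two distributions named in the Methods, the following is a minimal sketch and not the authors' analysis code. It assumes the common NB2 parameterization of the negative binomial (mean `mu`, variance `mu + alpha * mu**2`); the function names and the parameter values `pi`, `mu`, and `alpha` are hypothetical and are not estimates from the study.

```python
import math


def poisson_pmf(k, mu):
    """Poisson probability of count k with mean mu (computed in log space)."""
    return math.exp(k * math.log(mu) - mu - math.lgamma(k + 1))


def negbin_pmf(k, mu, alpha):
    """Negative binomial probability of count k.

    Assumed NB2 parameterization: mean mu, variance mu + alpha * mu**2.
    """
    r = 1.0 / alpha  # dispersion expressed as the "size" parameter
    log_p = (
        math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
        + r * math.log(r / (r + mu))
        + k * math.log(mu / (r + mu))
    )
    return math.exp(log_p)


def zero_inflated_pmf(base_pmf, k, pi, *params):
    """Mix a point mass at zero (weight pi) with a count distribution.

    pi is the probability of a "structural" zero, e.g. a patient who
    cannot have involved nodes; the remaining 1 - pi follow base_pmf.
    """
    base = (1.0 - pi) * base_pmf(k, *params)
    return pi + base if k == 0 else base


def moments(base_pmf, pi, *params, k_max=500):
    """Mean and variance of the zero-inflated distribution by direct summation."""
    probs = [zero_inflated_pmf(base_pmf, k, pi, *params) for k in range(k_max)]
    mean = sum(k * p for k, p in enumerate(probs))
    var = sum((k - mean) ** 2 * p for k, p in enumerate(probs))
    return mean, var


# Hypothetical values chosen only for illustration.
pi, mu, alpha = 0.4, 3.0, 0.8

zip_mean, zip_var = moments(poisson_pmf, pi, mu)
zinb_mean, zinb_var = moments(negbin_pmf, pi, mu, alpha)

print("ZIP : mean=%.3f variance=%.3f" % (zip_mean, zip_var))
print("ZINB: mean=%.3f variance=%.3f" % (zinb_mean, zinb_var))
```

With the same zero-inflation probability and the same mean, the zero-inflated negative binomial has a larger variance than the zero-inflated Poisson because of the additional dispersion parameter `alpha`. This is why it can accommodate counts that remain over-dispersed after the excess zeros are accounted for.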
Results: The most common type of breast cancer was invasive ductal carcinoma (54.5%). The patients' median age was 48 years, and women aged 46 years and above made up the majority of the study population (64.8%). Examination of tumors revealed that 2,662 (51.2%) women were ER-positive, 2,652 (51.0%) were PR-positive, and 2,754 (53.0%) were HER2/neu-positive. The mean tumor size was 3.06 cm, and histological grade I (n = 2,021, 38.9%) was the most common in this sample.
The zero-inflated negative binomial model showed the best performance. The findings indicate that most of the factors examined are significantly associated with the number of involved lymph nodes. Age did not contribute to lymph node status. Women with larger tumors had a greater number of involved lymph nodes. Tumor grades II and III were associated with higher numbers of positive lymph nodes.
Conclusions: Zero-inflated models demonstrated an advantage in fitting nodal count data when both the “at-harm” (lymph node involvement) and “not-at-harm” (no lymph node involvement) groups are important in predicting disease onset and disease progression. Our analysis showed that the zero-inflated negative binomial (ZINB) model is the best model for predicting and describing the number of involved nodes in primary breast cancer when over-dispersion arises from a large number of patients with no lymph node involvement. This matters for accurate prediction in both the therapy and the prognosis of breast cancer patients.