Patients and procedures
This retrospective hospital-based case-control study investigated the association between breast volume and the risk of breast cancer among women. It involved 208 cases and 340 controls recruited between March 2018 and May 2019 at two hospitals in Guangdong Province, one of which was a tertiary hospital with nearly 3,000 beds. Eligible participants completed a questionnaire under the supervision of trained interviewers; therefore, loss of sample size due to incomplete interviews was not considered.
The case group comprised adult women who had recently been diagnosed with breast cancer by pathological examination and were preparing for surgery or adjuvant chemotherapy. Women who were pregnant or breast-feeding, who had a history of breast cancer surgery, breast masses, or breast augmentation, or who had communication difficulties were excluded. The control group comprised healthy adult women who underwent health screening examinations at the hospital, including mammography (molybdenum-target X-ray), to confirm that they did not have breast cancer. Women found to have breast cancer within one month, women who were pregnant or breast-feeding, and women who had undergone breast augmentation, had suspected breast cancer, or had communication difficulties were excluded.
The case and control groups were matched at a ratio of 1:1. To achieve a power of 85% with a two-tailed type I error rate of α = 0.05, each group required at least 181 participants. To allow for unsuccessful propensity score matches, we recruited more cases than this minimum and a larger number of controls so that an optimal match could be obtained.
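The sample size statement above can be illustrated with the standard normal-approximation formula for comparing two independent proportions. This is only a sketch: the text does not report the effect size used, so the exposure proportions passed to the hypothetical function `n_per_group` are illustrative assumptions, and the result is not expected to reproduce the figure of 181 per group.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_group(p_control, p_case, alpha=0.05, power=0.85):
    """Participants needed per group to detect p_case vs. p_control.

    Uses the usual normal-approximation formula for two independent
    proportions with equal group sizes and a two-tailed test.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value
    z_beta = NormalDist().inv_cdf(power)           # quantile for the target power
    p_bar = (p_control + p_case) / 2               # pooled proportion under H0
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p_control * (1 - p_control) + p_case * (1 - p_case))
    ) ** 2
    return ceil(numerator / (p_case - p_control) ** 2)


# Hypothetical exposure proportions (assumed, not taken from the study).
print(n_per_group(p_control=0.25, p_case=0.40))
```

Under this formula, raising the power or narrowing the assumed difference between the groups increases the required sample size.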
Information on sociodemographic characteristics, age at menarche, alcohol consumption, smoking, history of proliferative benign breast disease, feeding mode, oral contraceptive use, reproductive history, and family history of breast cancer was obtained through a structured questionnaire. Breast parameters were measured in both groups, and breast volume was calculated using a formula based on these linear measurements[24].
Exposure and covariate determination
The dependent variable for this study was cancer status (a dichotomous variable; 1: breast cancer diagnosis, 0: no breast cancer diagnosis). The only exposure factor was breast volume, a continuous variable calculated from linear measurements using the BREAST-V formula[24]. Other variables, namely age, BMI, age at menarche, age at first pregnancy, number of pregnancies, feeding mode, history of proliferative benign breast disease, history of oral contraceptive use, smoking, alcohol consumption, history of hyperthyroidism, and family history of breast cancer, were assessed at baseline with the questionnaire.
Collection of linear breast measurement data
All data in this study were collected during in-person interviews after consent was obtained from all study participants; measurements were taken for both breasts. The measurer was blinded to group assignment (single-blind). The anatomical distances included in the BREAST-V formula were the sternal notch-to-nipple distance, the fold-to-nipple distance, and the fold-to-fold projection distance, all measured with the participant in a standing position.
Statistical analysis
This study was a case-control study. All data were verified and entered by two people, and statistical analysis was performed using SPSS 24.0. Normally distributed data were described as mean ± standard deviation (M ± SD) and compared between the two groups with the independent-samples t test; data with a skewed distribution were described as median (P25, P75) and compared between the two groups with the Mann-Whitney U test. Categorical (count) data were described as rates or proportions and compared between the two groups with the chi-square test. All statistical tests were two-tailed, and P < 0.05 was considered statistically significant.
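The descriptive conventions and the chi-square comparison described above can be sketched in a few lines. The analysis itself was run in SPSS 24.0, so this is an illustration, not the authors' code. The function names and the toy numbers are assumptions, and the chi-square shown is the uncorrected Pearson statistic for a 2 × 2 table.

```python
from math import sqrt
from statistics import NormalDist, mean, median, quantiles, stdev


def describe_normal(values):
    """Summarize approximately normal data as mean ± SD."""
    return f"{mean(values):.1f} ± {stdev(values):.1f}"


def describe_skewed(values):
    """Summarize skewed data as median (P25, P75)."""
    p25, _, p75 = quantiles(values, n=4, method="inclusive")
    return f"{median(values):.1f} ({p25:.1f}, {p75:.1f})"


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns the statistic and its two-tailed P value. With 1 degree of
    freedom the statistic is the square of a standard normal deviate,
    so P = 2 * (1 - Phi(sqrt(chi2))).
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = 2 * (1 - NormalDist().cdf(sqrt(chi2)))
    return chi2, p_value


# Toy values for illustration only (not study data).
print(describe_normal([1, 2, 3, 4, 5]))
print(describe_skewed([1, 2, 3, 4, 5]))
chi2, p = chi_square_2x2(20, 10, 10, 20)
print(f"chi2 = {chi2:.2f}, significant at 0.05: {p < 0.05}")
```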
Propensity score matching
Propensity score matching (PSM) was performed to control for potential confounders, with a match tolerance of 0.005. The propensity scores were estimated by logistic regression using age, BMI, age at menarche, age at first pregnancy, number of pregnancies, feeding mode, history of proliferative benign breast disease, history of oral contraceptive use, smoking, alcohol consumption, history of hyperthyroidism, and family history of breast cancer. Cases and controls were matched on the propensity score by 1:1 nearest-neighbor matching, and the two matched groups were then regarded as independent groups. The baseline data were statistically analyzed before and after matching. To evaluate the effect of breast volume on the risk of breast cancer, binary logistic regression analysis was used before PSM, and conditional logistic regression analysis, implemented through the Cox regression procedure in SPSS 24.0, was used after matching. For the Cox procedure, a virtual survival time was recorded for each row: survival time was entered as the time variable, outcome as the status variable, and the remaining variables as covariates. By default, the case group was assigned a short survival time and the control group a long survival time. The odds ratio (OR) for breast cancer was calculated for the highest vs. lowest quartile of breast volume as the ratio between the observed prevalences and was expressed with a 95% confidence interval.
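The matching step described above can be illustrated with a minimal sketch of greedy 1:1 nearest-neighbor matching without replacement, using a caliper equal to the stated match tolerance of 0.005. It assumes the propensity scores have already been estimated. The function name, the processing order of cases, the tie-breaking rule, and the toy scores are all assumptions, and the SPSS matching procedure may differ in these details.

```python
def match_nearest_neighbor(case_scores, control_scores, caliper=0.005):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    Each case, in input order, takes the closest still-unmatched control.
    The pair is kept only if the propensity scores differ by at most
    `caliper`; otherwise the case stays unmatched.
    Returns a list of (case_index, control_index) pairs.
    """
    available = set(range(len(control_scores)))
    pairs = []
    for i, score in enumerate(case_scores):
        if not available:
            break  # no controls left to match
        # Closest remaining control; ties are broken by the lower index.
        j = min(available, key=lambda k: (abs(control_scores[k] - score), k))
        if abs(control_scores[j] - score) <= caliper:
            pairs.append((i, j))
            available.remove(j)
    return pairs


# Toy propensity scores for illustration only (not study data).
cases = [0.300, 0.500, 0.900]
controls = [0.502, 0.301, 0.303, 0.700]
print(match_nearest_neighbor(cases, controls))
```

In this toy example the third case has no control within the caliper and is dropped, which is why recruiting more controls than cases helps the matching succeed.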
Ethics statement
Written informed consent was obtained from all study participants and ethical approval was granted by the ethics committee of Nanfang Hospital.