2.1 Patients
This retrospective study was conducted in accordance with the Declaration of Helsinki and approved by the Ethics Committee of Aerospace Center Hospital, with the requirement for informed consent waived. Between June 2015 and November 2017, 138 patients with acute appendicitis who had undergone surgery and had clinical and laboratory data available, including inflammatory response markers and a pathological diagnosis, were initially included.
The inclusion criteria were as follows: 1) patients had histologically confirmed acute appendicitis, either UA or CA; 2) patients had data records including clinical manifestations (RIPASA scoring) and preoperative laboratory tests (whole-blood cell counts and inflammatory factors), as listed in Table 1 and Table 2; 3) patients agreed to participate in the study and provided signed informed consent; 4) the preoperative examination indicated no surgical contraindications; and 5) the age range was 18 to 81 years. The exclusion criteria were as follows: 1) patients did not meet the inclusion criteria; 2) patients had ileocecal neoplasms. Two patients with mucinous adenocarcinoma were excluded, leaving 136 patients enrolled in this study. Figure 1 depicts the patient selection process. The 136 patients (UA = 112; CA = 24) were randomly divided at a ratio of approximately 7:3 into a training set (n = 94, UA = 78, CA = 16) and a testing (validation) set (n = 42, UA = 34, CA = 8).
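The 7:3 split described above can be illustrated with a short sketch. The source does not state how the random split was performed. The Python code below assumes a class-stratified split with floor rounding, which happens to reproduce the reported group sizes. The function name `stratified_split`, the seed, and the rounding rule are all hypothetical choices, not the authors' procedure.

```python
import math
import random

def stratified_split(labels, train_fraction=0.7, seed=0):
    """Return (train_idx, test_idx), sampling each class separately.

    Assumption: the split is stratified by class and the per-class
    training count is rounded down. The source only says "randomly".
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        shuffled = indices[:]
        rng.shuffle(shuffled)
        n_train = math.floor(train_fraction * len(shuffled))
        train_idx.extend(shuffled[:n_train])
        test_idx.extend(shuffled[n_train:])
    return sorted(train_idx), sorted(test_idx)

# Group sizes taken from the text: 112 UA and 24 CA patients.
labels = ["UA"] * 112 + ["CA"] * 24
train_idx, test_idx = stratified_split(labels)

train_counts = {c: sum(labels[i] == c for i in train_idx) for c in ("UA", "CA")}
test_counts = {c: sum(labels[i] == c for i in test_idx) for c in ("UA", "CA")}
print(train_counts, test_counts)
# {'UA': 78, 'CA': 16} {'UA': 34, 'CA': 8}
```

Stratifying keeps the CA proportion nearly equal in both sets (about 17% to 19%), which matters when the minority class has only 24 cases.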
Basic information (age and sex), physical symptoms, clinicopathological data, and preoperative blood assay results, such as the inflammatory factors high-sensitivity C-reactive protein (hs-CRP) and procalcitonin (PCT) and the lymphocyte subpopulations, were retrospectively extracted from electronic medical records.
2.2 Histopathology
All patients underwent surgical treatment, and all surgical specimens were examined by two pathologists. The pathological types of the 136 cases of acute appendicitis were as follows: acute simple appendicitis (n = 9), acute purulent appendicitis (n = 103), acute gangrenous or perforated appendicitis (n = 24), and periappendiceal abscess (n = 0). The numbers of CA and UA cases were 24 and 112, respectively.
2.3 Feature selection and CA predictive modeling
In this study, the patients were divided into UA and CA groups according to histopathology. Univariate analysis was used to select, from the clinical and laboratory data, the features that differed significantly between the UA and CA groups. CA predictive models based on individual clinical and laboratory features and models combining these features were built separately. The CA prediction probability of each individual clinical or laboratory feature was estimated by univariate logistic regression analysis. To construct a predictive model based on combined features, the following three machine learning algorithms with high stability were investigated: logistic regression (LR), support vector machine (SVM) and random forest (RF). The models were trained and assessed using repeated ten-fold cross-validation in the training set, and their differentiation performance was evaluated in the testing set.
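The repeated ten-fold cross-validation scheme can be sketched as follows. This Python illustration covers only the resampling step and is not the authors' R code. The number of repeats (`n_repeats=5`), the class stratification, and the function name `repeated_stratified_kfold` are assumptions, since the source specifies none of them. The class sizes are those of the training set (78 UA, 16 CA).

```python
import random

def repeated_stratified_kfold(labels, n_splits=10, n_repeats=5, seed=0):
    """Yield (repeat, fold, train_idx, val_idx) for repeated stratified k-fold CV."""
    rng = random.Random(seed)
    for repeat in range(n_repeats):
        folds = [[] for _ in range(n_splits)]
        offset = 0
        for cls in sorted(set(labels)):
            members = [i for i, y in enumerate(labels) if y == cls]
            rng.shuffle(members)  # a fresh shuffle on every repeat
            # Deal members round-robin so each fold gets a similar class mix;
            # the offset keeps fold sizes balanced across classes.
            for pos, idx in enumerate(members):
                folds[(pos + offset) % n_splits].append(idx)
            offset += len(members)
        for fold, val_idx in enumerate(folds):
            train_idx = [i for f, other in enumerate(folds) if f != fold for i in other]
            yield repeat, fold, sorted(train_idx), sorted(val_idx)

# Training-set composition from the text: 78 UA (0) and 16 CA (1).
labels = [0] * 78 + [1] * 16
splits = list(repeated_stratified_kfold(labels))

for repeat, fold, train_idx, val_idx in splits[:2]:
    # A model (LR, SVM or RF) would be fitted on train_idx and scored on val_idx here.
    n_ca = sum(labels[i] for i in val_idx)
    print(repeat, fold, len(train_idx), len(val_idx), n_ca)
```

With only 16 CA cases, an unstratified ten-fold split could produce folds with no CA case at all, in which case a per-fold AUC would be undefined. That is the motivation for stratifying in this sketch.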
2.4 Validation of the prediction model
Univariate logistic regression analysis was used to assess the clinical and laboratory features in predicting CA. The diagnostic ability of the single-feature and combined models was studied with receiver operating characteristic (ROC) curve analysis. CA prediction performance was assessed using the area under the ROC curve (AUC), sensitivity, specificity and accuracy (ACC). In addition, a nomogram was plotted to visualize the predictions of the logistic regression model. Differences in AUC among the three machine learning models were tested for statistical significance. Decision curve analysis (DCA) was conducted to evaluate the clinical usefulness of the best preoperative prediction model by quantifying the net benefit at different threshold probabilities in the testing set (31).
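The performance measures named above can be computed as in the following sketch. It is written in Python for illustration only (the study used R), the toy labels and scores are invented, and the 0.5 cut-off is an assumption. The source does not report how the operating threshold was chosen.

```python
def auc(y_true, scores):
    """AUC as the probability that a positive case outscores a negative one.

    Ties count as half a win. This pairwise form is equivalent to the
    area under the empirical ROC curve.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def confusion_metrics(y_true, scores, threshold=0.5):
    """Sensitivity, specificity and accuracy at a fixed probability cut-off."""
    pairs = list(zip(y_true, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(pairs),
    }

# Invented toy data: 1 = CA, 0 = UA; scores are predicted CA probabilities.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
scores = [0.1, 0.3, 0.4, 0.6, 0.45, 0.7, 0.8, 0.9]

model_auc = auc(y_true, scores)
metrics = confusion_metrics(y_true, scores)
print(model_auc)  # 0.9375
print(metrics)    # {'sensitivity': 0.75, 'specificity': 0.75, 'accuracy': 0.75}
```

AUC is threshold-free, whereas sensitivity, specificity and ACC all depend on the chosen cut-off, so the two kinds of measure answer different questions about a model.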
2.5 Statistical analysis
Comparisons of variables between the training and testing sets, and between the UA and CA groups, were performed using the chi-square test, Fisher’s exact test, Kruskal-Wallis H-test, Student's t-test or Mann-Whitney U test, as appropriate. The clinical and laboratory characteristics were compared using the chi-square test or Fisher’s exact test for nominal variables, the Kruskal-Wallis H-test for ordinal variables and the Mann-Whitney U test for continuous variables with a non-normal distribution. Univariate logistic regression analysis was used to quantify the prediction performance of each individual clinical or laboratory feature. In addition, ROC curve analyses were performed to determine the AUC, ACC, sensitivity and specificity of each predictive model. The statistical difference in AUC between any two of the machine learning models was analyzed by DeLong’s test. DCA describes the clinical benefit of the predictive model as the net benefit: the proportion of true positives minus the proportion of false positives, the latter weighted by the odds of the selected threshold probability of risk.
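The net benefit underlying DCA can be written out as a brief sketch. It uses the standard definition, net benefit = TP/n - (FP/n) x pt/(1 - pt), where pt is the threshold probability and n is the total sample size. The Python code and toy data are illustrative only, and the function names are hypothetical. The study itself used the 'rmda' package in R.

```python
def net_benefit(y_true, probs, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, probs) if y == 1 and p >= threshold)
    fp = sum(1 for y, p in zip(y_true, probs) if y == 0 and p >= threshold)
    odds = threshold / (1.0 - threshold)  # weight given to each false positive
    return tp / n - (fp / n) * odds

def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: classify every patient as positive."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)

# Invented toy data: 1 = CA, 0 = UA; probs are predicted CA probabilities.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
probs = [0.1, 0.3, 0.4, 0.6, 0.45, 0.7, 0.8, 0.9]

nb_model = net_benefit(y_true, probs, 0.5)
nb_all = net_benefit_treat_all(y_true, 0.5)
print(nb_model, nb_all)  # 0.25 0.0

# A decision curve plots net benefit over a range of thresholds.
curve = [(t / 10, net_benefit(y_true, probs, t / 10)) for t in range(1, 10)]
```

A model is considered clinically useful at a given threshold when its net benefit exceeds both the treat-all strategy and the treat-none strategy, whose net benefit is zero by definition.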
Statistical analysis was conducted with R software (version 3.6.0, https://www.r-project.org). All reported statistical tests were two-sided, and the significance level was set at 0.05. The multivariate logistic regression and ROC analyses were performed with the ‘stats’, ‘glmnet’ and ‘pROC’ packages. The nomogram and DCA diagrams were constructed using the ‘rms’ and ‘rmda’ packages, respectively.