Multivariate machine learning models for prediction of postoperative intestinal obstruction in patients who underwent laparoscopic colorectal surgery: A retrospective observational study

Machine learning may predict postoperative intestinal obstruction (POI) in patients who underwent laparoscopic colorectal surgery for malignant lesions. We used five machine learning algorithms (logistic regression, decision tree, random forest, gradient boosting [Gbdt], and LightGBM [gbm]), with 28 explanatory variables, to predict POI. The total sample was randomly divided into training and testing groups at a ratio of 8:2. The models were evaluated by the area under the receiver operating characteristic curve (AUC), F1-measure, accuracy, recall, and mean squared error (MSE). A total of 637 patients were enrolled in this study, and 122 (19.15%) of them had POI. Gradient Boosting and gbm were the most accurate in the training group and the testing group, respectively. The f1_score of Gradient Boosting was the highest in the training group (f1_score = 0.710526), and the f1_score of gbm was the highest in the testing group (f1_score = 0.500000). In addition, the importance matrix of the Gbdt model showed that the five most important variables for postoperative intestinal obstruction were time to first passage of flatus or stool, cumulative dose of rescue opioids used through postoperative day 3 (POD 3), duration of surgery, height, and weight.

Randomized clinical trials have shown that the incidence of intestinal obstruction after colorectal cancer surgery is about 2.28-5.48% [1,2]. POI may increase the length and cost of hospital stay and may result in serious complications [3]. For example, aspiration pneumonia may lead to 2.4-22% of early postoperative deaths [4].
Recently, laparoscopic methods have been shown to be associated with improved postoperative digestive recovery [5,6].
However, some patients who undergo laparoscopic surgery still develop dynamic intestinal obstruction. In recent years, with the increase in environmental risk factors and the improvement of diagnostic techniques, the incidence of many diseases has shown an upward trend [7,8]. "Machine learning" or "artificial intelligence" is a hotspot in predictive research. Shanmuga et al. used a deep learning method to analyze colorectal polyps in images; evaluated against existing algorithms, the method achieved a tumor detection accuracy of up to 95% in colon images [9]. Similarly, Xu et al. applied machine learning methods such as the support vector machine, which can effectively classify colon cancer patients with different prognoses [10]. In this study, we explored the application of machine learning methods to improve the prediction and classification of postoperative intestinal obstruction in patients who underwent laparoscopic colorectal cancer surgery.

Methods
To investigate the predictive performance of machine learning for POI in patients who underwent laparoscopic colorectal cancer surgery, we conducted a retrospective observational study at Nanjing Medical University Affiliated Suzhou Hospital.

Participants
We performed a retrospective analysis of consecutive patients aged 18 years or older who underwent laparoscopic colorectal surgery for malignant lesions from April 2016 to January 2017. Patients were excluded if they underwent surgery other than laparoscopic colorectal surgery, conversion to open surgery, robot-assisted laparoscopic colorectal surgery, or parenteral nutrition surgery. POI was defined as delayed passage of flatus and/or stool, or intolerance of oral intake, on the third day after surgery, confirmed by small and/or large intestinal dilatation on abdominal X-ray films.

Anesthesia and operation management
The surgeries were performed by six different surgeons, each with experience of more than 200 laparoscopic colorectal procedures. Laparoscopic surgery included single-incision and conventional laparoscopic colorectal surgery. Anesthesia techniques were similar in all cases. No thoracic epidural analgesia was used. Intravenous midazolam, sufentanil, propofol, and rocuronium were used for induction of anesthesia, providing neuromuscular blockade for endotracheal intubation.
Anesthesia was maintained with propofol, remifentanil, and sevoflurane. Opioids were routinely administered for postoperative pain 30 minutes prior to the end of surgery.

Variables collection
Data were collected on patient demographics, social habits, comorbidities, intraoperative variables (duration of surgery and anesthesia, type of surgery and anesthesia, quantity of intravenous infusion, estimated blood loss), and postoperative analgesia (maximum pain score on the numeric rating scale [NRS] and cumulative dose of opioid used on the third day after surgery). All opioid administrations were converted to equivalent doses of intravenous morphine. The age-adjusted Charlson Comorbidity Index was used to assess comorbidity. The original Charlson Comorbidity Index was introduced in 1987 and is calculated by summing the weighted scores of 19 medical conditions. Because age was found to be an important factor in overall survival, the patient's age was subsequently added as a correction variable in the final Charlson index score. This age-adjusted modification is reported to predict hospital mortality and adverse events better than other versions of the Charlson Comorbidity Index. Events such as postoperative wound dehiscence were also recorded.
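As a minimal sketch of how an age-adjusted Charlson score is assembled, the following Python code sums condition weights and adds age points. The dictionary lists only a small subset of the 19 conditions, and the age rule (one point per decade from age 50, capped at 4) is an assumed, commonly used convention; neither the names nor the cap are taken from this study.

```python
# Subset of Charlson condition weights, for illustration only
# (the full index covers 19 conditions).
CONDITION_WEIGHTS = {
    "myocardial_infarction": 1,
    "congestive_heart_failure": 1,
    "diabetes_with_end_organ_damage": 2,
    "moderate_or_severe_renal_disease": 2,
    "metastatic_solid_tumor": 6,
}


def age_points(age):
    """Assumed convention: one point per decade from age 50, capped at 4."""
    if age < 50:
        return 0
    return min((age - 40) // 10, 4)


def age_adjusted_cci(conditions, age):
    """Sum of the weighted comorbidity scores plus the age correction."""
    comorbidity_score = sum(CONDITION_WEIGHTS[name] for name in conditions)
    return comorbidity_score + age_points(age)


# Hypothetical patient: two comorbidities (weights 1 and 2), aged 67 (2 points).
score = age_adjusted_cci(
    ["congestive_heart_failure", "diabetes_with_end_organ_damage"], age=67
)
```

For this hypothetical patient the comorbidity part contributes 3 points and the age part 2 points.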

Machine learning algorithm
Logistic regression is one of the most commonly used and most classical classification methods in machine learning. Although it is called a regression model, it deals with classification problems. This is because it is essentially a linear model followed by the sigmoid mapping function, which maps the continuous output of the linear model to a probability between 0 and 1; the probability is then thresholded to give a discrete class.
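The mapping from a linear score to a class label can be sketched in a few lines of Python. The coefficients, the two features, and the 0.5 threshold below are illustrative assumptions and are not fitted to the study data.

```python
import math


def sigmoid(z):
    """Map any real number to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, features):
    """Linear score w.x + b passed through the sigmoid."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(score)


def predict_label(weights, bias, features, threshold=0.5):
    """Threshold the probability into a discrete class (1 = POI, 0 = no POI)."""
    return 1 if predict_proba(weights, bias, features) >= threshold else 0


# Hypothetical coefficients for two standardized features.
weights = [0.8, 0.5]
bias = -1.0
patient = [1.5, 0.4]

probability = predict_proba(weights, bias, patient)  # linear score is 0.4
label = predict_label(weights, bias, patient)
```

Lowering the threshold trades precision for recall, which matters when the positive class (here, POI) is the minority.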
Decision tree learning is a method for approximating a discrete-valued target function, in which the learned function is represented as a decision tree. A decision tree classifies an instance by sorting it down the tree from the root node to a leaf node. The leaf node gives the class to which the instance belongs. Each node of the tree specifies a test of one attribute of the instance, and each branch descending from the node corresponds to a possible value of that attribute. Classification of an instance starts at the root of the tree, tests the attribute specified by that node, and then moves down the branch corresponding to the attribute value of the given instance. This process is then repeated on the subtree rooted at the new node.
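The root-to-leaf procedure described above can be sketched as follows. The tree is hand-written rather than learned, and the attribute names and thresholds are hypothetical.

```python
def classify(node, instance):
    """Walk from the root to a leaf, applying one attribute test per node."""
    while "label" not in node:              # internal node: test an attribute
        value = instance[node["attribute"]]
        branch = "low" if value <= node["threshold"] else "high"
        node = node[branch]                 # the chosen subtree becomes the new root
    return node["label"]                    # leaf node: the predicted class


# Hypothetical tree; attributes and cut-offs are illustrative only.
tree = {
    "attribute": "surgery_minutes",
    "threshold": 180,
    "low": {"label": "no POI"},
    "high": {
        "attribute": "opioid_mg",
        "threshold": 30,
        "low": {"label": "no POI"},
        "high": {"label": "POI"},
    },
}

result = classify(tree, {"surgery_minutes": 240, "opioid_mg": 45})
```

A learned tree has the same shape; training only decides which attribute and threshold each node tests.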
Random forest, as the name implies, builds a forest in a randomized way. The forest contains many decision trees, and each tree is built independently of the others. Once the forest has been built, each decision tree classifies a new input sample separately (for classification tasks), and the sample is assigned to the class chosen by the largest number of trees.
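The voting step can be sketched as below. The three "trees" are hypothetical one-rule classifiers written by hand, standing in for trees that would normally be trained on random subsets of the data and features.

```python
from collections import Counter


def forest_predict(trees, instance):
    """Each tree votes independently; the most frequent class wins."""
    votes = [tree(instance) for tree in trees]
    return Counter(votes).most_common(1)[0][0]


# Three hypothetical single-rule trees (1 = POI, 0 = no POI).
trees = [
    lambda p: 1 if p["surgery_minutes"] > 180 else 0,
    lambda p: 1 if p["opioid_mg"] > 30 else 0,
    lambda p: 1 if p["weight_kg"] > 80 else 0,
]

patient = {"surgery_minutes": 200, "opioid_mg": 40, "weight_kg": 65}
prediction = forest_predict(trees, patient)  # votes are 1, 1, 0
```

An odd number of trees avoids ties in a two-class problem.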
Gbdt (gradient boosting decision tree) is an iterative algorithm consisting of multiple decision trees, in which each new tree is fitted to the errors left by the trees before it. The outputs of all trees are added together to give the final prediction.
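A minimal sketch of this additive scheme is given below. It assumes a single feature, one-split trees (stumps), and squared-error loss, so each tree is fitted to plain residuals; classification implementations instead fit gradients of a log-loss. The function names and the toy data are hypothetical.

```python
def fit_stump(xs, residuals):
    """Best single-threshold split on a 1-D feature under squared error."""
    best = None
    for threshold in sorted(set(xs))[:-1]:   # skip the max so both sides are non-empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def fit_gbdt(xs, ys, n_trees=20, learning_rate=0.5):
    """Fit each new stump to the residuals of the current ensemble."""
    base = sum(ys) / len(ys)
    trees = []
    predictions = [base] * len(xs)
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, predictions)]
        tree = fit_stump(xs, residuals)
        trees.append(tree)
        predictions = [p + learning_rate * tree(x)
                       for p, x in zip(predictions, xs)]
    # Final answer: base value plus the (scaled) sum of all tree outputs.
    return lambda x: base + learning_rate * sum(t(x) for t in trees)


# Toy data: the outcome switches from 0 to 1 between x = 3 and x = 4.
xs = [1, 2, 3, 4, 5, 6]
ys = [0, 0, 0, 1, 1, 1]
model = fit_gbdt(xs, ys)
```

On this toy data every round halves the remaining residual, so the summed prediction approaches 0 on the left side of the split and 1 on the right.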
LightGBM (gbm) is another implementation of Gbdt that adds two new strategies. The first, Gradient-based One-Side Sampling (GOSS), rests on the observation that, although Gbdt does not assign weights to data instances, different instances have different gradients. By the definition of information gain, instances with larger gradients contribute more to the information gain. Therefore, when downsampling, instances with large gradients (above a preset threshold or within the top percentiles) are kept, and instances with small gradients are randomly dropped. The second, Exclusive Feature Bundling (EFB), rests on the observation that many features are almost mutually exclusive, especially in sparse feature spaces, so such features can be bundled together. The bundling problem is reduced to a graph coloring problem, and an approximate solution is obtained with a greedy algorithm.
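The GOSS sampling step can be sketched in plain Python as follows. This is a simplified illustration and not LightGBM's internal code; the function name, the rates, and the gradient values are assumptions. The sampled small-gradient instances are up-weighted by (1 - top_rate) / other_rate, the factor used in the GOSS formulation to compensate for the instances that were dropped.

```python
import random


def goss_sample(gradients, top_rate=0.2, other_rate=0.1, seed=0):
    """Return (indices, weights) chosen by Gradient-based One-Side Sampling."""
    n = len(gradients)
    n_top = int(round(top_rate * n))
    n_other = int(round(other_rate * n))
    # Rank instances by absolute gradient, largest first.
    order = sorted(range(n), key=lambda i: abs(gradients[i]), reverse=True)
    top = order[:n_top]                              # large gradients: always kept
    rng = random.Random(seed)
    sampled = rng.sample(order[n_top:], n_other)     # random subset of the rest
    # Up-weight the sampled small-gradient instances.
    amplify = (1.0 - top_rate) / other_rate
    weights = {i: 1.0 for i in top}
    weights.update({i: amplify for i in sampled})
    return top + sampled, weights


# Hypothetical per-instance gradients for ten training samples.
gradients = [0.9, -0.05, 0.02, -0.8, 0.1, 0.03, -0.01, 0.04, 0.07, -0.06]
indices, weights = goss_sample(gradients)
```

With these rates, the two largest-gradient instances (indices 0 and 3) are always retained and one of the remaining eight is drawn at random with weight 8.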

Statistical analysis
The Python programming language (Python Software Foundation, version 3.6) was used for our analysis. The following packages for machine learning were used: scikit-learn (https://github.com/scikit-learn/scikit-learn) and
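The 8:2 random split and the evaluation metrics used in this study (accuracy, recall, F1, MSE, and AUC) can be illustrated with a standard-library sketch. The helper names, the random seed, and the toy labels and scores are assumptions; this is not the authors' analysis code, and MSE is computed here on hard 0/1 predictions as one possible reading.

```python
import random


def split_indices(n, test_fraction=0.2, seed=42):
    """Randomly split n sample indices into training and testing groups."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_test = round(n * test_fraction)
    return indices[n_test:], indices[:n_test]


def classification_metrics(y_true, y_pred):
    """Accuracy, recall, F1 and MSE for binary labels (1 = POI)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "recall": recall,
        "f1": f1,
        "mse": sum((t - p) ** 2 for t, p in pairs) / len(pairs),
    }


def auc(y_true, scores):
    """Chance that a random positive outranks a random negative (ties = 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


train_idx, test_idx = split_indices(637)   # 637 patients, ratio 8:2

# Toy labels, hard predictions, and predicted probabilities.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
y_score = [0.9, 0.2, 0.4, 0.8, 0.3, 0.6, 0.7, 0.1]
metrics = classification_metrics(y_true, y_pred)
area = auc(y_true, y_score)
```

With 637 patients, this split yields 510 training and 127 testing samples.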

Results
A total of 637 patients were included in the study, and 122 (19.15%) cases had POI. The importance matrix of the Gbdt algorithm model is shown in Fig. 1. The comparison of the models constructed by the five machine learning algorithms in the training set is shown in Table 1 and Fig. 1. The comparison of the models constructed by the five machine learning algorithms in the test set is shown in Table 1, Table 3, Fig. 4 and Table 4.

Discussion
Although the promotion of laparoscopic technique has significantly reduced the  GBDT integrates several models of ordinary performance (usually decision trees of fixed depth) into a single model with better performance. It handles mixed data naturally and has strong predictive ability. LightGBM is a tree-based gradient boosting framework that supports efficient parallel training. The results of this study also showed that Gradient Boosting and gbm performed better than the other algorithms.
POI is still a common complication after colorectal surgery, and its pathophysiology remains unclear [12]. A cohort study including nearly 28,000 patients reported a POI incidence of 12.7% after colon surgery. The incidence of postoperative intestinal obstruction was higher in obese patients [13]. In a small case series, Parker et al. reported that patients with postoperative intestinal obstruction were mostly obese [14].
The reason may be that the intestines of obese patients contain more mesenteric fat, which makes the operation more difficult and may lead to intestinal damage during surgery. This is consistent with our findings.
General anesthesia, especially the use of opioids, has been shown to affect bowel movements [15]. Opioids such as morphine bind to the gastrointestinal µ receptor and inhibit gastrointestinal motility. Boelens et al. showed that the incidence of POI in patients given opioids after colorectal cancer surgery was significantly higher than in patients who did not use opioids after the operation [16]. In recent years, good therapeutic effect has been obtained by preoperative administration of the µ receptor antagonist alvimopan for the prevention and treatment of POI [17]. In addition, Pillai et al. conducted a randomized controlled study of perioperative fluid management guided by esophageal Doppler ultrasound, and the results showed that optimizing perioperative fluid intake was associated with faster recovery of intestinal function and a lower incidence of other complications [18]. Our results also suggested that opioid use and fluid infusion may play an important role in the occurrence and development of POI.
In addition, a recent retrospective study involving 11,397 patients who underwent open or laparoscopic colectomy showed that a high age-adjusted Charlson Comorbidity Index score was an independent predictor of prolonged POI [19].
Long operating time is a risk factor for POI in colorectal surgery [20]. A long operation may reflect technical difficulties and/or an increased inflammatory response, either of which can directly induce POI [21,22]. This is similar to our results.
This study has several potential limitations. A major limitation is the retrospective method of data collection. Moreover, in addition to the dose of opioid used in the first 3 days after surgery, doses given at other times may also contribute to POI. Furthermore, although machine learning methods allow us to quantify the weight of each variable for postoperative intestinal obstruction, many of these variables cannot be modified. For such variables, we can only take precautions and pay closer attention according to the risk they carry for postoperative intestinal obstruction. In addition, this study performed only internal validation and no external validation. Therefore, in further studies we need to collect more perioperative data based on etiology and explore a more efficient predictive model.

Machine learning algorithm for prediction of postoperative ileus in the testing group