*STUDY DESIGN AND DATA SOURCES:*

Predictive algorithms have typically been developed from large databases, preferably big data sources such as multi-institutional clinical registries 36,37,38. De-identified data from large national databases are usually exempt from review by an institutional review board.

*GUIDELINES:*

Model development and reporting follow the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) statement and the JMIR Guidelines for Developing and Reporting Machine Learning Predictive Models in Biomedical Research 39,40.

*SPLIT-SAMPLE APPROACH: TRAINING VERSUS VALIDATION SET:*

Following the split proportions used in previous literature 41, the data are randomly divided into a training dataset (70%) and a validation dataset (30%). Predictive models for various postoperative outcomes, such as AEs and extended length of hospital stay, are fitted on the training dataset using (i) a generalized linear model with a logit link function (logistic regression) and (ii) the least absolute shrinkage and selection operator (LASSO) regularization method. The performance of the models developed on the training dataset is then evaluated on the validation dataset.
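The random 70/30 split described above can be sketched as follows. This is a minimal illustration using only the Python standard library; the function name `split_train_validation`, the seed value, and the list of patient identifiers are hypothetical and not taken from the cited studies.

```python
import random


def split_train_validation(records, train_fraction=0.7, seed=42):
    """Randomly split records into (training, validation) lists.

    The seed is an illustrative assumption; fixing it makes the split
    reproducible across runs.
    """
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    shuffled = list(records)               # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # seeded shuffle for reproducibility
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical patient identifiers standing in for registry records.
patient_ids = list(range(1000))
train_ids, validation_ids = split_train_validation(patient_ids)
print(len(train_ids), len(validation_ids))
```

Splitting on patient identifiers rather than on individual rows keeps all records of one patient in the same subset, which is one way to avoid leakage between the training and validation datasets.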

*PATIENT SELECTION AND COHORT FEATURES:*

For each eligible patient, the variables that are potential predictors of postoperative outcome following brain and spine surgery are extracted from the included dataset. Missing data in large administrative databases can be imputed using multiple imputation by chained equations 32.

*APPROACHES TO BUILD THE PREDICTIVE MODEL:*

The first approach to building the predictive model is logistic regression, with predictors chosen by a forward and backward stepwise selection procedure based on the Akaike Information Criterion 41. Natural cubic splines are used to assess non-linearity in the continuous variables 42. The second approach is a penalized regression model that uses the LASSO method to obtain shrinkage estimators of the regression coefficients 43. Because LASSO imposes a constraint on the model parameters, it shrinks the regression coefficients of some variables exactly to zero. The tuning parameter for each predictive model is chosen by 10-fold cross-validation 34. The importance of each included variable is evaluated by the absolute value of its z-statistic in each model.
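For reference, the two criteria named above are commonly written as follows. The notation is an assumption introduced here for illustration: $n$ patients, binary outcome $y_i$, predictor vector $x_i$, intercept $\beta_0$, coefficient vector $\beta$ with $p$ entries, maximized likelihood $\hat{L}$ with $k$ estimated parameters, and tuning parameter $\lambda \ge 0$ (the quantity selected by cross-validation).

```latex
% Akaike Information Criterion used for stepwise selection
\mathrm{AIC} = 2k - 2\ln \hat{L}

% LASSO-penalized logistic regression
\pi_i = \frac{1}{1 + \exp\!\left(-(\beta_0 + x_i^{\top}\beta)\right)}, \qquad
(\hat{\beta}_0, \hat{\beta}) = \arg\min_{\beta_0,\,\beta}
\left\{ -\frac{1}{n}\sum_{i=1}^{n}
\bigl[\, y_i \ln \pi_i + (1 - y_i)\ln(1 - \pi_i) \bigr]
+ \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert \right\}
```

Setting $\lambda = 0$ recovers ordinary logistic regression, while larger values of $\lambda$ drive more coefficients exactly to zero.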

*PERFORMANCE EVALUATION OF THE PREDICTIVE MODEL:*

The discrimination of the predictive model is assessed using the area under the receiver operating characteristic (ROC) curve (AUC) on both the training and validation datasets. Calibration is assessed by plotting the observed incidence of each postoperative outcome against the model-predicted probability. For a well-calibrated model, the points lie close to the 45° diagonal line, where predicted and observed values are equal. Overall model performance is further assessed using the Brier score, which is the mean squared difference between the predicted probability and the observed outcome. The Brier score ranges from 0 to 1, and a value of 0 indicates perfect prediction.
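The three measures above can be sketched in plain Python as follows. This is an illustrative sketch, not the implementation used in the cited studies: the function names are hypothetical, outcomes are assumed to be coded 0/1, the AUC is computed as the proportion of event/non-event pairs that the model ranks correctly (ties count one half), and the calibration points come from equal-sized groups ordered by predicted probability.

```python
def auc(y_true, y_prob):
    """AUC as the fraction of (event, non-event) pairs ranked correctly."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one event and one non-event")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def calibration_points(y_true, y_prob, n_groups=10):
    """(mean predicted probability, observed incidence) for each group.

    Patients are sorted by predicted probability and cut into n_groups
    groups of near-equal size; plotting the pairs against the 45-degree
    line gives a calibration plot.
    """
    pairs = sorted(zip(y_prob, y_true))
    n = len(pairs)
    points = []
    for g in range(n_groups):
        chunk = pairs[g * n // n_groups:(g + 1) * n // n_groups]
        if chunk:  # skip empty groups when n < n_groups
            mean_pred = sum(p for p, _ in chunk) / len(chunk)
            observed = sum(y for _, y in chunk) / len(chunk)
            points.append((mean_pred, observed))
    return points


# Toy data for illustration only.
outcomes = [0, 0, 1, 1]
predicted = [0.1, 0.4, 0.35, 0.8]
print(auc(outcomes, predicted))
print(brier_score(outcomes, predicted))
print(calibration_points(outcomes, predicted, n_groups=2))
```

The pairwise AUC is quadratic in the number of patients, so for registry-scale data a rank-based or library implementation would be the practical choice.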

A simulation study is also important for evaluating the influence of sample size on the performance of prediction models for postoperative outcomes. Random subsets of varying sample size are drawn from the data, and for each subset the logistic regression model for overall complications is refitted and its AUC is calculated. Finally, decision curve analysis is performed to determine the best model for clinical management by comparing net benefit over a range of probability thresholds.
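The net benefit used in decision curve analysis can be sketched as below. The sketch assumes the standard definition, in which true positives are credited and false positives are penalized by the odds of the probability threshold. The function names and toy data are hypothetical, and a patient is assumed to be classified as high risk when the predicted probability is at or above the threshold.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating patients with predicted risk >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: every patient is treated as high risk."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Toy data for illustration only; "treat none" always has net benefit 0.
outcomes = [0, 0, 1, 1]
predicted = [0.1, 0.4, 0.35, 0.8]
for pt in (0.1, 0.3, 0.5):
    print(pt, net_benefit(outcomes, predicted, pt),
          net_benefit_treat_all(outcomes, pt))
```

Plotting these values across the threshold range for each candidate model, together with the treat-all and treat-none strategies, gives the decision curve; the model with the highest net benefit at clinically relevant thresholds is preferred.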