Discordant American Society of Anesthesiologists Physical Status Scoring between Anesthesiologists and Surgeons Is Correlated With Adverse Patient Outcomes: A Retrospective Cohort Study of 46284 Elective Surgical Patients

The American Society of Anesthesiologists Physical Status Classification (ASA) score is used for communication of patient health status, risk scoring, benchmarking and financial claims. Prior studies using hypothetical scenarios have shown poor concordance of ASA scoring among healthcare providers. However, there is a paucity of concordance studies using real-world data, as well as studies of clinical factors or patient outcomes associated with discordant scoring. This study aims to assess real-world ASA score concordance between surgeons and anesthesiologists, the factors associated with discordance, and its impact on patient outcomes. This retrospective cohort study was conducted in a tertiary academic medical center on 46284 consecutive patients undergoing elective surgery between January 2017 and December 2019. ASA scores entered by surgeons and anesthesiologists, patient demographics, and postoperative outcomes were collected. We assessed the concordance of preoperative ASA scoring between surgeons and anesthesiologists, clinical factors associated with score discordance, the impact of score discordance on clinically important outcomes, and the discriminative ability of the two scores for 30-day mortality, 1-year mortality, and intensive care unit (ICU) admission. Statistical tests used included Cohen’s weighted κ, the chi-square test, the t-test, unadjusted odds ratios and logistic regression models.


Background
The American Society of Anesthesiologists physical status classification (ASA) score is a widely utilized grading system first introduced in 1941 to assess and communicate the preoperative health of patients undergoing anesthesia. [1] It was revised in 1961, [2] and modified in 2014 to include clinical examples for each ASA score with the aim of improving inter-rater reliability or concordance. The modern ASA score consists of six categories ranging from Class 1 (describing a healthy patient) to Class 6 (referring to the brain-dead organ donor). [3] ASA scoring has significance both clinically and from a health services perspective. While ASA scoring alone is not intended for the prediction of perioperative risks, [3] it has been shown to be independently predictive of perioperative morbidity and mortality [4] and is included as part of several perioperative risk assessment tools that are widely used by surgeons and anesthesiologists. These tools include the surgeon-authored National Surgical Quality Improvement Program risk calculator [5] and Gupta Myocardial Infarction or Cardiac Arrest calculator, [6] as well as the anesthesiologist-authored Surgical Assessment Risk Tool [7] and Combined Assessment of Risk Encountered in Surgery. [8,9] ASA scores are also frequently reported in healthcare benchmarking exercises and payer billing documentation.
Any significant discordance in ASA scoring between healthcare providers is therefore concerning and may subject patients to contradictory risk counseling and inappropriate perioperative plans. At a health system level, discordant ASA scoring may undermine efforts for quality assurance, [10] allocation of critical care resources and risk-based remuneration for health outcomes, and may result in potential financial costs from over-scoring. [11] Multiple studies have reported moderate to poor concordance of the ASA score among various clinicians. [12][13][14][15][16][17] In particular, one study observed a significant disagreement in ASA scoring between anesthesiologists and surgeons when presented with hypothetical patient scenarios, and found that surgeons consistently assigned lower scores. [18]
However, there is a paucity of real-world data on this question as most real-patient studies of ASA concordance to date have been conducted among anesthesiologists [12][13][14][15][16][17] or were restricted to specific patient cohorts. [19,20] This is an important evidentiary gap as both specialties jointly manage patients undergoing surgeries. Furthermore, the association between discordant ASA scores and adverse patient outcomes has not been comprehensively studied previously.
To fill these knowledge gaps, our study aimed to examine the concordance of ASA scoring between surgeons scheduling patients for surgery and anesthesiologists conducting the outpatient preoperative evaluation. We further examined the clinical and demographic factors associated with discordant scoring and whether discordant scores were associated with adverse postoperative outcomes. Finally, we compared the discriminative ability of ASA scores assigned by surgeons and anesthesiologists in the prediction of postoperative outcomes.

Study Design and Data Sources
This was a single-center retrospective cohort study conducted in Singapore General Hospital, the largest tertiary academic medical center in Singapore. The local Institutional Review Board (CIRB Reference number 2020/2801) granted a waiver of consent because the study used anonymized, routinely collected clinical data and required no patient interaction. The data analysis and statistical plan was written and filed with the Institutional Review Board before the data were accessed.
Our study cohort was extracted from the Perioperative and Anesthesia Subject Area, a curated electronic medical records database within our institution's enterprise data warehouse (SingHealth-IHiS Electronic Health Intelligence System) which contains the records of all operative procedures performed since 2015. The system integrates patient information such as patient demographics, laboratory results, comorbidities and postoperative outcomes from multiple healthcare transactional systems, such as the hospital's clinical information system (Sunrise Clinical Manager, Allscripts, Illinois, United States of America) and other administration and ancillary electronic systems. Mortality data on the system were synchronized with the National Electronic Health Records, including data from the National Registry of Births and Deaths, ensuring a near-complete mortality data follow-up.
In our institution, the ASA score is assigned by the surgeon on a standardized electronic admission form during the surgery listing process. Patients are then typically seen in the anesthesia preoperative clinic within a month of the surgery listing. Information on patient demographics, anthropometric parameters, preoperative comorbidities, and ASA score is routinely recorded by the attending anesthesiologist as part of structured clinical notes during the pre-operative assessment, and is included within the database. The 2014 ASA scoring definitions, along with their full published examples, are available for reference in the clinic. While the anesthesiologist can potentially access the surgeon's ASA score, the anesthesiologist's score is usually independently assigned in our center. There are no financial incentives for assigning higher ASA scores for either anesthesiologists or surgeons within our local healthcare system.

Participant Cohort and Variables
We included all patients aged 18 years and above undergoing elective surgery under general or regional anesthesia or monitored anesthesia care between January 2017 and December 2019. Patients who underwent cardiac surgery, transplant surgery, or surgery for burns injuries were excluded. Patients planned for elective cardiac surgery in our center undergo preoperative anesthesia screening by the surgeon, who fills in the pre-anesthesia assessment form before the patient is assessed by the anesthesiologist, while patients requiring transplant surgery would usually have a standardized ASA score as there is organ failure necessitating the surgery. Patients with a missing ASA score by either the surgeon or anesthesiologist and patients assigned an ASA score of 5 or 6 by either the surgeon or anesthesiologist were also excluded (Fig. 1).
For each patient, we obtained preoperative data such as age, sex, race, surgical specialty, and comorbidities including ischemic heart disease, congestive heart failure, cerebrovascular accidents, diabetes mellitus requiring insulin, and hypertension. These comorbidities are assessed by the anesthesiologist as part of the Revised Cardiac Risk Index, which is routinely used in our institution. [21] The ASA scores assigned by both the anesthesiologist and surgeon were obtained, and the relevant clinical outcomes (death within 30 days, death within 1 year, ICU admission for > 24 hours) were determined.
Additional File 1 (Supplemental Table 1) compares the characteristics of 264 patients who were excluded from our study as they had no valid ASA scores. All 264 patients had missing anesthesiologist ASA scores and there was no statistically significant difference between patients in the final cohort and the excluded patients for demographic variables (age, sex, race) and clinical outcomes. Fewer of the excluded patients had anesthesiologist-assessed comorbidities, and the differences were statistically significant for some. Our interpretation is that patients with incomplete anesthesiologist ASA scores are more likely to have other areas incompletely assessed by the anesthesiologist. Overall, the number of such patients is small and not deemed to be a major source of bias.

Statistical Analysis
Analyses were performed using Python version 3.7.1 and R version 4.0.2 with their base utility functions. Additional packages used in R included the "questionr" package for multivariate logistic regression, "pROC" for receiver operating characteristic curve analyses, and "irr" for concordance analyses.

Assessment of Agreement between Surgeon and Anesthesiologist ASA Scores
Cross tabulation was performed for the anesthesiologist's ASA score against the surgeon's ASA score. Concordance between these two variables was determined using Cohen's weighted κ. The κ-statistic was interpreted in the manner of Landis and Koch as slight (0-0.2), fair (0.21-0.4), moderate (0.41-0.6), substantial (0.61-0.8) and almost perfect (0.81-1.0) agreement. [22] Our sample was drawn from a database that exhaustively documents all surgeries performed within the hospital, and we considered all sequential patients within the study time frame (January 2017 to December 2019). For comparison, the sample size required to detect moderate agreement (κ > 0.4) and exclude fair agreement (κ = 0.2) with a one-sided 95% confidence interval and 90% power is 186.
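The agreement statistic described above can be illustrated with a short Python sketch. This is not the study's code (the analysis used the R "irr" package); the function names, the toy ratings, and the choice between linear and quadratic weights are assumptions made for illustration, since the text does not state which weighting scheme was used.

```python
def weighted_kappa(rater_a, rater_b, categories, weighting="linear"):
    """Cohen's weighted kappa for two raters scoring the same patients.

    `weighting` is an assumption: "linear" or "quadratic" disagreement weights.
    Undefined (division by zero) if both raters use a single category only.
    """
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(rater_a)

    # Observed joint proportions (the cross tabulation, normalised).
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[index[a]][index[b]] += 1.0 / n

    # Marginal proportions for each rater.
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    power = 1 if weighting == "linear" else 2
    disagree_obs = 0.0
    disagree_exp = 0.0
    for i in range(k):
        for j in range(k):
            w = (abs(i - j) / (k - 1)) ** power  # 0 on the diagonal
            disagree_obs += w * observed[i][j]
            disagree_exp += w * row[i] * col[j]
    return 1.0 - disagree_obs / disagree_exp


def landis_koch(kappa):
    """Map kappa to the Landis and Koch bands quoted in the text."""
    if kappa < 0:
        return "poor"  # below the bands listed in the text
    if kappa <= 0.2:
        return "slight"
    if kappa <= 0.4:
        return "fair"
    if kappa <= 0.6:
        return "moderate"
    if kappa <= 0.8:
        return "substantial"
    return "almost perfect"


# Toy ratings (invented, not study data): ASA 1-4 for eight patients.
surgeon = [1, 1, 2, 2, 3, 3, 2, 1]
anesthesiologist = [1, 2, 2, 3, 3, 4, 2, 2]
toy_kappa = weighted_kappa(surgeon, anesthesiologist, [1, 2, 3, 4])
toy_label = landis_koch(toy_kappa)
```

Applied to the reported value, `landis_koch(0.53)` falls in the "moderate" band, which matches the interpretation given in the Results.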

Descriptive Statistics for Overall Cohort and Subgroup Analyses of Discordant ASA Scores
Descriptive statistics were calculated and expressed as counts and percentages for categorical data, and means with standard deviation for continuous data. The cohort was stratified into patients with concordant and discordant ASA scores. Univariate statistical analysis was performed using the chi-square test for categorical variables and the t-test for continuous variables. Subgroup analyses were also performed comparing patients where the surgeon assigned a lower ASA score against patients with a concordant ASA score, and likewise comparing patients where the anesthesiologist assigned a lower ASA score against patients with a concordant ASA score. In view of the multiple statistical comparisons, Bonferroni's correction was used and the p-value cut-off for statistical significance was determined to be p < 0.001.

Effect of Discordant ASA Scores on Clinical Outcomes
The discordance of ASA scores between surgeons and anesthesiologists was calculated and expressed in three forms: firstly, as a binary variable representing whether the ASA scores were discordant or not; secondly, as a ternary variable representing whether the ASA scores were concordant, the surgeon ASA score was lower, or the anesthesiologist ASA score was lower; and lastly, as the raw difference, with appropriate binning of categories with low counts. These variables, representing different ways of stratifying the degree of ASA score discordance, were separately entered as the sole predictive variable into logistic regression models. A separate model was fitted for each of the clinical outcomes of death within 30 days, death within 1 year, and ICU admission for > 24 hours. The unadjusted odds ratios and p-values were calculated for each stratum of ASA discordance, with the ASA concordant patients as the reference group.
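As a rough illustration of the three discordance encodings and of an unadjusted odds ratio against the concordant reference group, consider the Python sketch below. It is not the study's code: the function names, the sign convention for the difference, the binning limit `cap`, and the example counts are hypothetical. It relies on the fact that, when a single categorical variable is the only predictor, the odds ratio from logistic regression equals the odds ratio computed directly from the 2x2 table.

```python
def discordance_forms(surgeon_asa, anesth_asa, cap=2):
    """Return (binary, ternary, binned difference) for one patient.

    `cap` is a hypothetical binning limit: differences beyond +/-cap are
    merged into the outermost bin, mimicking the binning of sparse categories.
    """
    diff = anesth_asa - surgeon_asa  # positive: surgeon scored lower
    binary = diff != 0
    if diff == 0:
        ternary = "concordant"
    elif diff > 0:
        ternary = "surgeon_lower"
    else:
        ternary = "anesthesiologist_lower"
    binned = max(-cap, min(cap, diff))
    return binary, ternary, binned


def odds_ratio(events_stratum, n_stratum, events_ref, n_ref):
    """Unadjusted odds ratio of a stratum versus the reference group.

    Undefined if any cell (events or non-events) is zero.
    """
    odds_stratum = events_stratum / (n_stratum - events_stratum)
    odds_ref = events_ref / (n_ref - events_ref)
    return odds_stratum / odds_ref


# Invented counts, not study data: 10 events among 110 discordant patients
# versus 5 events among 105 concordant patients.
example_or = odds_ratio(10, 110, 5, 105)
```

An odds ratio above 1 for a stratum indicates higher odds of the outcome than in concordant patients; the p-values reported in the study would still come from the fitted regression models.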

Comparison of Surgeon and Anesthesiologist ASA Score Discriminative Ability
Logistic regression models using the ASA score as a sole predictor were fitted for surgeon and anesthesiologist ASA scores for the outcomes of 30-day mortality, 1-year mortality, and ICU admission > 24 hours. The receiver operating characteristic curve, area under the receiver operating characteristic curve (AUC), and its 95% confidence interval were used to determine each model's discriminative ability. DeLong's method was used to compare for statistically significant differences between the receiver operating characteristic curves of models based on anesthesiologist ASA scores versus those based on surgeon ASA scores. [23]

Results

Concordance of Surgeon and Anesthesiologist ASA scores
Our final study cohort comprised 46284 patients, of which 46.4% (21474/46284) were male and 53.6% (24810/46284) were female. The cross-tabulation of surgeon and anesthesiologist ASA scores for all cases is presented in Table 1. The weighted Cohen's κ for concordance between surgeon and anesthesiologist scores was 0.53, signifying moderate agreement. For all baseline patient characteristics, there were significant differences between patients with concordant and discordant scores, with the exception of male sex and the presence of raised creatinine. Discordant ASA scores overall were associated with a higher risk of all adverse outcomes: death at 30 days, death at 1 year, and ICU admission of more than 24 hours. When the discordant ASA scores were further stratified, we observed that a lower surgeon ASA score was associated with all negative outcomes. The magnitude of risk was greater the lower the surgeon ASA score was compared to the anesthesiologist ASA score. On the other hand, a lower anesthesiologist ASA score was only associated with ICU admission > 24 hours but not death at 30 days or 1 year. This is depicted in Fig. 2, with additional details included in Additional file 2 (Supplemental Table 2).

General Discussion
Our results provide real-world evidence of differences in ASA scoring between surgeons and anesthesiologists after the 2014 ASA score modification, which have previously been studied only in hypothetical scenarios [18,24] or between anesthesiologists and Internal Medicine providers. [25] Furthermore, we found that discordant ASA scores are associated with adverse outcomes, particularly when the surgeon-assigned ASA score is lower.
The observed moderate concordance (κ = 0.53) in our study is consistent with that reported in the retrospective cohort study by Sankar  The majority of the discordant scores were scored lower by surgeons, with the largest group comprising those assigned ASA 2 by the anesthesiologist but ASA 1 by the surgeon. We observed that patients with discordant ASA scores had a significantly higher proportion of comorbid clinical conditions (raised creatinine, diabetes mellitus on insulin, history of congestive heart failure, cerebrovascular accident, ischemic heart disease and smoking). This reflects the continuing subjectivity of the ASA scoring system despite the 2014 update, which was intended to improve concordance. The differences in recognition and perceived significance of comorbidities are likely to be a major contributing factor to the discordant ASA scores.
As the ASA score is a component of several major surgical risk scoring systems used by both surgeons and anesthesiologists in clinical care, discordant ASA scores can adversely impact the reliability of perioperative risk scoring and subsequent risk counseling. The ASA score is routinely used in deciding what pre-operative tests a patient requires at our institution and in other countries such as the United Kingdom. [28]
Overestimation of the ASA score would increase the number of investigations a patient has before surgery, incurring unnecessary financial costs to the patient and healthcare system, while an underestimation of the ASA score may compromise patient safety. At the health systems level, discordant scores can also affect the allocation of critical care resources and undermine the use of the score in healthcare reimbursement and quality assurance efforts. This may disadvantage healthcare institutions financially and in inter-institutional rankings depending on which score is being reported to the external agencies. Other studies have shown that the addition of examples to the ASA score and reinforcement of its use were required to improve reliability. [27,29] Standardization efforts are needed to improve the utility of ASA scores in clinical practice and for uses beyond the original intention of communicating patient healthcare status.
We also note that certain demographic factors, such as younger age and minority ethnicity, were associated with discordant ASA scores. We postulate that younger patients may be perceived to have lower severity of disease by some clinicians, who therefore grade them with a lower score. Minority race patients may face communication or cultural barriers in disease and symptom communication, and this may adversely affect accurate healthcare assessment. Ideally, demographic factors should not influence ASA scoring, which should be an objective reflection of patient physical status. This finding further supports the need for better standardization and education on ASA scoring.
Our study revealed that patients with discordant ASA scores had poorer clinical outcomes. With respect to mortality, stratified analyses of discordant ASA scores showed that patients whose surgeon assigned a lower score had a higher risk of 30-day and 1-year mortality. The lower the surgeon ASA score when compared to the anesthesiologist ASA score, the higher the risk was for 30-day and 1-year mortality. In contrast, patients with discordant ASA scores who were scored lower by their anesthesiologist did not have such an association. This is noteworthy, given that simple differences in medical opinion leading to discordant patient assessments would not ordinarily be expected to correlate with patient outcomes. Considered together with our analysis of ASA score discriminative ability, where anesthesiologist-assigned ASA scores had better discriminative ability for 30-day and 1-year mortality than those assigned by surgeons, this suggests that under-recognition of comorbidities by the surgeons might have resulted in inaccurately optimistic ASA scoring in the discordant cases. Failure to recognize a high perioperative risk patient, or interval development of comorbidity in the short timespan between surgeon and anesthesiologist review, could have contributed to the poorer patient outcomes seen in this group.
On the other hand, all ASA discordant patients had a higher risk of ICU admission > 24 hours, in overall and stratified analyses. There was no significant difference in the discriminative ability between surgeon and anesthesiologist ASA scores for ICU admission > 24 hours. This could possibly reflect differences in opinion being resolved at the point of surgery in favor of the more conservative decision to admit the patient post-operatively to ICU.
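The notion of discriminative ability used in these comparisons can be sketched in a few lines of Python. This is an illustrative assumption rather than the study's pipeline (which used the R "pROC" package): when a single ASA score is the only predictor and its fitted coefficient is positive, the predicted probability rises monotonically with the score, so the model's AUC equals the rank-based AUC of the raw score. The function name and toy data are hypothetical, and DeLong's test for comparing two AUCs is not implemented here.

```python
def auc_from_scores(scores, outcomes):
    """Rank-based AUC: the probability that a randomly chosen patient with
    the outcome has a higher score than one without it (ties count 0.5)."""
    positives = [s for s, y in zip(scores, outcomes) if y == 1]
    negatives = [s for s, y in zip(scores, outcomes) if y == 0]
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Invented toy data, not study data: ASA scores and a binary outcome.
toy_scores = [1, 2, 2, 3]
toy_outcomes = [0, 0, 1, 1]
toy_auc = auc_from_scores(toy_scores, toy_outcomes)
```

Because ASA takes only a handful of values, ties between patients with and without the outcome are common, which is why the tie handling matters for this kind of predictor.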

Study strengths and limitations
Our study's main strengths are that it was conducted in a large patient cohort spanning multiple years and encompassing the major categories of elective noncardiac surgery. Data were collected from 2017 onwards, after the 2014 ASA score revision and with adequate time for familiarization, and hence do not span periods with potentially different interpretations of the score. The data used were derived from clinical databases, rather than administrative or financial records. Furthermore, neither surgeons nor anesthesiologists have financial incentives tied to ASA scoring at our institution. This eliminates an important source of bias, as the presence of such incentives has been shown to be associated with potential upcoding of the ASA score. [30]
A limitation of our study is that the assignment of ASA score by surgeons and anesthesiologists for each patient was not done simultaneously. At our institution, surgeons assign the ASA score when listing the patient for surgery and anesthesiologists assign their score after that at the pre-operative assessment. As such, while the surgeon is completely blinded to the anesthesiologist's score, the anesthesiologist could be aware of the surgeon's score. However, our anesthesiologists generally make an independent assessment of the patient's healthcare status. The anesthesiologist's assessment is also closer to the day of surgery than the surgeon's, and hence the anesthesiologist's score is more recent. It is also possible that the patient's health could have deteriorated in the period of time between the surgeon and anesthesiologist review, accounting for score discordance and association with poorer outcomes. However, the waiting time for pre-operative assessment at our institution is generally short and most elective surgeries are premised on a relatively stable patient physical status. We do not deem this to be a major source of bias.
As near-contemporaneous ASA scoring was mandatory for both anesthesiologist and surgeon during the study period, potential sources of bias (e.g. recall bias, selection bias) that may affect retrospective studies are much less likely in our study. There was a very small proportion of potential patients (264 patients, < 1%) who had missing anesthesiologist ASA scores. However, as addressed in Additional file 1, this is unlikely to be a major source of bias.
As our study only included patients who underwent elective surgery, its findings should not be generalized to emergency cases. Cardiac, burns, and transplant surgery patients were also excluded, and our results may not apply to these groups of patients. Finally, as this was a single-center study, generalizability may be limited, particularly to centers where ASA scores impact financial reimbursements (which is not the case in our center) or centers with significantly different care patterns or patient comorbidity profiles.

Opportunities for future work
Our study data did not contain information that could individually identify the anesthesiologists or the surgeons assigning ASA scores. As such, we were unable to control for clinician factors that might have influenced the accuracy of the ASA score, such as level of training and seniority. Our information about comorbidities assessed by the clinicians, which directly impacts the ASA score, was limited to the anesthesiologists only (as there was no standardized assessment form for surgeon-assessed comorbidities during the period of study). Future analyses of ASA discordance may investigate these aspects further, to better understand the mechanisms of ASA discordance and other possible factors that influence it.
The association of discordant ASA scores with adverse patient outcomes is a cause for concern. Besides further education and reinforcement of standard ASA examples, there may be a need for quality improvement studies to determine if specific conditions require more detailed or contextualized examples within the institution. Discordant ASA scores may be a red flag for missed comorbidities or interval development of new comorbidities, and mandatory cross-specialty review in ASA discordant cases is a potential intervention to ensure that patients are accurately assessed and appropriately prepared for surgery.

Conclusion
In a large single-center cohort study that was performed after the 2014 revision of the ASA score, there was moderate concordance between ASA scores assigned by anesthesiologists and surgeons in patients undergoing elective surgery. The majority of discordant patients were assigned a lower score by surgeons, which is likely due to differences in recognition and grading of comorbidities. Patients with discordant ASA scores, and in particular those assigned lower ASA scores by surgeons, had a higher likelihood of 30-day mortality, 1-year mortality, and ICU admission > 24 hours. Our results suggest a need for improvement in the standardization of ASA scoring and that discordant ASA assessments may be a red flag for missed comorbidities.

Abbreviations
ASA: American Society of Anesthesiologists
ICU: Intensive care unit
CI: Confidence interval
AUC: Area under the receiver operating characteristic curve

Declarations

Ethics approval and consent to participate
The local Institutional Review Board (CIRB Reference number 2020/2801) granted a waiver of consent due to the use of anonymized routinely collected clinical data and no patient interaction was required. The data analysis and statistical plan was written and filed with the Institutional Review Board before the data were accessed.

Consent for publication
Not applicable.

Figure 1
Study flow diagram for patient cohort definition. *The exclusions for patients not explicitly coded as elective surgeries and patients scored ASA 5 or 6 are overlapping categories, and as a result sum to more than the difference between the first two steps.

Figure 2
Odds Ratio Plots for Risk of Adverse Outcomes with Different Levels of ASA Discordance (2A) Odds Ratio for death within 30 days; (2B) Odds Ratio for death within 1 year; (2C) Odds Ratio for ICU admission > 24 hours. A lower surgeon ASA score as compared to the anesthesiologist score was associated with all three outcomes. On the other hand, a lower anesthesiologist ASA score was only associated with ICU admission > 24 hours but not death at 30 days or 1 year.

Figure 3
Composite plot of AUCs for prediction of adverse outcomes using surgeon and anesthesiologist ASA scores (3A) AUC of 30-day mortality for anesthesiologist and surgeon-assigned ASA scores (3B) AUC of 1-year mortality for anesthesiologist and surgeon-assigned ASA scores (3C) AUC of ICU admission > 24 hours for anesthesiologist and surgeon-assigned ASA scores
