*Screening and study inclusion:*

Study screening and inclusion are summarized in **Figure 1**. In brief, the initial search identified 7,555 references, of which 208 were excluded as duplicates. In the title and abstract review phase, reviewers excluded 5,854 studies that were deemed ineligible. Full-text reviews were conducted for the 1,493 articles deemed eligible from title and abstract review. Of these, 1,105 studies were excluded for not meeting eligibility criteria. The most common reasons for exclusion were: scales or measures that were outside the scope of the review (n=386), lack of psychometric data on the scales of interest (n=140), lab or methods papers that were outside the scope of the review (n=130), non-English language publications (n=110), duplicate studies (n=98), and psychometric outcomes that were outside the scope of the review (n=79). In total, 387 unique studies were included in the data extraction phase, containing sufficient data on the outcomes for 37 scales **(Table 1)**.
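The counts reported in the screening paragraph can be chained as a quick consistency check. The sketch below only uses numbers stated in the text; the variable names are illustrative. The first two differences reproduce the 1,493 full-text articles, while the last difference comes to 388, one more than the 387 unique studies reported. The text does not say how that one-article difference arises.

```python
# Counts taken directly from the screening paragraph.
identified = 7555                # references found in the initial search
duplicates = 208                 # removed as duplicates
excluded_title_abstract = 5854   # excluded at title and abstract review
excluded_full_text = 1105        # excluded at full-text review

screened = identified - duplicates              # records entering title/abstract review
full_text = screened - excluded_title_abstract  # articles entering full-text review
remaining = full_text - excluded_full_text      # articles left after full-text exclusions

print(screened, full_text, remaining)
```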

*Study Characteristics:*

**Table 2** presents characteristics of the studies included in this meta-analysis. As mentioned, studies published in English were included in this review, regardless of the language in which the scales were administered. Among the 387 studies included, the most common language in which the scale/measure was administered was English (63%), followed by Spanish (9%), French (5%), Portuguese (3%), and Chinese (2%). A large proportion of studies were conducted in the United States (40%). The median sample size was 286 [Range=9-50,049]. The vast majority of studies (83%; n=323) included both men and women. Additionally, 11% (n=42) of the studies had samples comprised only of men, while 5% (n=20) had samples comprised only of women. Most studies were published after 1999 (66%), with studies published between 2000 and 2009 accounting for 38% (n=148) of the studies meta-analyzed, and studies published between 2010 and 2017 accounting for 28% (n=110). Most studies involved a single study site (61%), while 39% were multi-site studies. Additionally, 72% of the studies involved convenience samples, 20% included random or probability-based samples, and 7% had other or unclear sampling strategies.

*Assessment of bias in Study Quality:*

The risk of bias in the four QUADAS-2 domains for each study included in this meta-analysis is presented in **Table 2**. Of the studies included, 58% had a low risk of bias with respect to the patient population, 57% had a low risk of bias in the index test domain, 48% had a low risk of bias in the reference standard test domain, and 72% had a low risk of bias for flow and timing. Overall, only 16% of studies had a low risk of bias across all four of these QUADAS-2 domains.

*Pooled Summary Estimates: Overall findings:*

The pooled summary estimates of psychometric properties of substance use measures (which are described in **Table 1**) are qualitatively summarized in **Table 3**. Overall, 65% of pooled estimates for alpha were in the range of fair-to-excellent; 44% of estimates for kappa were in the range of fair-to-excellent. In addition, 69%, 97%, 37% and 96% of pooled estimates for sensitivity, specificity, positive predictive value, and negative predictive value, respectively, were in the range of moderate-to-excellent.
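For readers less familiar with the four diagnostic accuracy measures named above, the following minimal sketch shows how each is obtained from a single study's 2x2 table of index test results against the reference standard. The class name, function name, and counts are hypothetical and are not drawn from any included study. With the illustrative counts (10% of participants positive on the reference standard), sensitivity and specificity are both 0.85, yet PPV is well below NPV; whether this mechanism accounts for the pattern of pooled estimates reported here is not something this section states.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    tp: int  # index test positive, reference standard positive
    fp: int  # index test positive, reference standard negative
    fn: int  # index test negative, reference standard positive
    tn: int  # index test negative, reference standard negative


def accuracy_measures(t: TwoByTwo) -> dict:
    """Return sensitivity, specificity, PPV, and NPV for one 2x2 table."""
    return {
        "sensitivity": t.tp / (t.tp + t.fn),
        "specificity": t.tn / (t.tn + t.fp),
        "ppv": t.tp / (t.tp + t.fp),
        "npv": t.tn / (t.tn + t.fn),
    }


# Hypothetical study: 1,000 participants, 100 of them reference-standard positive.
example = TwoByTwo(tp=85, fp=135, fn=15, tn=765)
measures = accuracy_measures(example)
for name, value in measures.items():
    print(f"{name}: {value:.2f}")
```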

Self-reported measures for which all pooled estimates were fair/moderate or better include the following: Alcohol Dependence Scale; Addiction Severity Index (ASI); ASI subscale for Alcohol; ASSIST; the Composite International Diagnostic Interview; Drug Abuse Screen Test - 10 item scale; Drug Use Disorders Identification Test; Problem Oriented Screening Instrument for Teenagers; Severity of Dependence scale; Timeline Followback; and Chemical Use, Abuse, and Dependence. Biomarkers for which all pooled estimates were fair/moderate or better include the following: Ethyl glucuronide; Phosphatidylethanol test; and the combined use of Carbohydrate deficient transferrin and Mean corpuscular volume. In general, we also observed high heterogeneity between studies for most pooled estimates.

*Pooled Summary Estimates, by Substance Use Measure:*

The pooled estimates and 95% confidence intervals for alpha, kappa, sensitivity, specificity, positive predictive value, and negative predictive value are shown in **Tables 4, 5, 6, 7, 8, and 9,** respectively. Below we summarize the results of the pooled summary estimates alphabetically for each of the 37 substance use measures, grouping self-reported measures and biomarkers separately. The list of references for the studies meta-analyzed for each scale/measure is presented in **Table 10**.
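The pooling model behind these estimates is not described in this section. As a rough illustration of how a pooled estimate, its 95% confidence interval, and the I2 heterogeneity statistic relate to one another, the sketch below assumes a DerSimonian-Laird random-effects model applied to untransformed study estimates. The function name and the three input studies are hypothetical.

```python
import math


def pool_random_effects(estimates, std_errors):
    """DerSimonian-Laird random-effects pooling (assumed model, for illustration).

    Returns the pooled estimate, its 95% confidence interval, and I2 in percent.
    """
    w = [1.0 / se ** 2 for se in std_errors]  # inverse-variance (fixed-effect) weights
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))  # Cochran's Q
    df = len(estimates) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)  # between-study variance
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    w_re = [1.0 / (se ** 2 + tau2) for se in std_errors]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, estimates)) / sum(w_re)
    se_pooled = math.sqrt(1.0 / sum(w_re))
    ci = (pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled)
    return pooled, ci, i2


# Hypothetical alpha coefficients from three studies, with equal standard errors.
pooled, ci, i2 = pool_random_effects([0.80, 0.90, 0.85], [0.02, 0.02, 0.02])
print(f"pooled={pooled:.2f}, 95%CI={ci[0]:.2f}-{ci[1]:.2f}, I2={i2:.0f}%")
```

An interval computed this way on the raw scale is not constrained to lie between 0 and 1. Proportions are often pooled on a transformed scale (for example, the logit) for that reason, but the approach used for the estimates reported here is not stated in this section.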

__Self-Reported Measures:__

**Alcohol Dependence Scale (ADS)**

The pooled alpha estimate for ADS (3 data points) was good: 0.90 (95%CI=0.80-0.99) and there was high heterogeneity between studies (I2 98.9%). The pooled sensitivity estimate for ADS (2 data points) was excellent: 0.95 (95%CI=0.90-1.00) and there was low heterogeneity between studies (I2 0%). The pooled specificity estimate (2 data points) was moderate: 0.64 (95%CI=0.52-0.77) and there was moderate heterogeneity between studies (I2 60.1%). There was insufficient data to calculate the pooled PPV and NPV estimates for ADS.

**Addiction Severity Index (ASI)**

The pooled alpha estimate for ASI (3 data points) was good: 0.84 (95%CI=0.81-0.87) and there was moderate heterogeneity between studies (I2 38.5%). There was insufficient data to calculate pooled kappa, sensitivity, specificity, PPV, and NPV estimates.

**Addiction Severity Index-Alcohol (alcohol sub-scale; ASI-A)**

The pooled alpha estimate (18 data points) was moderate: 0.77 (95%CI=0.73-0.81) and there was high heterogeneity between studies (I2 94.3%). The pooled sensitivity estimate for ASI-A (6 data points) was good: 0.83 (95%CI=0.67-0.92) and there was high heterogeneity between studies (I2 87.6%). The pooled specificity estimate for ASI-A (6 data points) was moderate: 0.79 (95%CI=0.67-0.88) and there was high heterogeneity between studies (I2 91.2%). There was insufficient data to calculate pooled kappa, PPV and NPV estimates for ASI-A.

**Addiction Severity Index-Drugs (drugs sub-scale; ASI-D)**

The pooled alpha estimate for ASI-D (16 data points) was unsatisfactory: 0.68 (95%CI=0.63-0.74) and there was high heterogeneity between studies (I2 95.6%). The pooled sensitivity estimate (5 data points) was good: 0.86 (95%CI=0.83-0.89) and there was moderate heterogeneity between studies (I2 62.5%). The pooled specificity estimate (5 data points) was good: 0.85 (95%CI=0.77-0.91) and there was high heterogeneity between studies (I2 86%). There was insufficient data to calculate the pooled kappa, PPV and NPV estimates.

**The Alcohol, Smoking, and Substance Involvement Screening Test (ASSIST)**

The pooled alpha estimate (7 data points) was good: 0.85 (95%CI=0.80-0.91) and there was high heterogeneity between studies (I2 94%). The pooled sensitivity estimate (2 data points) was good: 0.83 (95%CI=0.80-0.87) and there was low heterogeneity between studies (I2 0%). The pooled specificity estimate (2 data points) was moderate: 0.73 (95%CI=0.57-0.88) and there was high heterogeneity between studies (I2 91%). There was insufficient data to calculate the pooled estimate for kappa, PPV, and NPV.

**Alcohol Use Disorders Identification Test (AUDIT)**

The pooled alpha estimate for AUDIT (80 data points) was moderate: 0.85 (95%CI=0.83-0.87) and there was high heterogeneity between studies (I2 98%). The pooled kappa estimate for AUDIT (4 data points) was unsatisfactory: 0.46 (95%CI=0.25-0.67) and there was high heterogeneity between studies (I2 99%). The pooled sensitivity estimate for AUDIT (135 data points) was good: 0.86 (95%CI=0.84-0.88) and there was high heterogeneity between studies (I2 97%). The pooled specificity estimate for AUDIT (135 data points) was good: 0.87 (95%CI=0.85-0.89) and there was high heterogeneity between studies (I2 99%). The pooled PPV estimate for AUDIT (65 data points) was moderate: 0.61 (95%CI=0.51-0.71) and there was high heterogeneity between studies (I2 99%). The pooled NPV estimate for AUDIT (54 data points) was excellent: 0.94 (95%CI=0.93-0.95) and there was high heterogeneity between studies (I2 96%).

**Alcohol Use Disorders Identification Test-3 (AUDIT-3)**

Alpha cannot be calculated for AUDIT-3 because it is a single-item measure. There was insufficient data to calculate the pooled estimate for kappa. The pooled sensitivity estimate for AUDIT-3 (22 data points) was good: 0.84 (95%CI=0.80-0.88) and there was high heterogeneity between studies (I2 90%). The pooled specificity estimate for AUDIT-3 (22 data points) was good: 0.84 (95%CI=0.75-0.90) and there was high heterogeneity between studies (I2 99%). The pooled PPV estimate for AUDIT-3 (9 data points) was moderate: 0.63 (95%CI=0.49-0.77) and there was high heterogeneity between studies (I2 99%). The pooled NPV estimate (7 data points) was excellent: 0.94 (95%CI=0.90-0.98) and there was high heterogeneity between studies (I2 95%).
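The reason alpha is undefined for a single-item measure follows from the formula for Cronbach's alpha, which for k items is k/(k-1) multiplied by one minus the ratio of the summed item variances to the variance of the total score. With k=1 the leading factor divides by zero. The sketch below is a minimal illustration under that standard definition; the function name and the item scores are hypothetical.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of items, each a list of respondent scores."""
    k = len(item_scores)
    if k < 2:
        # k / (k - 1) is undefined for a single item.
        raise ValueError("Cronbach's alpha needs at least two items")
    totals = [sum(responses) for responses in zip(*item_scores)]  # total score per respondent
    item_variance_sum = sum(variance(item) for item in item_scores)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))


# Hypothetical responses from five participants to a three-item scale.
items = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 5, 5],
    [1, 3, 3, 4, 4],
]
alpha = cronbach_alpha(items)
print(f"alpha = {alpha:.2f}")
```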

**Alcohol Use Disorders Identification Test-C (AUDIT-C)**

The pooled alpha estimate for AUDIT-C (20 data points) was fair: 0.75 (95%CI=0.70-0.80) and there was high heterogeneity between studies (I2 99%). The pooled kappa estimate for AUDIT-C (2 data points) was unsatisfactory: 0.41 (95%CI=0.39-0.43) and there was low heterogeneity between studies (I2 0%). The pooled sensitivity estimate for AUDIT-C (45 data points) was good: 0.87 (95%CI=0.84-0.90) and there was high heterogeneity between studies (I2 99%). The pooled specificity estimate for AUDIT-C (45 data points) was good: 0.84 (95%CI=0.81-0.87) and there was high heterogeneity between studies (I2 99%). The pooled PPV estimate for AUDIT-C (22 data points) was low: 0.50 (95%CI=0.39-0.60) and there was high heterogeneity between studies (I2 99%). The pooled NPV estimate for AUDIT-C (19 data points) was good: 0.88 (95%CI=0.83-0.92) and there was high heterogeneity between studies (I2 99%).

**Brief Michigan Alcoholism Screening Test (B-MAST)**

There was insufficient data to calculate the pooled estimate for B-MAST’s alpha and kappa. The pooled sensitivity estimate for B-MAST (21 data points) was low: 0.50 (95%CI=0.38-0.62) and there was high heterogeneity between studies (I2 99%). The pooled specificity estimate for B-MAST (21 data points) was excellent: 0.97 (95%CI=0.96-0.98) and there was high heterogeneity between studies (I2 97%). The pooled PPV estimate for B-MAST (3 data points) was moderate: 0.65 (95%CI=0.38-0.93) and there was high heterogeneity between studies (I2 99%). The pooled NPV estimate for B-MAST (2 data points) was excellent: 0.90 (95%CI=0.87-0.94) and there was moderate heterogeneity between studies (I2 33%).

**Cut down, Annoyed, Guilty, Eye-opener (CAGE)**

The pooled alpha estimate for CAGE (22 data points) was unsatisfactory: 0.70 (95%CI=0.65-0.75) and there was high heterogeneity between studies (I2 98%). The pooled kappa estimate for CAGE (3 data points) was unsatisfactory: 0.57 (95%CI=0.34-0.81) and there was high heterogeneity between studies (I2 97%). The pooled sensitivity estimate for CAGE (139 data points) was moderate: 0.70 (95%CI=0.66-0.74) and there was high heterogeneity between studies (I2 98%). The pooled specificity estimate for CAGE (139 data points) was good: 0.90 (95%CI=0.88-0.91) and there was high heterogeneity between studies (I2 99%). The pooled PPV estimate for CAGE (61 data points) was low: 0.51 (95%CI=0.45-0.58) and there was high heterogeneity between studies (I2 99%). The pooled NPV estimate for CAGE (39 data points) was excellent: 0.91 (95%CI=0.88-0.93) and there was high heterogeneity between studies (I2 97%).

**Composite International Diagnostic Interview (CIDI)**

Alpha coefficients are not calculated for CIDI. The pooled kappa estimate for CIDI (2 data points) was moderate: 0.82 (95%CI=0.61-1.02) and there was high heterogeneity between studies (I2 78%). The pooled sensitivity estimate for CIDI (3 data points) was moderate: 0.80 (95%CI=0.67-0.92) and there was high heterogeneity between studies (I2 80%). The pooled specificity estimate for CIDI (3 data points) was good: 0.86 (95%CI=0.77-0.95) and there was high heterogeneity between studies (I2 99%). The pooled PPV estimate for CIDI (2 data points) was moderate: 0.69 (95%CI=0.26-1.00) and there was high heterogeneity between studies (I2 99%). The pooled NPV estimate for CIDI (2 data points) was good: 0.89 (95%CI=0.68-1.00) and there was high heterogeneity between studies (I2 96%).

**Car, Relax, Alone, Forget, Friends, Trouble (CRAFFT)**

The pooled alpha estimate for CRAFFT (6 data points) was unsatisfactory: 0.69 (95%CI=0.64-0.74) and there was high heterogeneity between studies (I2 83%). There was insufficient data to calculate the pooled estimate for kappa for CRAFFT. The pooled sensitivity estimate for CRAFFT (10 data points) was good: 0.90 (95%CI=0.84-0.94) and there was high heterogeneity between studies (I2 97%). The pooled specificity estimate for CRAFFT (10 data points) was moderate: 0.76 (95%CI=0.68-0.83) and there was high heterogeneity between studies (I2 97%). The pooled PPV estimate for CRAFFT (8 data points) was low: 0.57 (95%CI=0.34-0.80) and there was high heterogeneity between studies (I2 99%). The pooled NPV estimate for CRAFFT (8 data points) was good: 0.86 (95%CI=0.45-1.00) and there was high heterogeneity between studies (I2 99%).

**Drug Abuse Screen Test (DAST)**

The pooled alpha estimate for DAST (6 data points) was excellent: 0.94 (95%CI=0.93-0.95) and there was low heterogeneity between studies (I2 0%). The pooled kappa estimate for DAST (2 data points) was moderate: 0.83 (95%CI=0.58-1.00) and there was high heterogeneity between studies (I2 98%). The pooled sensitivity estimate for DAST (7 data points) was good: 0.85 (95%CI=0.74-0.92) and there was high heterogeneity between studies (I2 89%). The pooled specificity estimate for DAST (7 data points) was good: 0.84 (95%CI=0.68-0.93) and there was high heterogeneity between studies (I2 97%). The pooled PPV estimate for DAST (5 data points) was low: 0.51 (95%CI=0.32-0.70) and there was high heterogeneity between studies (I2 98%). The pooled NPV estimate for DAST (4 data points) was excellent: 0.95 (95%CI=0.89-1.00) and there was high heterogeneity between studies (I2 81%).

**Drug Abuse Screen Test - 10-item version (DAST-10)**

The pooled alpha estimate for DAST-10 (6 data points) was fair: 0.79 (95%CI=0.68-0.89) and there was high heterogeneity between studies (I2 98%). There was insufficient data to calculate the pooled estimate for kappa for DAST-10. The pooled sensitivity estimate for DAST-10 (6 data points) was excellent: 0.90 (95%CI=0.75-0.97) and there was high heterogeneity between studies (I2 95%). The pooled specificity estimate for DAST-10 (6 data points) was good: 0.82 (95%CI=0.72-0.89) and there was high heterogeneity between studies (I2 92%). The pooled PPV estimate for DAST-10 (4 data points) was good: 0.80 (95%CI=0.70-0.91) and there was high heterogeneity between studies (I2 99%). The pooled NPV estimate for DAST-10 (4 data points) was good: 0.86 (95%CI=0.81-0.91) and there was moderate heterogeneity between studies (I2 40%).

**Drug Use Disorders Identification Test (DUDIT)**

The pooled alpha estimate for DUDIT (15 data points) was excellent: 0.92 (95%CI=0.90-0.95) and there was high heterogeneity between studies (I2 96%). There was insufficient data to calculate the pooled kappa estimate for DUDIT. The pooled sensitivity estimate for DUDIT (12 data points) was excellent: 0.93 (95%CI=0.89-0.96) and there was high heterogeneity between studies (I2 76%). The pooled specificity estimate for DUDIT (12 data points) was moderate: 0.79 (95%CI=0.67-0.87) and there was high heterogeneity between studies (I2 96%). The pooled PPV estimate (5 data points) was moderate: 0.61 (95%CI=0.34-0.87) and there was high heterogeneity between studies (I2 99%). The pooled NPV estimate (5 data points) was excellent: 0.92 (95%CI=0.82-1.00) and there was high heterogeneity between studies (I2 78%).

**Michigan Alcohol Screening Test (MAST)**

The pooled alpha estimate for MAST (8 data points) was moderate: 0.82 (95%CI=0.78-0.86) and there was high heterogeneity between studies (I2 83%). The pooled kappa estimate for MAST (4 data points) was unsatisfactory: 0.69 (95%CI=0.58-0.81) and there was high heterogeneity between studies (I2 88%). The pooled sensitivity estimate for MAST (12 data points) was moderate: 0.70 (95%CI=0.58-0.80) and there was high heterogeneity between studies (I2 95%). The pooled specificity estimate for MAST (12 data points) was good: 0.85 (95%CI=0.77-0.91) and there was high heterogeneity between studies (I2 97%). The pooled PPV estimate for MAST (9 data points) was low: 0.51 (95%CI=0.30-0.71) and there was high heterogeneity between studies (I2 98%). The pooled NPV estimate for MAST (6 data points) was good: 0.88 (95%CI=0.82-0.94) and there was high heterogeneity between studies (I2 92%).

**Problem Oriented Screening Instrument for Teenagers (POSIT)**

The pooled alpha estimate for POSIT (2 data points) was good: 0.86 (95%CI=0.73-0.98) and there was high heterogeneity between studies (I2 94%). The pooled sensitivity estimate for POSIT (3 data points) was good: 0.84 (95%CI=0.72-0.96) and there was high heterogeneity between studies (I2 90%). The pooled specificity estimate for POSIT (3 data points) was good: 0.82 (95%CI=0.75-0.90) and there was high heterogeneity between studies (I2 88%). There was insufficient data to calculate the pooled kappa, PPV, and NPV estimates for POSIT.

**Self-Administered Alcoholism Screening Test (SAAST)**

The pooled alpha estimate for SAAST (2 data points) was good: 0.89 (95%CI=0.79-0.99) and there was high heterogeneity between studies (I2 95%). The pooled sensitivity estimate for SAAST (7 data points) was low: 0.52 (95%CI=0.33-0.71) and there was high heterogeneity between studies (I2 98%). The pooled specificity estimate (7 data points) was good: 0.83 (95%CI=0.76-0.90) and there was high heterogeneity between studies (I2 98%). The pooled PPV estimate for SAAST (6 data points) was low: 0.32 (95%CI=0.22-0.42) and there was high heterogeneity between studies (I2 95%). The pooled NPV estimate for SAAST (6 data points) was excellent: 0.92 (95%CI=0.89-0.95) and there was high heterogeneity between studies (I2 92%). There was insufficient data to calculate the pooled kappa estimates for SAAST.

**Semi-Structured Assessment for Drug Dependence and Alcoholism (SSADDA)**

There are no alpha coefficients associated with semi-structured assessments such as SSADDA. The pooled kappa estimate for SSADDA (8 data points) was moderate: 0.84 (95%CI=0.77-0.91) and there was high heterogeneity between studies (I2 97%). There was insufficient data to calculate the pooled sensitivity, specificity, PPV and NPV estimates for SSADDA.
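For interview-based assessments, agreement is summarized with kappa rather than alpha. This section does not specify which kappa statistic the included studies reported; the sketch below assumes Cohen's kappa for two raters (or two occasions) classifying the same participants, which compares observed agreement with the agreement expected by chance. The function name and the agreement table are hypothetical.

```python
def cohen_kappa(table):
    """Cohen's kappa for a square agreement table.

    table[i][j] is the number of participants placed in category i by
    rater (or occasion) A and in category j by rater (or occasion) B.
    """
    n = sum(sum(row) for row in table)
    k = len(table)
    p_observed = sum(table[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    p_expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / n ** 2
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical diagnosis present/absent ratings for 100 participants.
agreement = [
    [40, 5],   # A: present; B: present / absent
    [5, 50],   # A: absent;  B: present / absent
]
kappa = cohen_kappa(agreement)
print(f"kappa = {kappa:.2f}")
```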

**Severity of Dependence (SDS)**

The pooled alpha estimate for SDS (6 data points) was good: 0.86 (95%CI=0.78-0.93) and there was high heterogeneity between studies (I2 95%). The pooled sensitivity estimate for SDS (6 data points) was good: 0.83 (95%CI=0.76-0.90) and there was high heterogeneity between studies (I2 77%). The pooled specificity estimate (6 data points) was good: 0.84 (95%CI=0.78-0.89) and there was moderate heterogeneity between studies (I2 44%). The pooled PPV estimate for SDS (3 data points) was good: 0.90 (95%CI=0.86-0.94) and there was low heterogeneity between studies (I2 0%). The pooled NPV estimate for SDS (3 data points) was good: 0.83 (95%CI=0.76-0.89) and there was low heterogeneity between studies (I2 3.5%). There was insufficient data to calculate the pooled kappa estimate for SDS.

**Tolerance-Annoyance Cut Down Eye Opener (T-ACE)**

The pooled alpha estimate for T-ACE (2 data points) was unsatisfactory: 0.50 (95%CI=0.47-0.52) and there was high heterogeneity between studies (I2 29%). The pooled sensitivity estimate for T-ACE (8 data points) was good: 0.83 (95%CI=0.74-0.92) and there was high heterogeneity between studies (I2 96%). The pooled specificity estimate for T-ACE (8 data points) was moderate: 0.72 (95%CI=0.65-0.79) and there was high heterogeneity between studies (I2 98%). The pooled PPV estimate for T-ACE (6 data points) was low: 0.35 (95%CI=0.25-0.45) and there was high heterogeneity between studies (I2 99%). The pooled NPV estimate for T-ACE (2 data points) was good: 0.87 (95%CI=0.62-1.00) and there was high heterogeneity between studies (I2 97%). There was insufficient data to calculate the pooled estimate for kappa for T-ACE.

**Timeline Followback (TLFB)**

There are no alpha coefficients associated with TLFB. The pooled kappa estimate for TLFB (3 data points) was good: 0.86 (95%CI=0.81-0.91) and there was high heterogeneity between studies (I2 88%). The pooled sensitivity estimate for TLFB (4 data points) was moderate: 0.80 (95%CI=0.73-0.87) and there was moderate heterogeneity between studies (I2 63%). The pooled specificity estimate for TLFB (3 data points) was excellent: 0.97 (95%CI=0.95-0.99) and there was low heterogeneity between studies (I2 0%). There was insufficient data to calculate the pooled estimate for PPV and NPV for TLFB.

**Tolerance, Worried, Eye-Opener, Amnesia, Cut down (TWEAK)**

The pooled alpha estimate for TWEAK (3 data points) was unsatisfactory: 0.62 (95%CI=0.55-0.69) and there was high heterogeneity between studies (I2 86%). The pooled sensitivity estimate for TWEAK (36 data points) was good: 0.85 (95%CI=0.80-0.89) and there was high heterogeneity between studies (I2 96%). The pooled specificity estimate for TWEAK (36 data points) was good: 0.86 (95%CI=0.82-0.90) and there was high heterogeneity between studies (I2 99%). The pooled PPV estimate for TWEAK (5 data points) was low: 0.43 (95%CI=0.26-0.61) and there was high heterogeneity between studies (I2 99%). The pooled NPV estimate for TWEAK (2 data points) was good: 0.88 (95%CI=0.70-1.00) and there was high heterogeneity between studies (I2 95%). There was insufficient data to calculate the pooled estimate for kappa for TWEAK.

**The Chemical Use, Abuse, and Dependence (CUAD)**

The pooled alpha estimate for CUAD (3 data points) was excellent: 0.96 (95%CI=0.94-0.98) and there was high heterogeneity between studies (I2 %). There was insufficient data to calculate the pooled estimate for kappa, sensitivity, specificity, PPV, and NPV for CUAD.

__Biomarkers:__

**Alanine transaminase (ALT)**

The pooled sensitivity estimate for ALT (32 data points) was low: 0.32 (95%CI=0.24-0.40) and there was high heterogeneity between studies (I2 96.1%). The pooled specificity estimate for ALT (32 data points) was good: 0.88 (95%CI=0.83-0.92) and there was high heterogeneity between studies (I2 95.8%). The pooled PPV estimate for ALT (7 data points) was low: 0.37 (95%CI=0.18-0.56) and there was high heterogeneity between studies (I2 96.1%). The pooled NPV estimate for ALT (4 data points) was moderate: 0.63 (95%CI=0.42-0.85) and there was high heterogeneity between studies (I2 97.5%).

**Aspartate transaminase (AST)**

The pooled sensitivity estimate for AST (33 data points) was low: 0.48 (95%CI=0.40-0.55) and there was high heterogeneity between studies (I2 97%). The pooled specificity estimate for AST (33 data points) was good: 0.86 (95%CI=0.81-0.90) and there was high heterogeneity between studies (I2 97%). The pooled PPV estimate for AST (8 data points) was low: 0.42 (95%CI=0.27-0.57) and there was high heterogeneity between studies (I2 93%). The pooled NPV estimate for AST (6 data points) was moderate: 0.69 (95%CI=0.55-0.83) and there was high heterogeneity between studies (I2 95%).

**Aspartate transaminase, Alanine transaminase ratio (AST/ALT ratio)**

The pooled sensitivity estimate for AST/ALT ratio (6 data points) was low: 0.34 (95%CI=0.22-0.46) and there was high heterogeneity between studies (I2 96%). The pooled specificity estimate (4 data points) was moderate: 0.73 (95%CI=0.52-0.94) and there was high heterogeneity between studies (I2 98%). There was insufficient data to calculate the pooled estimate for PPV and NPV.

**Blood alcohol concentration (BAC)**

The pooled sensitivity estimate for BAC (5 data points) was moderate: 0.64 (95%CI=0.59-0.69) and there was moderate heterogeneity between studies (I2 44%). The pooled specificity estimate for BAC (5 data points) was moderate: 0.80 (95%CI=0.72-0.87) and there was high heterogeneity between studies (I2 93%). The pooled PPV estimate for BAC (3 data points) was low: 0.60 (95%CI=0.15-1.00) and there was high heterogeneity between studies (I2 98%). The pooled NPV estimate for BAC (3 data points) was moderate: 0.69 (95%CI=0.52-0.86) and there was high heterogeneity between studies (I2 93%).

**Carbohydrate deficient transferrin (CDT)**

There are no alpha and kappa coefficients associated with biomarkers such as CDT. The pooled sensitivity estimate for CDT (8 data points) was low: 0.59 (95%CI=0.43-0.73) and there was high heterogeneity between studies (I2 97%). The pooled specificity estimate for CDT (8 data points) was excellent: 0.96 (95%CI=0.93-0.98) and there was moderate heterogeneity between studies (I2 72%). The pooled PPV estimate for CDT (6 data points) was good: 0.85 (95%CI=0.74-0.97) and there was high heterogeneity between studies (I2 76%). The pooled NPV estimate for CDT (6 data points) was moderate: 0.79 (95%CI=0.73-0.85) and there was high heterogeneity between studies (I2 96%).

**Carbohydrate deficient transferrin-Tech (CDTech)**

There are no alpha and kappa coefficients associated with biomarkers such as CDTech. The pooled sensitivity estimate for CDTech (41 data points) was low: 0.54 (95%CI=0.45-0.62) and there was high heterogeneity between studies (I2 99%). The pooled specificity estimate for CDTech (41 data points) was good: 0.89 (95%CI=0.88-0.91) and there was high heterogeneity between studies (I2 88%). The pooled PPV estimate for CDTech (12 data points) was low: 0.52 (95%CI=0.37-0.67) and there was high heterogeneity between studies (I2 95%). The pooled NPV estimate for CDTech (8 data points) was moderate: 0.80 (95%CI=0.61-0.98) and there was high heterogeneity between studies (I2 99%).

**Carbohydrate deficient transferrin with Mean corpuscular volume (CDT with MCV)**

There are no alpha and kappa coefficients associated with biomarkers such as CDT and MCV. The pooled sensitivity estimate for CDT with MCV (8 data points) was moderate: 0.74 (95%CI=0.60-0.88) and there was high heterogeneity between studies (I2 98%). The pooled specificity estimate for CDT with MCV (4 data points) was excellent: 0.93 (95%CI=0.91-0.95) and there was low heterogeneity between studies (I2 0%). The pooled PPV estimate for CDT with MCV (4 data points) was moderate: 0.74 (95%CI=0.51-0.97) and there was high heterogeneity between studies (I2 98%). The pooled NPV estimate for CDT with MCV (4 data points) was excellent: 0.92 (95%CI=0.83-1.00) and there was high heterogeneity between studies (I2 95%).

**Gamma-Glutamyl Transferase (GGT)**

There are no alpha and kappa coefficients associated with biomarkers such as GGT. The pooled sensitivity estimate for GGT (76 data points) was low: 0.57 (95%CI=0.50-0.64) and there was high heterogeneity between studies (I2 99%). The pooled specificity estimate for GGT (76 data points) was good: 0.83 (95%CI=0.78-0.86) and there was high heterogeneity between studies (I2 98%). The pooled PPV estimate for GGT (30 data points) was low: 0.43 (95%CI=0.35-0.51) and there was high heterogeneity between studies (I2 97%). The pooled NPV estimate for GGT (23 data points) was good: 0.82 (95%CI=0.70-0.94) and there was high heterogeneity between studies (I2 99%).

**Gamma-Glutamyl Transferase with Mean corpuscular volume (GGT with MCV)**

There are no alpha and kappa coefficients associated with biomarkers such as GGT and MCV. The pooled sensitivity estimate for GGT with MCV (10 data points) was moderate: 0.64 (95%CI=0.38-0.84) and there was high heterogeneity between studies (I2 99%). The pooled specificity estimate for GGT with MCV (10 data points) was good: 0.87 (95%CI=0.76-0.93) and there was high heterogeneity between studies (I2 97%). The pooled PPV estimate for GGT with MCV (6 data points) was low: 0.47 (95%CI=0.28-0.66) and there was high heterogeneity between studies (I2 98%). The pooled NPV estimate for GGT with MCV (6 data points) was good: 0.88 (95%CI=0.81-0.95) and there was high heterogeneity between studies (I2 94%).

**Ethyl glucuronide (EtG)**

There are no alpha and kappa coefficients associated with biomarkers such as EtG. The pooled sensitivity estimate for EtG (6 data points) was good: 0.83 (95%CI=0.61-0.94) and there was high heterogeneity between studies (I2 91%). The pooled specificity estimate for EtG (6 data points) was excellent: 0.95 (95%CI=0.90-0.98) and there was high heterogeneity between studies (I2 66%). The pooled PPV estimate for EtG (2 data points) was moderate: 0.61 (95%CI=0.39-0.84) and there was moderate heterogeneity between studies (I2 58%). The pooled NPV estimate for EtG (2 data points) was good: 0.86 (95%CI=0.78-0.94) and there was moderate heterogeneity between studies (I2 60%).

**Mean corpuscular volume (MCV)**

There are no alpha and kappa coefficients associated with biomarkers such as MCV. The pooled sensitivity estimate for MCV (55 data points) was low: 0.39 (95%CI=0.33-0.45) and there was high heterogeneity between studies (I2 97%). The pooled specificity estimate for MCV (55 data points) was excellent: 0.91 (95%CI=0.88-0.93) and there was high heterogeneity between studies (I2 98%). The pooled PPV estimate for MCV (28 data points) was low: 0.48 (95%CI=0.36-0.59) and there was high heterogeneity between studies (I2 98%). The pooled NPV estimate for MCV (22 data points) was moderate: 0.79 (95%CI=0.73-0.86) and there was high heterogeneity between studies (I2 99%).

**Percent Carbohydrate deficient transferrin (%CDT)**

The pooled sensitivity estimate for %CDT (40 data points) was low: 0.56 (95%CI=0.47-0.65) and there was high heterogeneity between studies (I2 98.2%). The pooled specificity estimate for %CDT (40 data points) was excellent: 0.91 (95%CI=0.88-0.94) and there was high heterogeneity between studies (I2 97%). The pooled PPV estimate for %CDT (13 data points) was low: 0.58 (95%CI=0.38-0.78) and there was high heterogeneity between studies (I2 98.5%). The pooled NPV estimate for %CDT (13 data points) was good: 0.85 (95%CI=0.78-0.92) and there was high heterogeneity between studies (I2 97.6%).

**Phosphatidylethanol (PEth)**

There are no alpha and kappa coefficients associated with biomarkers such as PEth. The pooled sensitivity estimate for PEth (7 data points) was good: 0.87 (95%CI=0.79-0.96) and there was high heterogeneity between studies (I2 94%). The pooled specificity estimate for PEth (4 data points) was excellent: 0.94 (95%CI=0.91-0.97) and there was moderate heterogeneity between studies (I2 31%). There was insufficient data to calculate the pooled estimate for PPV and NPV for PEth.