This review follows the standard methodology for systematic reviews, as reported in a previous study and in the University of York Centre for Reviews and Dissemination guidelines [17,18].
Search strategy
A mixed search strategy involving both electronic database and manual searching [19, 20] was adopted in this study. The search strategy was designed with the help of an IHR specialist librarian (DA), and relevant guidance was drawn from "Systematic Reviews: Centre for Reviews and Dissemination guidance for undertaking reviews in health care" and "Systematic Reviews to Support Evidence-Based Medicine" [17, 18] to identify any relevant studies of metabolic syndrome risk models and scores. The final search strategy was implemented by MI and double-checked by DP, GR and YP. The final search was conducted on 21 September 2018.
The literature was searched using keywords including: predict, screen, risk, score, metabolic syndrome, insulin resistance syndrome, model, regression, risk assessment, risk factor, calculator, analysis, sensitivity and specificity, ROC and odds ratio. Both MeSH terms and text words were used. Articles were searched by title and abstract; the search was limited to studies published in English, but no date restriction was applied.
The details of the search strategies used can be found in supplementary material 1.0.
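As a purely illustrative sketch of how the keywords above combine with MeSH terms (the actual strategies are those in supplementary material 1.0, and this fragment is a hypothetical Ovid MEDLINE-style example, not taken from them):

```
1. exp Metabolic Syndrome/
2. (metabolic syndrome or insulin resistance syndrome).ti,ab.
3. 1 or 2
4. (predict* or screen* or risk score* or risk assessment or risk factor*
   or calculator or model* or regression).ti,ab.
5. exp Risk Assessment/
6. 4 or 5
7. ("sensitivity and specificity" or ROC or odds ratio).ti,ab.
8. 3 and 6 and 7
9. limit 8 to english language
```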
The literature search was conducted in the following databases: MEDLINE, CINAHL, Web of Science and PubMed.
Eligibility criteria
We included peer-reviewed studies that combined two or more known risk factors to derive a metabolic syndrome risk model or score, validated a pre-existing model or score on a different population, or did both. The main outcome of this review is metabolic syndrome, and the secondary outcomes are any related predictive outcomes (including discrimination and calibration). Finally, this review only includes studies published in English.
We excluded studies on screening and early detection, genetic mutation models, studies conducted on animals, studies investigating one or more single risk factors that were not combined to build a model or score, and studies that applied another disease's model or score to predict MetS. We also excluded studies whose main outcome was not metabolic syndrome, studies that did not report any related predictive outcomes (either discrimination or calibration), and studies conducted in languages other than English.
Selection of studies
A total of 16,821 titles were transferred into the electronic reference software EndNote version 8 (endnote.com), and duplicates were removed automatically, leaving 15,222 titles. The duplicates that EndNote did not remove automatically were removed manually, leaving 15,102 titles.
All 15,102 titles were scanned by MI; where a title was suspected to represent a paper meeting the inclusion/exclusion criteria, the full abstract was reviewed.
Title scanning and abstract review were completed in November 2018. A total of 66 titles were marked as potentially meeting the inclusion criteria; ten (10) of these were double-checked by DP.
The full-paper review was conducted by applying the inclusion/exclusion criteria to the retrieved articles. At this stage, studies were excluded for the following reasons: predicting genetics (1); investigating one or more single risk factors not combined to build a model or score (20); using unconventional predictors (alternative medicine) (1); applying another disease's model or score (CVD, T2DM) to predict MetS (4); main outcome not metabolic syndrome (8); no related predictive outcomes reported (either discrimination or calibration) (7); and conducted in languages other than English (2). This reduced the number of full papers to 23.
Full papers from other sources
In order to identify further relevant articles, a manual search of the reference lists of all the selected articles was conducted. Furthermore, relevant "grey literature" was searched for in the following sources: The Grey Literature Report (www.greylit.org/), OpenGrey (www.opengrey.eu/) and OAISTER (www.oaister.org/).
From the above, three further papers were added from the initial scoping search and three from the reference lists of the included papers; the grey literature search yielded no results. This brought the total number of articles selected for data extraction to 29.
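The selection flow described above can be checked arithmetically. The sketch below simply reproduces the counts stated in the text (all variable names are illustrative):

```python
# Counts from the screening and selection process described in the text.
initial_titles = 16821
after_auto_dedup = 15222     # after EndNote's automatic de-duplication
after_manual_dedup = 15102   # after manual removal of remaining duplicates

potentially_eligible = 66    # titles marked at title/abstract screening

# Full-text exclusions, with the number excluded per reason
exclusions = {
    "predicting genetics": 1,
    "single risk factors not combined into a model or score": 20,
    "unconventional predictors (alternative medicine)": 1,
    "other disease model or score applied to MetS (CVD, T2DM)": 4,
    "main outcome not metabolic syndrome": 8,
    "no related predictive outcome reported": 7,
    "language other than English": 2,
}

after_full_text = potentially_eligible - sum(exclusions.values())
assert after_full_text == 23

# Three papers from the initial scoping search, three from reference lists
selected_for_extraction = after_full_text + 3 + 3
assert selected_for_extraction == 29

# Five further studies were excluded during data extraction
final_sample = selected_for_extraction - 5
print(final_sample)  # 24
```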
The selection process is summarised using the PRISMA flow diagram (PRISMA 2009) (see Figure 1.0).
Data extraction
Data extraction was conducted using a standard form adapted from a similar study [21] and saved in Microsoft Excel 2016. The extracted data covered the variables relevant to the review question and satisfying the conditions for the narrative synthesis conducted. Notably, some of the studies presented several models, each composed of different risk factors, and it was beyond the scope of this review to examine each of those models in detail. Moreover, the study authors themselves often concluded that one of their reported models clearly outperformed the others. Therefore, where this was the case, data were extracted from the authors' preferred model(s) or, if no clear preference was stated in the article, from the model judged to be the more detailed or statistically robust. During data extraction, a total of five studies were excluded, leaving 24 articles.
The primary data extraction was conducted by MI and double-checked by DP, GR and YP; discrepancies were resolved by discussion.
Assessment of methodological quality
PROBAST (Prediction model Risk Of Bias ASsessment Tool) [11], a tool for assessing the risk of bias and applicability of prediction model studies, was used to assess the quality (risk of bias and applicability) of the included studies. Briefly, PROBAST is a recently developed tool for assessing the quality of primary studies included in a systematic review. It evaluates both the risk of bias and the applicability of studies that develop, validate or update a multivariable prediction model (diagnostic or prognostic).
Furthermore, PROBAST comprises four domains covering 20 signalling questions to enable the assessment of risk of bias and applicability. These domains concern the participants (e.g. the study design used and whether appropriate inclusion/exclusion criteria were applied), the predictors used, the outcome, and how the analysis was conducted.
Aside from its specific purpose of appraising studies in systematic reviews of prediction models, PROBAST can also be used for the general critical appraisal of primary prediction model studies. Notably, PROBAST is not meant to generate a summary "quality score", owing to the documented drawbacks of such scores [22]. Instead, users should discuss the effect of the problems observed within each domain [21].
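As a rough sketch of how domain-level ratings can be combined into an overall judgement, the function below applies a simplified "worst domain wins" rule. The full PROBAST guidance contains further nuances (for example around development-only studies without external validation), and all names here are illustrative, not part of the tool itself:

```python
# The four PROBAST risk-of-bias domains.
ROB_DOMAINS = ("participants", "predictors", "outcome", "analysis")

def overall_rob(ratings: dict) -> str:
    """Derive an overall risk-of-bias judgement from domain ratings.

    `ratings` maps each domain to 'low' (+), 'high' (-) or 'unclear' (?).
    Simplified rule: any high-risk domain makes the overall judgement
    high; all-low gives low; anything else is unclear.
    """
    values = [ratings[d] for d in ROB_DOMAINS]
    if any(v == "high" for v in values):
        return "high"
    if all(v == "low" for v in values):
        return "low"
    return "unclear"

# Example: the domain ratings reported for Soldatovic et al. 2016 in Table 1.0
print(overall_rob({"participants": "high", "predictors": "unclear",
                   "outcome": "high", "analysis": "high"}))  # high
```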
Table 1.0
Quality assessment of the included studies based on PROBAST
| Study | ROB: Participants | ROB: Predictors | ROB: Outcome | ROB: Analysis | Applicability: Participants | Applicability: Predictors | Applicability: Outcome | Overall ROB | Overall Applicability |
|---|---|---|---|---|---|---|---|---|---|
| Soldatovic et al. 2016 [24] | - | ? | - | - | + | ? | + | - | ? |
| Graziano et al. 2015 [25] | - | ? | - | - | + | ? | + | - | ? |
| Durado villa et al. 2015 [26] | - | ? | ? | ? | + | + | + | - | ? |
| Okosun et al. 2010 [27] | - | ? | ? | - | + | + | ? | - | ? |
| Pandit et al. 2011 [28] | - | ? | - | - | + | ? | + | - | ? |
| Hosseini et al. 2014 [29] | - | ? | ? | - | + | + | ? | - | ? |
| Shafiee et al. 2013 [30] | - | ? | - | - | + | ? | + | - | ? |
| Wiley and Carrington 2016 [31] | - | ? | ? | - | + | + | ? | - | ? |
| Tan et al. 2016 [32] | ? | + | ? | - | + | ? | ? | - | ? |
| Zhang et al. 2015 [33] | ? | ? | ? | - | + | + | ? | - | ? |
| Steinberg et al. 2014 [34] | - | ? | - | - | + | + | + | - | + |
| Obokata et al. 2015 [35] | + | + | ? | - | + | + | ? | - | ? |
| Je et al. 2017 [36] | - | ? | - | - | + | + | ? | - | ? |
| Misra et al. 2008 [37] | - | ? | - | - | + | - | ? | - | ? |
| Wilkerson et al. 2010 [38] | + | + | - | - | + | + | ? | - | ? |
| Efstathiou et al. 2012 [39] | + | ? | ? | - | + | + | + | - | + |
| Eisenmann et al. 2010 [40] | - | ? | - | - | + | ? | + | - | ? |
| Zou et al. 2018 [41] | + | - | - | - | + | + | + | - | + |
| Liu et al. 2014 [42] | - | ? | - | - | + | ? | + | - | ? |
| Kakudi et al. 2017 [43] | + | ? | ? | - | + | + | + | - | + |
| Sancar & Tinzali 2016 [44] | - | ? | - | ? | + | ? | + | - | ? |
| Andersen et al. 2015 [45] | + | + | ? | ? | + | + | ? | + | ? |
| Shi et al. 2015 [46] | + | + | - | ? | + | + | ? | - | ? |
| Hsiao et al. 2009 [47] | ? | ? | - | - | + | + | + | - | ? |

*ROB = risk of bias. (+) low ROB/low concern regarding applicability; (-) high ROB/high concern regarding applicability; (?) unclear ROB/unclear concern regarding applicability.
Table 1.0 above provides a summary of the quality assessment of the included studies.
In summary, the quality assessment revealed a moderate-to-high risk of bias across the included studies, primarily due to the use of inappropriate study designs and the absence of external validation. On closer examination, the majority of the models suffered from a high risk of bias and significant methodological deficiencies arising from a poor choice of model analyses, significantly underpowered analyses, dichotomisation of continuous variables, lack of adjustment for optimism, poor handling of missing data and overall poor model presentation.
Prioritising/ ranking models or risk scores
The number of papers and risk models or scores included in the final sample of this review is relatively high. Therefore, for clarity, it was decided to highlight the risk models or scores with the most potential to be useful to end users, i.e. practitioners, policymakers or laypersons. For any prediction model or risk score to be considered useful, it should be accurate (statistically significant calibration, and discrimination above 0.70), generalisable (externally validated by a separate research team on a different population) and usable (having few components that are commonly measured in practical settings) [23]. However, the MetS prediction field is arguably still in its early phase of development, making it difficult to identify any model or score that fulfils all of the above criteria. Hence, to prioritise risk models or scores in this study, we developed pragmatic criteria by modifying those set by Altman et al. [23]. A similar approach was used by Nobel et al. [20].
Studies were favoured if they used prospective/cohort data to develop their model, reported discrimination above 0.70 and/or calibration, and had few components that are commonly used in practical settings. The three prioritised risk models or scores are summarised in an easily accessible table (see Table 2.0 below).
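One plausible reading of these pragmatic criteria as a filter is sketched below. The field names, the example models and the component-count cut-off are illustrative assumptions, not part of the review; only the prospective/cohort requirement and the 0.70 discrimination threshold come from the text:

```python
def prioritise(models):
    """Keep models meeting the pragmatic criteria described above:
    prospective/cohort development data, discrimination (AUROC) above
    0.70 and/or reported calibration, and few components.
    (The cut-off of 10 components is an illustrative assumption.)"""
    return [m for m in models
            if m["design"] in ("prospective", "cohort")
            and (m["auroc"] > 0.70 or m["calibration_reported"])
            and m["n_components"] <= 10]

# Hypothetical candidates, purely for illustration
candidates = [
    {"name": "Model A", "design": "cohort", "auroc": 0.80,
     "calibration_reported": True, "n_components": 9},
    {"name": "Model B", "design": "cross-sectional", "auroc": 0.85,
     "calibration_reported": True, "n_components": 3},
]

print([m["name"] for m in prioritise(candidates)])  # ['Model A']
```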
Table 2.0
Components of three MetS risk models or scores with potential for adaptation for use in routine practice
| Score/study: Country, Author/Year | Components of the risk score | AUROC | Calibration | External validation: Year, Country | External validation: AUROC | External validation: Calibration |
|---|---|---|---|---|---|---|
| Greece, Efstathiou et al. 2012 [39] | Birth weight, birth head circumference, parental overweight/obesity | 0.97 | 6.64 | 2012, Greece | 0.97 | Not stated |
| Japan, Obokata et al. 2015 [35] | WC, TG, HDL, GLU, BP, age (>47 yrs), uric acid, female gender, gamma-GT | 0.80 | 0.12d, 0.48v | Not externally validated | | |
| Taiwan, Hsiao et al. 2009 [47] | TG, WC | 0.76 | 0.204 | Not externally validated | | |

BP = blood pressure, HDL = high-density lipoprotein cholesterol, WC = waist circumference, TG/TAG = triglyceride, FBG = fasting blood glucose.