The design and methods of this protocol comply with the Centre for Reviews and Dissemination’s (CRD’s) Guidance for Undertaking Reviews in Healthcare, and the protocol is reported in line with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses Protocols (PRISMA-P). Eligibility criteria were defined using the PICOTS system.
(P) Population: adult patients diagnosed with Relapsing-Remitting Multiple Sclerosis (RRMS) or Secondary Progressive Multiple Sclerosis (SPMS), based on the McDonald criteria [2, 16-18]:
- Two attacks or symptom flare-ups (lasting at least 24 hours with 30 days between attacks), plus two lesions.
- Two attacks, one lesion, and evidence of dissemination in space (or a different attack in a different part of the nervous system).
- One attack, two lesions, and evidence of dissemination in time (or finding a new lesion — in the same location — since the previous scan, or presence of immunoglobulin, called oligoclonal bands in the spinal fluid).
- One attack, one lesion, and evidence of dissemination in space and time.
- Worsening of symptoms or lesions and dissemination in space found in two of the following: MRI of the brain, MRI of the spine, and spinal fluid.
- Relapsing-Remitting course: A multiple sclerosis course characterized by relapses with stable neurological disability between episodes.
- Progressive course: A multiple sclerosis course characterized by steadily increasing objectively documented neurological disability independent of relapses. Fluctuations, periods of stability, and superimposed relapses might occur. Primary progressive multiple sclerosis (a progressive course from disease onset) and secondary progressive multiple sclerosis (a progressive course following an initial relapsing-remitting course) are distinguished.
(I) Index (Prognostic factor): T1 hypointense (black hole) lesion mean volume (lesion load) on brain Magnetic Resonance Imaging (MRI)
(C) Comparator: not applicable
(O) Outcome: disability measured using the Expanded Disability Status Scale (EDSS)
(T) Timing: measured at the same time MRI was performed (or with a very close time interval between)
(S) Setting: any
The search will employ sensitive topic-based strategies designed for each database, with no restrictions on publication date, language, or geographical region. We will perform our search on the 10th of February, 2021. We will search the following databases:
- MEDLINE through PubMed
- Embase
- Science Citation Index – Expanded (Web of Science)
- Conference Proceedings Citation Index – Science (Web of Science)
The search strategies for all databases included in our study, namely MEDLINE (through PubMed), Embase, Science Citation Index – Expanded (Web of Science), and Conference Proceedings Citation Index – Science (Web of Science), are presented in Appendix A.
Records will be managed in EndNote version X9, a reference management software package.
Two reviewers (AV and MM) will independently screen the titles and abstracts of identified studies for inclusion. We will link publications from the same study to avoid including data from the same study more than once. If a study cannot be clearly excluded based on its title and abstract, its full text will be reviewed. A study will be included when both reviewers independently assess it as satisfying the inclusion criteria based on the full text. A third reviewer (MF) will act as arbitrator if disagreement persists after discussion. We will prepare a flow diagram of the number of studies identified and excluded at each stage, in accordance with the PRISMA flow diagram of study selection.
Data collection process
Using a standardized form, two reviewers (AV and MM) will extract the data independently. We will resolve any disagreements by discussion or, if required, by consultation with a third review author (MF). We will attempt to extract data presented only in graphs and figures whenever possible but will include such data only if two reviewers independently obtain the same result. If a study is multi-center, we will extract data for each center where possible. If data are missing, we will try to contact the original investigators to request the missing information. If that is unsuccessful, we will analyze only the available information and will not impute any missing data.
Data extracted will include the following summary data: sample characteristics, sample size, study methods, inclusion and exclusion criteria, MRI settings used, funding sources, declarations of interests, and results.
Outcomes and prioritization
Our main outcome of interest is the relationship between participants’ EDSS score and T1 hypointense lesion mean volume.
Risk of bias in individual studies
Two review authors (AV and MM) will assess the risk of bias of each included study. We will resolve any disagreements by consensus or by consultation with a third review author (MF). Because there is currently no standard tool for assessing the risk of bias in overall prognosis studies, we will use a tailored version of the Quality In Prognosis Studies (QUIPS) tool, presented in Appendix B. Our tailored version of the tool consists of six risk-of-bias domains:
- Study participation (five items): adequate description of the source population or population of interest, adequate description of the baseline study sample, adequate description of the sampling frame and recruitment, adequate description of the period and place of recruitment, and adequate description of inclusion and exclusion criteria.
- Study attrition (four items): description of attempts to collect information on participants who dropped out, reasons for loss to follow-up provided, adequate description of participants lost to follow-up, and no important differences between participants who completed the study and those who did not.
- Prognostic factor measurement (two items): provision of a clear definition or description of the prognostic factor, and reporting of continuous variables or use of appropriate cut points.
- Outcome measurement (three items): provision of a clear definition of the outcome, use of an adequately valid and reliable method of outcome measurement, and use of the same method and setting of outcome measurement in all study participants.
- Study confounding (seven items): measurement of all important confounders, provision of clear definitions of the important confounders measured, adequately valid and reliable measurement of all important confounders, use of the same method and setting of confounder measurement in all study participants, appropriate imputation methods used for missing confounders (if applicable), important potential confounders accounted for in the study design, and important potential confounders accounted for in the analysis.
- Statistical analysis and reporting (two items): sufficient presentation of data to assess the adequacy of the analytic strategy, and an adequate statistical model for the design of the study.
We will use R version 4 for our data synthesis. We expect correlation coefficients (r) to be our primary outcome measure. Most meta-analysts do not perform syntheses on the correlation coefficient itself because its variance depends strongly on the correlation. Rather, the correlation is converted to the Fisher’s z scale and all analyses are performed using the transformed values. For the meta-analysis of correlation data, we will first convert the correlation coefficients to the Fisher’s z scale. We will then calculate the variance and standard error of each Fisher’s z value and perform a random-effects meta-analysis on those values. Finally, we will convert the pooled Fisher’s z back to a correlation coefficient (r) for presentation.
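The transformation and pooling steps described above can be sketched as follows. This is a minimal illustration written in Python rather than R, chosen only to keep the example self-contained. The function names are hypothetical, and the DerSimonian-Laird estimator of the between-study variance is an assumption, since this protocol does not specify an estimator. The multiplier 1.96 gives an approximate 95% confidence interval.

```python
import math


def fisher_z(r):
    """Fisher's z transform of a correlation coefficient (-1 < r < 1)."""
    return 0.5 * math.log((1 + r) / (1 - r))


def pool_correlations(rs, ns):
    """Random-effects pooling of correlations on the Fisher's z scale.

    rs: study correlation coefficients; ns: study sample sizes (each > 3).
    The DerSimonian-Laird tau^2 estimator is an assumption of this sketch.
    """
    if len(rs) != len(ns) or len(rs) < 2:
        raise ValueError("need at least two studies with matching inputs")
    zs = [fisher_z(r) for r in rs]
    vs = [1.0 / (n - 3) for n in ns]      # sampling variance of Fisher's z
    w = [1.0 / v for v in vs]             # inverse-variance weights
    z_fixed = sum(wi * zi for wi, zi in zip(w, zs)) / sum(w)
    # Cochran's Q and the DerSimonian-Laird between-study variance
    q = sum(wi * (zi - z_fixed) ** 2 for wi, zi in zip(w, zs))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(zs) - 1)) / c)
    # Random-effects weights add tau^2 to each study's variance
    w_re = [1.0 / (v + tau2) for v in vs]
    z_re = sum(wi * zi for wi, zi in zip(w_re, zs)) / sum(w_re)
    se_re = math.sqrt(1.0 / sum(w_re))
    lo, hi = z_re - 1.96 * se_re, z_re + 1.96 * se_re
    # Back-transform to the correlation scale for presentation
    return {
        "r": math.tanh(z_re),
        "ci": (math.tanh(lo), math.tanh(hi)),
        "tau2": tau2,
        "z": z_re,
        "se_z": se_re,
    }
```

For example, `pool_correlations([0.42, 0.55, 0.31], [40, 65, 120])` (made-up inputs, not study data) returns the pooled r, its approximate confidence interval, and the estimated between-study variance.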
Assessment of heterogeneity
We expect some heterogeneity between studies because of ethnic and methodological diversity. We will report the range of effects from the random-effects meta-analyses using prediction intervals. In a random-effects meta-analysis, the prediction interval reflects the whole distribution of effects across study populations, including the effect expected in a future study [23, 24].
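For reference, a commonly used approximate 95% prediction interval, which assumes normally distributed true effects, is shown below. The notation is ours and is not taken from the cited sources: \hat{\mu} is the pooled Fisher's z, \hat{\tau}^{2} the estimated between-study variance, \mathrm{SE}(\hat{\mu}) the standard error of the pooled estimate, k the number of studies, and t_{k-2,\,0.975} the 97.5th percentile of the t distribution with k - 2 degrees of freedom.

```latex
\hat{\mu} \;\pm\; t_{k-2,\,0.975}\,\sqrt{\hat{\tau}^{2} + \mathrm{SE}(\hat{\mu})^{2}}
```

Under this form, the limits would be computed on the Fisher's z scale and then back-transformed to r with the hyperbolic tangent. The interval requires at least three studies.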
If at least 5 studies are available for each classification of MS in our study (RRMS and SPMS), we will perform a subgroup analysis for each of them.
We plan to perform sensitivity analyses to explore the influence of the following factors:
- Studies at high or unclear risk of bias
- Very long or large studies to establish the extent to which they dominate the results.
To evaluate the risk of reporting bias across studies, we will conduct a test for funnel plot asymmetry. This test examines whether the association between estimated effect size and study size is greater than would be expected by chance. Funnel plots will also be generated for visual inspection of potential publication bias. In the presence of publication bias, the plot will be symmetrical at the top, and data points will increasingly be missing from the middle to the bottom parts of the plot.
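One widely used test of funnel plot asymmetry is Egger's regression, in which each study's standardized effect (effect divided by its standard error) is regressed on its precision (one divided by the standard error), and an intercept far from zero suggests asymmetry. The protocol does not name a specific test, so the choice of Egger's regression here is an assumption, and the function name is hypothetical. The sketch below, in Python, returns the intercept, its standard error, the t statistic, and the degrees of freedom; the p-value would be read from a t distribution with those degrees of freedom.

```python
import math


def egger_intercept(effects, ses):
    """Egger's regression: OLS of (effect / SE) on (1 / SE).

    effects: study effects (e.g. Fisher's z values); ses: their standard errors.
    Returns (intercept, se_intercept, t_statistic, degrees_of_freedom).
    """
    k = len(effects)
    if k != len(ses) or k < 3:
        raise ValueError("need at least three studies with matching inputs")
    y = [e / s for e, s in zip(effects, ses)]   # standardized effects
    x = [1.0 / s for s in ses]                  # precisions
    mx, my = sum(x) / k, sum(y) / k
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("all studies have the same standard error")
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    intercept = my - slope * mx
    # Residual variance with k - 2 degrees of freedom
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    s2 = rss / (k - 2)
    se_int = math.sqrt(s2 * (1.0 / k + mx ** 2 / sxx))
    # t is undefined when the fit is exact (zero residual variance)
    t_stat = intercept / se_int if se_int > 0 else float("nan")
    return intercept, se_int, t_stat, k - 2
```

With Fisher's z values, the standard error of each study is 1 / sqrt(n - 3), so the inputs can be derived directly from the extracted correlations and sample sizes.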
Confidence in cumulative evidence
The strength of the overall body of evidence will be assessed using an adapted version of the Grading of Recommendations, Assessment, Development and Evaluation (GRADE) framework for prognostic factor research, which takes into account seven criteria: study limitations, inconsistency, indirectness, imprecision, publication bias, moderate/large effect size, and dose effect. Two review authors (AV and MM) will rate the certainty of the evidence for the outcome as 'high', 'moderate', 'low', or 'very low'. We will resolve any discrepancies by consensus or, if needed, by arbitration by a third review author (MF).