Impact of white-matter mask selection on DTI histogram-based metrics as potential biomarkers in cerebral small vessel disease

Purpose Histogram-based metrics extracted from diffusion-tensor imaging (DTI) have been suggested as potential biomarkers for cerebral small vessel disease (SVD), but methods and results have varied across studies. This work aims to assess the impact of mask selection for extracting histogram-based metrics of fractional anisotropy (FA) and mean diffusivity (MD) on their sensitivity as SVD biomarkers. Methods DTI data were collected from 17 SVD patients and 12 healthy controls. For each participant, FA and MD maps were estimated; from these, histograms were computed on two alternative whole-brain white-matter masks: normal-appearing white matter (NAWM) and mean FA tract skeleton (TBSS). Histogram-based metrics (median, peak height, peak width, peak value) were extracted from the FA and MD maps. These were compared between patients and controls, and correlated with the patients' cognitive scores (executive function and processing speed). Results White-matter mask selection significantly impacted FA and MD histogram metrics and affected their ability to discriminate between groups. Moreover, we observed that the mask can influence the correlations with cognitive measures. Nevertheless, the MD peak height and MD peak width metrics remained significantly correlated with executive function, regardless of the mask. Conclusion Our results corroborate previous reports and further support the value of DTI histogram-based metrics as SVD biomarkers. However, they also highlight the importance of the processing methodology, in particular the choice of white-matter mask, and hence the urgent need to mitigate the lack of standardized MRI data-processing pipelines.
with 1.7×1.7×5.2 mm³ resolution, 3 repetitions of gradients along 20 directions with b = 1000 s/mm² and b = 0 s/mm² image, acquisition time (TA) = 5:18 min; (ii) Magnetization-Prepared Rapid Gradient-Echo (MPRAGE) T1-weighted imaging (T1WI), with TR/TE/TI = 2250/2.26/900 ms, flip angle = 9°, and 144 contiguous slices with 1 mm³ isotropic resolution, TA = 9:56 min; (iii) Fluid-attenuated inversion recovery (FLAIR) T2-weighted imaging (T2WI), with TR/TE/TI = 8500/97/2500 ms, and 45 contiguous slices with 0.9×0.7×3 mm³ resolution, TA = 4:17 min; and (iv) Susceptibility-weighted imaging (SWI) using a gradient-echo sequence, with TR/TE = 28/20 ms, flip angle = 15°, 96 slices with 0.7×0.6×1.4 mm³ resolution and TA = 4:49 min.


Introduction
Cerebral small vessel disease (SVD) is associated with progressive cognitive impairment and is one of the leading causes of dementia among the elderly. SVD is associated with the pathological mechanisms affecting the small vessels of the brain (i.e. 5µm-2mm in diameter) [1]. Several magnetic resonance imaging (MRI) modalities have been used to identify neuroimaging features that may provide potential disease biomarkers, useful for early diagnosis, monitoring disease progression, and evaluation of new therapies [2]. Biomarkers based on conventional structural MRI have been extensively studied [3][4][5][6].
Unfortunately, most of these features are only detectable at advanced stages of the disease and correlate only poorly with cognitive performance [2], which is typically dominated by deficits in executive function and processing speed [7].
Consequently, other MRI techniques targeting earlier pathological changes have been explored. Remarkably, diffusion-weighted imaging (DWI) methods have been found to be sensitive to white matter (WM) microstructural abnormalities in SVD patients, not only in areas already affected by lesions, but also in the normal-appearing white matter (NAWM) [8]. In particular, fractional anisotropy (FA) and mean diffusivity (MD) parameters derived from diffusion-tensor imaging (DTI) have been most commonly used. Furthermore, histogram analysis has been preferably employed in SVD studies to better describe the diffuse nature of white matter pathology in this disease [9]. Histogram analysis enables the assessment of microstructural abnormalities distributed across the whole brain, in contrast to region-of-interest (ROI) analyses, which provide measures of localized changes in a given area or structure.
Moreover, a previous study has shown that histogram measures are more reproducible [10]. However, the choice of the WM mask used to compute the histograms has differed substantially between studies. The majority of prior research has employed WM/NAWM masks or tract-based spatial statistics (TBSS) WM skeleton masks. While different results are expected when these distinct masks are used, the impact of these differences on the detection of pathology is not easily predictable. Most critically, this methodological variability may at least partially explain the inconsistencies between studies regarding the best predictor of cognitive impairment [7], given that the choice of WM mask can influence the correlations with cognitive measures.
For example, previous literature suggested the MD peak height computed over the WM mask as a sensitive biomarker to predict progressive decline in executive function in a sample of sporadic SVD (sSVD) patients [11,12]. In contrast, Nitkunan et al. considered a mask including all visible lesions as well as voxels for which the probability of being grey matter (GM) or WM was higher than 0.5, and reported that the FA peak height was more strongly associated with executive function than other FA or MD histogram-based metrics [13]. Moreover, Croall et al. reported the strongest predictor of executive dysfunction to be the FA median when testing both WM and NAWM masks [14]. On the other hand, Baykara et al. [10] reported that the strongest predictor of processing speed was the MD peak width evaluated on a TBSS skeleton built to compare across different SVD patient groups [15]. Conversely, Lawrence et al. demonstrated that the MD peak height obtained over the NAWM voxels was strongly associated with processing speed impairment [16]. Nonetheless, in recent work, the MD peak width has been utilized as a reference measure to validate the predictive power of novel biomarkers due to its well-established value in detecting microscopic tissue changes in a broad range of pathological WM conditions [10,[17][18][19][20]; as a result, retrospective validation in existing DWI datasets may help to establish a consistent processing pipeline for future longitudinal studies. Our study aimed to assess the impact of using different masks for computing DTI histogram-based metrics on their sensitivity as SVD biomarkers.
For this purpose: 1) we extracted different DTI histogram-based metrics using two distinct masks: NAWM versus TBSS; 2) we evaluated their sensitivity to discriminate between patients and healthy controls; and 3) we performed a correlation analysis between a selection of DTI metrics that have been most frequently considered in SVD studies (i.e., FA median, FA peak height, MD peak height, and MD peak width) and the cognitive domains more typically impaired in SVD patients (i.e., processing speed and executive function).

Study Population
The study was approved by the local Ethics Committee, and all subjects gave written informed consent in accordance with the Declaration of Helsinki.
A total of N = 17 patients were recruited, comprising two different groups of SVD patients: Cerebral Autosomal Dominant Arteriopathy With Subcortical Infarcts and Leukoencephalopathy (CADASIL) patients (N = 6) and sporadic SVD (sSVD) patients (N = 11). A healthy control (HC) group was also recruited, consisting of N = 12 age-matched healthy volunteers with no known relevant medical history, namely SVD or dementia. The patients' inclusion criteria were: 1) independence in daily activities, as assessed by the Instrumental Activities of Daily Living (IADL) scale; absence of hemodynamically significant large vessel disease, as evaluated by Doppler ultrasound (carotid and vertebral); 2) for the sSVD group: evidence of SVD in MRI (deep WMH lesions, with no other plausible explanation, with moderate and severe degrees, II/III or III/III, according to the Fazekas scale [21]); 3) for the CADASIL group: symptomatic patients with evidence of WMH lesions and molecular diagnosis confirmed by mutation on the NOTCH3 gene. Exclusion criteria were: 1) contraindications for MRI; 2) for the patients: evidence of WMH lesions from other known aetiology based on clinical and laboratory findings; presence of concurrent chronic incapacitating disorders; stroke in the past three months; illiteracy; and compromised visual acuity. All patients were examined by an experienced neurologist in order to exclude other neurological or neuropsychiatric diseases and to perform neurological examinations. Demographic data and cerebrovascular risk factors (including history of previous stroke, hypertension, blood pressure, hypercholesterolaemia, diabetes and smoking) were recorded. The cohort demographics and clinical data are presented in Table S1 (supplementary material). At another session, cognitive functions were evaluated by an experienced neuropsychologist. Finally, all participants underwent MRI scanning, according to a standardized protocol.

Neuropsychological Evaluation
To evaluate cognitive functions, patients underwent a comprehensive battery of neuropsychological tests as close to the MRI scanning as possible (within 15 days). One patient was excluded from the statistical analysis because the neuropsychological evaluation could not be performed. Global functioning was assessed by the Montreal Cognitive Assessment (MoCA) Test, and specific cognitive domains of interest (processing speed and executive function) were evaluated by the Stroop test and the Trail Making Test (TMT). We considered either specific scores of the appropriate cognitive tests or composite scores, as shown in Table S2 (supplementary material). For each test, we generated a cognitive z-score; we converted the Stroop Test percentile scores into z-scores considering the available normative data for the studied population [22]. In the case of the TMT, we used a publicly available database to calculate the z-scores directly [23]. Then, composite scores were generated based on previous reports to represent: (1) executive function: average of the z-scores of Stroop Test interference and TMT subtracting part A from part B (TMTB-A) [24,25]; and (2) processing speed: average of the z-scores of TMT part A and part B [10].
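The composite scores described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes that a percentile is mapped to a z-score through the inverse standard normal distribution (the text does not state the exact conversion), that all z-scores are already oriented so that lower means worse performance, and it uses made-up values for a hypothetical patient. The function names `percentile_to_z` and `composite_score` are illustrative.

```python
from statistics import NormalDist, mean


def percentile_to_z(percentile: float) -> float:
    """Map a percentile in (0, 100) to a z-score.

    Assumption: the conversion uses the inverse CDF of the standard normal.
    """
    if not 0.0 < percentile < 100.0:
        raise ValueError("percentile must lie strictly between 0 and 100")
    return NormalDist().inv_cdf(percentile / 100.0)


def composite_score(z_scores):
    """Average several test z-scores into one composite score."""
    return mean(z_scores)


# Hypothetical values for a single patient (not study data).
stroop_z = percentile_to_z(16.0)   # about one standard deviation below the norm
tmt_b_minus_a_z = -0.5
tmt_a_z, tmt_b_z = -1.2, -0.8

# Executive function: Stroop interference and TMTB-A.
executive_function = composite_score([stroop_z, tmt_b_minus_a_z])
# Processing speed: TMT part A and part B.
processing_speed = composite_score([tmt_a_z, tmt_b_z])
```

Averaging z-scores gives each test equal weight in the composite, which is the simplest reading of "average of the z-scores" in the text.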

MRI Acquisition
Whole-brain images were acquired on a 3T Siemens Verio scanner using a 12-channel radio-frequency receive head coil including: (i) DWI-Echo-Planar Imaging (EPI), with TR/TE = 4800/107 ms, 25 contiguous
The main steps of the processing pipeline (see details below) followed in this work for the extraction of both conventional structural imaging metrics and DTI histogram-based metrics are depicted in Fig. 1.

Conventional structural imaging
For comparison with DTI-derived metrics, the following structural measures were obtained from conventional structural imaging. Normalised brain volume (NBV), which is expected to provide a measure of brain atrophy, was automatically estimated from MPRAGE-T1WI with FSL's SIENAX [26], using the white-matter hyperintensities (WMH) lesion masks as an input (details below) to minimize GM misclassification. The total brain volume was considered as the sum of GM and WM volumes and was normalised by the intracranial volume to obtain NBV. Additionally, brain tissue masks (GM, WM and cerebrospinal fluid (CSF)) were obtained by segmentation of the MPRAGE image with FSL's FAST [27]. Segmentation of subcortical structures was also performed with FSL's FIRST [28], and their spurious contributions were removed to obtain a corrected WM mask.
WMH lesions, cerebral microbleeds and lacunes were identified by an experienced neuroradiologist. WMH lesions were manually segmented considering regions of increased signal on FLAIR-T2WI (using fslview from version 5.0.9 of FSL). Normalised WMH lesion load (NWMHLL) was then estimated as the percentage of WMH lesion volume relative to the whole-brain NBV. We transformed the lesion masks to MPRAGE space by performing linear registrations between the two types of images using FSL's FLIRT [29,30]; linear interpolation was used in the transformation and, in order to obtain a binary mask, the registered image was thresholded at 0.5 and binarized. The number of cerebral microbleeds (nCMB) was obtained according to the Microbleed Anatomical Rating Scale (MARS) criterion [31] based on SWI. The number of lacunes (nLac) was obtained by quantifying the areas of tissue loss with cavitation on MPRAGE-T1WI with 0.3-1.5 cm of diameter [32].
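To illustrate the re-binarization step and the lesion-load percentage, the sketch below works on a flat list of voxel values rather than on real images. It is only an assumption-laden example: the helper names are hypothetical, the choice of keeping values greater than or equal to the threshold is an assumption, and the reference volume is passed in as a plain number because the text does not detail how the normalisation volume is expressed.

```python
def binarize(interpolated_values, threshold=0.5):
    """Turn interpolated mask values in [0, 1] back into a 0/1 mask.

    After linear interpolation a binary mask contains fractional values at
    its borders; voxels at or above the threshold are kept (assumption).
    """
    return [1 if v >= threshold else 0 for v in interpolated_values]


def lesion_load_percent(mask, voxel_volume_mm3, reference_volume_mm3):
    """Lesion volume expressed as a percentage of a reference volume."""
    if reference_volume_mm3 <= 0:
        raise ValueError("reference volume must be positive")
    lesion_volume_mm3 = sum(mask) * voxel_volume_mm3
    return 100.0 * lesion_volume_mm3 / reference_volume_mm3


# Toy example: four voxels of a registered WMH mask (hypothetical values).
registered = [0.0, 0.2, 0.5, 0.9]
wmh_mask = binarize(registered)
load = lesion_load_percent(wmh_mask, voxel_volume_mm3=1.0,
                           reference_volume_mm3=200.0)
```

Thresholding at 0.5 keeps the mask volume approximately unchanged after resampling, since border voxels are retained only when at least half of their interpolated weight comes from the lesion.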

Diffusion-tensor imaging
Firstly, DWI data were corrected for eddy-current distortions and motion with FSL's eddy tool [33] using the outlier replacement option (repol [34]). To obtain FA and MD maps from the corrected DWI data, we performed tensor fitting using FSL's dtifit tool [35,36]. Then, we evaluated two alternative WM masks for deriving the histogram-based measures of FA and MD: TBSS versus NAWM.
To generate the TBSS mask, we applied the TBSS pipeline from FSL [15] to both the FA and MD maps by first eroding the FA maps to remove outlier contributions from tensor fitting. Then, FA maps were non-linearly aligned to an FA template (FMRIB58_FA_1mm) in the MNI space [37] using linear interpolation (ANTs tools, http://stnava.github.io/ANTs/). Subsequently, from the FA maps, we derived a mask representing the most common WM tracts across the subject group (the mean WM skeleton), which was thresholded at 0.3. This threshold was chosen after evaluating two potential values: 0.2 (FSL's default) and 0.3 [10,38]. MD values were also projected onto the previously derived mean WM skeleton by applying an adapted version of FSL's tbss_non_FA script (as previously indicated, non-linear registration was also performed using ANTs tools).
To define the NAWM mask, we subtracted the WMH mask registered into the MPRAGE space from the total corrected WM mask. Then, the NAWM mask was transformed into the DWI native space (where the FA and MD maps are originally computed). For this purpose, an affine transformation between the DWI (b = 0 image) and MPRAGE space was estimated using FSL's FLIRT, and subsequently used to transform the NAWM mask (MPRAGE space), using linear interpolation.
Normalized histograms (divided by the total number of voxels and bin width) of both FA and MD maps were computed in R (www.r-project.org). The FA range was defined between 0 and 1. For the MD range, we considered as the upper limit the maximum value of the 99th percentile across subjects, intending to exclude CSF voxel contributions (using this criterion, the MD upper limit was 0.0025 mm² s⁻¹). The optimal number of bins was estimated using the Freedman-Diaconis rule [39], for each mask. To enable comparisons with the literature, the metrics more frequently extracted from both the FA and MD histograms were selected based on previous reports published between 2008 and 2018 [8]: median, peak height, peak width (difference between the 5th and 95th percentiles) and peak value.
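The mask subtraction can be sketched on toy data as below. This is an illustrative assumption rather than the authors' implementation: masks are represented as flat lists of 0/1 values in the same space, and the subtraction is written as a logical "WM and not WMH" so that lesion voxels lying outside the WM mask cannot produce negative values, which a plain arithmetic difference would. The function name `nawm_mask` is hypothetical.

```python
def nawm_mask(wm_mask, wmh_mask):
    """Normal-appearing WM: voxels inside the WM mask and outside the WMH mask."""
    if len(wm_mask) != len(wmh_mask):
        raise ValueError("masks must have the same number of voxels")
    return [1 if wm == 1 and wmh == 0 else 0
            for wm, wmh in zip(wm_mask, wmh_mask)]


# Toy example: the third voxel is a lesion voxel outside the WM mask.
wm = [1, 1, 0, 0]
wmh = [0, 1, 1, 0]
nawm = nawm_mask(wm, wmh)
```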
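The histogram metrics defined above can be sketched as follows. The original analysis was carried out in R; this Python version is only an illustration under stated assumptions: the Freedman-Diaconis bin width (2·IQR·n^(-1/3)) is computed from the voxel values passed in, whereas the text estimates the number of bins per mask; percentiles use linear interpolation; and the peak value is taken as the centre of the tallest bin. The function name `histogram_metrics` and the synthetic FA values are hypothetical.

```python
import math
import random
from statistics import median, quantiles


def histogram_metrics(values, lower, upper):
    """Median, peak height, peak value and peak width of a normalized histogram."""
    data = [v for v in values if lower <= v <= upper]
    n = len(data)
    if n < 2:
        raise ValueError("need at least two values inside the range")
    # pct[k - 1] is the k-th percentile (linear interpolation).
    pct = quantiles(data, n=100, method="inclusive")
    iqr = pct[74] - pct[24]
    if iqr <= 0:
        raise ValueError("interquartile range must be positive")
    # Freedman-Diaconis rule: bin width = 2 * IQR * n^(-1/3).
    fd_width = 2.0 * iqr / n ** (1.0 / 3.0)
    n_bins = max(1, math.ceil((upper - lower) / fd_width))
    bin_width = (upper - lower) / n_bins
    counts = [0] * n_bins
    for v in data:
        counts[min(int((v - lower) / bin_width), n_bins - 1)] += 1
    # Normalize by the number of voxels and by the bin width.
    density = [c / (n * bin_width) for c in counts]
    peak_bin = max(range(n_bins), key=density.__getitem__)
    return {
        "median": median(data),
        "peak_height": density[peak_bin],
        "peak_value": lower + (peak_bin + 0.5) * bin_width,
        "peak_width": pct[94] - pct[4],   # 95th minus 5th percentile
        "n_bins": n_bins,
        "bin_width": bin_width,
    }


# Synthetic FA-like values inside a mask (not study data).
random.seed(0)
fa_values = [min(max(random.gauss(0.45, 0.12), 0.0), 1.0) for _ in range(5000)]
fa_metrics = histogram_metrics(fa_values, lower=0.0, upper=1.0)
```

Because the histogram is normalized by both voxel count and bin width, the peak height depends on the bin width, so comparing peak heights across masks or studies presupposes a shared binning scheme.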

Statistical Analysis
Statistical analysis was performed in R (www.r-project.org). All metrics were first transformed to z-scores. Clinical data and neuroimaging features extracted from conventional structural MRI were compared between groups using independent t-tests, Mann-Whitney tests, or chi-square tests, as appropriate. For the analysis of the patients' cognitive profile, we performed a Wilcoxon signed-rank test between each neuropsychological score and the reference performance expected for the healthy population (z-score = 0). The impact of each mask on the ability of the FA and MD histogram-based metrics to discriminate between patients and controls was assessed by performing a 2-way repeated-measures ANOVA with a between-subjects factor Group (SVD and HC) and a within-subjects factor Mask (TBSS and NAWM). When interactions between Group and Mask were identified as being statistically significant, a post-hoc analysis was performed. Finally, a partial Spearman correlation analysis was employed to estimate the power to predict either executive function or processing speed (corrected for the effect of age and sex), by using each of the four selected metrics (FA median, FA peak height, MD peak height, and MD peak width), generated using each mask. Bonferroni correction was used to adjust the p-values in order to correct for multiple comparisons, and significant effects were considered for corrected p < 0.05.
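The partial Spearman correlation and the Bonferroni adjustment can be sketched as below. The original analysis was run in R; this is a minimal Python illustration under stated assumptions: the partial Spearman coefficient is computed as a partial Pearson correlation on rank-transformed variables using the standard recursive formula, p-values are not computed, and all function names and toy numbers are hypothetical.

```python
import math


def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, covariates):
    """Correlation of x and y after removing the covariates, one at a time."""
    if not covariates:
        return pearson(x, y)
    z, rest = covariates[0], covariates[1:]
    r_xy = partial_corr(x, y, rest)
    r_xz = partial_corr(x, z, rest)
    r_yz = partial_corr(y, z, rest)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def partial_spearman(x, y, covariates):
    """Partial correlation computed on ranks instead of raw values."""
    return partial_corr(ranks(x), ranks(y), [ranks(c) for c in covariates])


def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Toy data (not study data): a DTI metric, a cognitive score and one covariate.
metric = [1, 2, 3, 4, 5, 6]
score = [2, 1, 4, 3, 6, 5]
age = [3, 1, 2, 6, 4, 5]
rho = partial_spearman(metric, score, [age])
adjusted = bonferroni([0.01, 0.02, 0.5])
```

With eight correlations per cognitive domain (four metrics times two masks), a Bonferroni factor grows quickly, which is one reason only a few associations may survive correction in a small sample.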

Neuropsychological evaluation
The cognitive profile across the group of patients is presented in Fig. 2. Performance was significantly impaired only for TMTA and TMTB cognitive scores. According to the respective composite scores, only processing speed was significantly impaired in this patient group, compared to the expected performance for a healthy population (p < 0.01). Nevertheless, executive function performance also showed considerable variability across patients. Figure 2 shows an illustrative example of WMH lesions, cerebral microbleeds and lacunes. Table 1 summarises the structural imaging metrics extracted from MPRAGE-T1WI, FLAIR-T2WI and SWI. NBV was marginally higher in SVD patients compared to healthy controls (p = 0.034). However, no significant differences between patient groups were found (p = 0.098). As expected, patients showed a higher NWMHLL compared to controls (p < 0.001). Regarding nCMB and nLac, and comparing with previous reports, this cohort showed a lower lesion burden. Specifically, nCMB presented an average value of 1.  [12]. For this reason, we did not include these measures in further analyses.  Fig. 4 for one representative subject. The FA and MD histograms extracted from each type of mask, together with the respective histogram-based metrics (median, peak height, peak width, and peak value), are illustrated in Fig. 5. Histograms of all subjects are depicted in Figures S1 and S2 (supplementary material).

Conventional structural imaging metrics
Boxplots showing the distributions across groups of all FA and MD histogram metrics, obtained using each mask, are displayed in Fig. 6. Results suggest that changes in FA histograms are generally characterized by reduced FA in patients compared to healthy controls as evidenced by lower median and peak values. On the contrary, MD histograms generally showed higher median and peak values in patients.
A two-way mixed factor ANOVA was used to examine the effect of the factor Mask, as well as the subject Group, on the DTI histogram metrics. Table 2 summarizes the major results of this analysis. Overall, there was a significant main effect of Mask for all FA metrics (p < 0.001) and MD peak height (p < 0.001). The post-hoc tests showed that, irrespective of group, the two masks yielded significantly different results; furthermore, as seen from the boxplots (Fig. 6), FA histogram-derived metrics are significantly higher for TBSS relative to NAWM. Significant main effects of Group were found only for the following metrics: FA median (p = 0.008); FA peak height (p = 0.041); FA peak value (p = 0.004); MD peak height (p = 0.048); and MD peak width (p = 0.024). Significant interactions were found between Mask and Group for: FA peak height (p = 0.027); MD median (p = 0.035); and MD peak width (p = 0.047). Post-hoc analysis revealed that, with either mask, there were no significant differences in FA peak height or MD median between groups; however, significant differences were found for MD peak width with TBSS (p = 0.037) but not with NAWM (p = 0.690).

Discussion
This study evaluated the effect of using different whole-brain WM masks on the potential of DTI histogram-based metrics as biomarkers in a group of patients with cerebral SVD. We found that the choice of mask significantly impacted the FA and MD histogram metrics considered. Most critically, in some cases, it affected the ability of the metrics to discriminate between patients and controls. Moreover, we found that the mask also affected the relationship between DTI metrics and cognitive performance in tests of executive function and processing speed.

Impact of mask selection on DTI metrics
Our study shows that the WM mask used to generate the FA and MD histograms has a significant impact on the values of the respective metrics, in some cases influencing their ability to discriminate between groups. In fact, prior studies have varied substantially regarding the mask used to compute FA and MD histograms, which have included the TBSS skeleton, WM, NAWM and WMH. The choice of the mask is critical as it determines the whole-brain region that is probed for pathological microstructural changes underlying the disease. Previous studies reported weak correlations between DTI parameters within lesion areas and cognitive impairment [24,38], suggesting that WMH regions are not representative of early brain impairment. Moreover, WMH regions also typically include relatively few voxels, rendering them less statistically valid than larger regions. For these reasons, we decided to restrict our analysis to TBSS and NAWM masks, which exclude WMH regions. By using a NAWM mask, we aimed to detect microstructural changes that are not yet manifested on T1WI or T2WI. However, NAWM masks are very likely to be contaminated by partial volume effects after transformation between structural and diffusion imaging spaces. In contrast, TBSS is an automatic technique that overcomes this issue by restricting the analysis to the centre of major WM tracts, the mean FA skeleton identified based on the highest FA values [10]. All FA metrics were more sensitive to mask selection than MD metrics (significant main effect of Mask). This is expected given that MD values are substantially more uniform throughout the brain (with very minor differences between GM and WM and hence less impact from partial volume effects) than FA values. On the other hand, our results indicate that the ability of the metrics FA median, FA peak value, FA peak height, MD peak width and MD peak height to discriminate between patients and controls appears to be robust to the mask employed (significant main effect of Group).
Nevertheless, FA peak height and MD peak width also showed a significant interaction between Group and Mask. Interestingly, post-hoc analysis revealed that MD peak width was able to differentiate between groups, but only when using the TBSS mask and not the NAWM mask. These findings suggest that some of these metrics may be preferentially used as biomarkers in cerebral SVD; however, their association with cognitive variables may also influence their relevance as biomarkers.

Value of selected DTI metrics for predicting cognitive impairment
We found a significant correlation between executive function and the metrics MD peak height and MD peak width, regardless of the mask used. On the other hand, the correlation between executive function and FA median only survived multiple comparison correction when using the NAWM mask. Although not significant after correction, a correlation was also found between processing speed (as measured by the respective composite score, TMTA + TMTB) and MD peak width. This result is consistent with the influential study by Baykara et al. [10], where MD peak width was consistently correlated with processing speed across multiple samples of patients. As in our study, the processing speed function of the CADASIL patients was assessed with the same composite score (TMTA + TMTB). However, unlike CADASIL, their sSVD sample was evaluated with a combination of the 1-letter subtask of the Paper-Pencil Memory Scanning Test and the Letter-Digit Substitution Task. Moreover, the authors reported that the association was weaker for the sSVD group. The discrepancy between the tests used to assess processing speed may partly explain the relatively weak association found in our sample, which is mostly composed of sSVD patients.
Importantly, our methodology (TBSS mask with a skeleton threshold of 0.3, chosen to avoid partial volume effects, and linear interpolation) bears a close resemblance to the one employed by Baykara et al. [10]; the only difference is that they performed an additional step to remove regions close to the skeleton likely to be contaminated with CSF, by using a custom-made mask. We omitted this step as it would introduce operator variability. In summary, our results strongly support the MD peak width as a biomarker of SVD, as in Baykara et al. [10], by further demonstrating a robust association with executive function.
Similarly to the MD peak width, MD peak height also showed a significant positive association with executive function regardless of the mask. The MD peak height is the only DTI metric that has previously been reported to be correlated with both executive function and processing speed (from the same longitudinal study: Lawrence et al. [16]; Zeestraten et al. [11,12]). An equivalent processing approach (NAWM mask) was initially applied by Lawrence et al. [16] to the baseline data of their longitudinal study. Nonetheless, in our data MD peak height was not correlated with processing speed, which might once more be related to differences in the cognitive measures used in the two studies. The most recent reports from their group demonstrated that including both NAWM and WMH regions in the WM mask produces larger changes over time in the DTI parameters, compared to including only NAWM; the observed changes were strongly correlated with cognitive decline of executive function. However, the prominent contamination by CSF of non-TBSS based masks should be considered carefully [40].
Regarding FA peak height, we found no correlation with cognitive measures, in contrast with a previous study [13]. One explanation might be the different cognitive measures used as representative of executive function, which partly differ from ours: digit span backwards (from the Wechsler Memory Scale III), verbal fluency (from the Delis-Kaplan Executive Function System) and TMTB. In addition, our sample is smaller and less homogeneous than the one assessed by the above-mentioned study. Indeed, including CADASIL patients might have introduced a higher variability in DTI parameters as well as in cognitive decline, and thereby influenced the respective correlations. On the other hand, this discrepancy is also likely to reflect the impact of using different processing strategies. In fact, in contrast with our methodology, their study included GM regions while we only included WM regions (TBSS vs NAWM). It remains unclear, however, how the inclusion of more diffusion-isotropic areas (such as GM) influences the correlation with cognitive measures. Thus, our results indicate a lower stability of FA peak height as a biomarker, not only due to the lack of sensitivity to distinguish between patients and controls but also due to the absence of correlations with cognition. Furthermore, a prior longitudinal study found that reductions in WM microstructural integrity did not result in substantial FA peak height changes over a three-year period [41], which also suggests a lack of sensitivity of this metric to patients' cognitive decline.
Overall, our results revealed a greater sensitivity and stability of MD compared to FA measures in terms of their potential to correlate with cognitive performance of SVD patients, which is in line with previous reports [11,12,16].

Limitations
Some limitations of this study should be addressed in the future. Firstly, a clinical protocol (with 1.7×1.7×5.2 mm³ resolution) was used to acquire the DTI data, which might not be ideal for the analysis to be performed. The acquisition protocol could be further optimized by using isotropic voxels and sampling a higher number of gradient directions instead of multiple repeats of the same set of 20 directions. A further improvement would be the additional acquisition of a non-DWI scan with opposing phase-encode direction, in order to allow the correction of geometric distortions due to B0 inhomogeneities using FSL's TOPUP tool [42]. However, some of these optimizations might imply longer acquisition protocols, unless more sophisticated sampling schemes are used [43]. On the other hand, our results indicate that clinical MRI protocols might provide sufficient information for the extraction of DTI biomarkers of cerebral SVD. Another important limitation is the small number of subjects in our sample, which constrained the statistical power of the analysis. Moreover, the limited availability of normative data for the studied population limited the selection of neuropsychological tests that can be used to reflect each cognitive domain. In addition, this cohort showed a lower lesion burden compared to previous reports. This might potentially limit the generalizability of the findings. Although FA and MD measures are indicative of altered tissue integrity (i.e., related to axonal degeneration and ischemic demyelination), they lack specificity regarding the concrete nature of the underlying histological changes. More sophisticated microscopic diffusion models have been developed to better inform the true nature of these changes and may be explored in future studies of SVD patients [9,20].
In this regard, our results represent the first step towards a consensus analysis pipeline to generate comparable DTI histogram metrics capable of predicting cognitive impairment in SVD patients; this pipeline could potentially also be applicable to other clinical conditions such as multiple sclerosis [17].

Conclusion
In summary, our results support the hypothesis that FA and MD histogram-based metrics extracted from DTI are sensitive to differences between SVD patients and controls, and can also predict cognitive impairment in SVD patients. Importantly, our results emphasize the importance of properly selecting the mask used to extract the histograms. Overall, our findings support prior reports that the MD peak width has the best potential for correlation with both executive function and processing speed and should be used as a reference measure for the validation of novel biomarkers. In contrast, other DTI metrics are more dependent on the underlying processing options. Finally, we recommend that the TBSS method be utilized to define the whole-brain WM mask, since it is a completely automated methodology that reduces CSF contamination when compared to non-skeletonized masks (NAWM or WM), especially when analyzing non-optimized clinical datasets. Overall, our results extend previous reports and further support the value of DTI histogram-based metrics as SVD biomarkers, but they also clearly highlight the importance of the processing methodology used, as well as the urgent need to mitigate the lack of standardized MRI data-processing pipelines.

Declarations
Competing interests: The authors declare no competing interests.
Authors' contributions: ARF - acquisition of data, analysis and interpretation of data, drafting of manuscript, and critical revision; RGN - study conception and design, analysis and interpretation of data, drafting of manuscript, and critical revision; JP - acquisition of data, analysis and interpretation of data, and critical revision; LA - patient recruitment, acquisition of data, and critical revision; SC - acquisition of data; CG - acquisition of data; MR - acquisition of data, and critical revision; MVB - patient recruitment, study conception and design, and critical revision; PV - study conception and design, analysis and interpretation of data, and critical revision; PF - study conception and design, analysis and interpretation of data, drafting of manuscript, and critical revision.

Figure 2
Patients' cognitive profile. The group median age-scaled z-score is represented for each considered cognitive test (i.e., Stroop, TMTA, TMTB and TMTB-A, in black) and for the composite scores resulting from the combination of Stroop+TMTB-A (executive function) and TMTA+TMTB (processing speed), in red. Individual scores reflecting processing speed (PS) are labelled in blue and those reflecting executive function (EF) are labelled in green. Bars represent the BCa (bias-corrected accelerated) bootstrap confidence interval at 95%. The dashed line (z-score = 0) represents the expected performance for a healthy population. Significant impairment is represented by ** (p < 0.01) and * (0.01 < p < 0.05). Performance is significantly impaired in TMTA and TMTB. In general, processing speed is significantly impaired, but executive function is not.

Figure 5
Illustrative examples of FA and MD histograms obtained using the different masks: TBSS (left) and NAWM (right), demonstrating the extraction of the histogram-based metrics (median, peak height, peak value and peak width).

Figure 6
Boxplots showing the distributions across healthy subjects (HC) and SVD patients (SVD) of FA and MD histogram metrics (median, peak height, peak width and peak value), obtained using each mask (TBSS and NAWM).

Partial correlation analysis between DTI histogram-based metrics (FA median and peak height, MD peak height and peak width) and cognitive measures representing executive function and processing speed, while correcting for differences in age and sex. Significant correlations after correction for multiple comparisons are highlighted by boxes with dashed black borders.