This retrospective study was approved by our institutional review board (The Catholic University of Korea Seoul St. Mary's Hospital). The requirement for informed consent was waived as we used a publicly available dataset for this study. The methods and reporting of results are in accordance with the STROBE (Strengthening the Reporting of Observational Studies in Epidemiology) guidelines.
2.1 Study Population
dPVS were segmented using structural images and demographic information obtained from the WU-Minn Human Connectome Project dataset, which enrolled healthy adult twins and NT siblings between the ages of 22 and 35 to identify relationships between brain circuits, genetics and behavior 15.
Of 1206 subjects included in the March 2017 (S1200) release, 1113 subjects who underwent 3T MRI to obtain 0.7 mm isotropic 3D T1- and T2-weighted images were initially enrolled. The exclusion criteria were as follows: high blood pressure, diabetes mellitus or significant cardiovascular disease; severe neurodevelopmental, neurological or documented neuropsychiatric disorders; zygosity not examined by genotyping; NT siblings without respective pairs; and birth before 34 weeks of gestation for twins and before 37 weeks of gestation for non-twins. More information on recruitment and the inclusion and exclusion criteria of the Human Connectome Project is described in a previous study 15.
2.2 Imaging acquisition
All MRI data were acquired using a 3T MR scanner (MAGNETOM Skyra CONNECTOM, Siemens Healthcare) customized with a 100 mT/m gradient coil, an inner bore diameter of 56 cm, and a standard 32-channel head coil at Washington University in St. Louis, MO, USA.
The 3D T1-weighted Magnetization-Prepared Rapid Acquisition with Gradient Echo (MPRAGE) sequence was performed using the following parameters: sagittal acquisition with repetition time, 2400 ms; echo time, 2.14 ms; field of view, 224 × 224 × 180 mm; voxel size, 0.7 × 0.7 × 0.7 mm³; inversion time, 1000 ms; bandwidth, 210 Hz/pixel; flip angle, 8°; GeneRalized Autocalibrating Partial Parallel Acquisition (GRAPPA) factor, 2; and total acquisition time, 7 min 40 s.
The 3D T2-weighted Sampling Perfection with Application-optimized Contrasts using different flip angle Evolution (SPACE) sequence was performed using the following parameters: sagittal acquisition with repetition time, 3200 ms; echo time, 565 ms; echo spacing, 3.53 ms; turbo factor, 314; echo train duration, 1105 ms; field of view, 224 × 224 × 180 mm; voxel size, 0.7 × 0.7 × 0.7 mm³; bandwidth, 744 Hz/pixel; variable flip angle; GRAPPA factor, 2; and total acquisition time, 8 min 24 s.
More details on the imaging protocols are described in the WU-Minn Human Connectome Project S1200 Release Reference Manual 15.
2.3 Spatial similarity assessment
The segmentation of dPVS was a fully automated process that was described in detail in a previous study 14. It entailed the extraction of candidate dPVS voxels using a 3D Frangi filter after signal normalization of the 3D T2-weighted images. To reduce false positives outside the brain parenchyma, only candidate dPVS voxels inside the BG and WM masks of the Freesurfer segmentation were retained. In addition, we trained and applied a 3D deep convolutional neural network to distinguish dPVS from false-positive voxels. Based on the final output of the 3D deep learning algorithm, dPVS masks were obtained separately for the BG and WM.
To compare dPVS locations between pairs, we assessed the similarity of their dPVS images. The T1-weighted images on which dPVS had been defined were first used to estimate the deformation field needed for the spatial transformation between each subject’s brain images and the template brain images in SPM12 (https://www.fil.ion.ucl.ac.uk/spm/software/spm12/). The deformation field was then used to transform the dPVS images to standard space, and the locations of dPVS in standard space were compared using the following three similarity indices: mean squared error (MSE), structural similarity (SSIM), and Dice similarity (DS). MSE and SSIM have been proposed as metrics for image quality assessment in prior studies. Whereas MSE is computed simply by averaging the squared intensity differences between two images, SSIM compares local patterns of intensities 16. DS gauges the similarity of two sets based on their cardinalities 17 and can be applied to two binary images to assess the commonality between them, with values ranging between 0 and 1. We compared the locations of dPVS separately for BG and WM.
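As an illustration of two of the indices above, the following minimal Python sketch computes MSE on flattened intensity lists and DS on binary masks represented as sets of voxel coordinates. It is not the code used in this study: the function names, the set-based mask representation, and the convention for two empty masks are assumptions made for this example. SSIM is omitted because it depends on local windowed statistics that are better taken from an established image-processing library.

```python
from typing import Sequence, Set, Tuple

Voxel = Tuple[int, int, int]  # (x, y, z) index of a voxel in standard space


def mean_squared_error(a: Sequence[float], b: Sequence[float]) -> float:
    """Average of the squared intensity differences between two flattened images."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("images must be non-empty and of equal size")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def dice_similarity(a: Set[Voxel], b: Set[Voxel]) -> float:
    """Dice similarity 2 * |A & B| / (|A| + |B|) of two binary masks."""
    if not a and not b:
        return 1.0  # assumption: two empty masks are treated as identical
    return 2 * len(a & b) / (len(a) + len(b))


# Hypothetical toy data: two small masks sharing two of their three voxels.
mask_a = {(0, 0, 0), (0, 0, 1), (1, 0, 0)}
mask_b = {(0, 0, 1), (1, 0, 0), (2, 2, 2)}
toy_dice = dice_similarity(mask_a, mask_b)
toy_mse = mean_squared_error([0, 1, 1, 0], [0, 1, 0, 1])
```

A DS of 1 indicates identical masks and 0 indicates no overlap, whereas MSE is 0 for identical images and grows with disagreement.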
2.4 Cognitive function assessment
The well-validated NIH Toolbox Cognition Battery was used to assess cognitive function 18. The Battery contains subtests that assess five cognitive domains: executive function (Dimensional Change Card Sort Test [cognitive flexibility], Flanker Inhibitory Control and Attention Test [inhibitory control and attention]), processing speed (Pattern Comparison Processing Speed Test), working memory (List Sorting Working Memory Test), episodic memory (Picture Sequence Memory Test), and language (Picture Vocabulary Test [vocabulary], Oral Reading Recognition Test [reading decoding]).
2.5 Statistical Analysis
After normality tests were performed, age, brain regional and dPVS volumes, and all similarity indices were compared between the three groups using the Kruskal-Wallis test and then the post-hoc Dunn’s test with Bonferroni adjustment. The frequency of sex was compared between the groups using the chi-squared test with Bonferroni adjustment.
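To make the three-group comparison concrete, the sketch below computes the Kruskal-Wallis H statistic with tie correction in plain Python. The analyses in this study were run in R; this code is only an illustration, and its function names and toy data are assumptions. Because there are three groups, the asymptotic chi-squared reference distribution has 2 degrees of freedom, for which the upper-tail probability is exactly exp(-H/2). The post-hoc Dunn's test is not shown.

```python
import math
from typing import List, Sequence, Tuple


def average_ranks(values: Sequence[float]) -> Tuple[List[float], float]:
    """Return 1-based ranks (ties share their mean rank) and sum(t^3 - t) over tie groups."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    tie_term = 0.0
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1
    return ranks, tie_term


def kruskal_wallis_three_groups(
    g1: Sequence[float], g2: Sequence[float], g3: Sequence[float]
) -> Tuple[float, float]:
    """Tie-corrected H statistic and asymptotic P value (df = 2) for three groups."""
    groups = [list(g1), list(g2), list(g3)]
    if any(len(g) == 0 for g in groups):
        raise ValueError("each group needs at least one observation")
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks, tie_term = average_ranks(pooled)

    weighted = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        weighted += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * weighted - 3.0 * (n + 1)

    correction = 1.0 - tie_term / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical")
    h /= correction
    p = min(1.0, math.exp(-h / 2.0))  # chi-squared upper tail for df = 2
    return h, p


# Hypothetical toy data with fully separated groups.
h_stat, p_value = kruskal_wallis_three_groups([1, 2, 3], [4, 5, 6], [7, 8, 9])
```

With small groups the chi-squared approximation is rough, so an established statistical package remains preferable for reported results.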
As spatial similarity indices could be affected by the total volume of dPVS, we used propensity score matching separately for volumes of BGdPVS and WMdPVS to balance this confounding factor between groups using the nearest matching method with a 1:1:1 ratio 19. The spatial similarity indices were also compared between groups in the matched subjects.
To define genetic influence on the regional location of dPVS, we first divided WM into four (i.e., frontal, parietal, temporal, and occipital) lobar subregions using the Freesurfer results available in the Human Connectome Project dataset. Then, we performed an intraclass correlation (ICC) analysis within twin or NT pairs for dPVS volumes in each of the BG and WM subregions. The ICC for twin data was calculated as:
$$\mathrm{ICC} = \frac{MS_{\mathrm{between}} - MS_{\mathrm{within}}}{MS_{\mathrm{between}} + MS_{\mathrm{within}}}$$
where $MS_{\mathrm{between}}$ and $MS_{\mathrm{within}}$ are the mean-square estimates of between- and within-pair variance, respectively 20.
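The pairwise ICC can be sketched in Python as follows. This example assumes the common one-way ANOVA form for pairs, ICC = (MSbetween - MSwithin) / (MSbetween + MSwithin), with MSbetween computed on n - 1 and MSwithin on n degrees of freedom for n pairs. The function name and the toy values are hypothetical, and the code is an illustration, not the analysis script used here.

```python
from typing import Sequence, Tuple


def pairwise_icc(pairs: Sequence[Tuple[float, float]]) -> float:
    """One-way ICC for paired data: (MSb - MSw) / (MSb + MSw)."""
    n = len(pairs)
    if n < 2:
        raise ValueError("at least two pairs are required")
    grand_mean = sum(a + b for a, b in pairs) / (2 * n)
    # Between-pair mean square: 2 * sum of squared deviations of pair means, df = n - 1.
    ms_between = 2.0 * sum(((a + b) / 2 - grand_mean) ** 2 for a, b in pairs) / (n - 1)
    # Within-pair mean square: sum of (a - b)^2 / 2 over pairs, df = n.
    ms_within = sum((a - b) ** 2 for a, b in pairs) / (2.0 * n)
    denominator = ms_between + ms_within
    if denominator == 0:
        raise ValueError("ICC is undefined when all values are identical")
    return (ms_between - ms_within) / denominator


# Hypothetical regional dPVS volumes for three pairs (arbitrary units).
concordant = pairwise_icc([(1.0, 1.0), (2.0, 2.0), (3.0, 3.0)])
discordant = pairwise_icc([(1.0, 2.0), (2.0, 1.0)])
```

Values near 1 indicate that members of a pair resemble each other far more than they resemble members of other pairs, while values near 0 or below indicate little within-pair resemblance.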
To assess the clinical implications of dPVS according to location, a correlation analysis was performed between regional dPVS burden and the cognitive function test results.
A P value of <0.05 was considered statistically significant. All statistical analyses were performed using R Statistical Software (version 4.0.3; R Foundation for Statistical Computing, Vienna, Austria).