1. Participants’ enrollment and evaluation
All patients signed informed consent forms in accordance with the approval of the Medical Ethics Committee of the Second Affiliated Hospital of Zhejiang University School of Medicine.
A total of 200 PD patients were initially recruited in this study. The diagnosis of PD was made by a senior neurologist (B.R.Z.) according to the United Kingdom PD Society Brain Bank criteria [19]. Twenty-four patients were excluded since they had: (1) cerebrovascular disorders, including previous stroke, history of head injury, and other neurologic diseases (N=14); (2) cognitive impairment based on the Mini-Mental State Examination (MMSE) estimated by the criteria suitable for Chinese population (MMSE score ≤ 17 for illiterate patients, ≤20 for grade-school literate, and ≤23 for junior high school and higher education literate (N=10)[20, 21]. Finally, 176 PD patients were enrolled in this study, including 49 drug-naïve patients and 127 drug-managed patients. For drug-managed PD patients, clinical assessments were performed in the morning after withdrawing all dopamine replacement therapy overnight (at least 12 hrs., on their “drug-off status”). Basic demographic information, including age, sex, education, and disease duration, and neurologic and psychiatric scales including Unified Parkinson’s Disease Rating Scale (UPDRS) part III score, Hoehn and Yahr stage (H-Y stage), and MMSE score were obtained from all patients. The total levodopa equivalent daily dose (LEDD) [22] and duration of treatment were recorded for drug-managed patients.
2. Image acquisition and preprocessing
2.1 Image acquisition
All imaging data were acquired from a 3.0 Tesla magnetic resonance imaging (MRI) scanner (Discovery MR750; GE Healthcare). MRI scanning for each drug-managed patient was carried out in drug-off status. Head of each participant was stabilized with foam pads, and earplugs were provided to reduce the noise during scanning.
Resting state functional MRI (rs-fMRI) data were acquired using Gradient Recalled Echo - Echo Planar Imaging sequence: TE = 30 ms; TR = 2,000 ms; FA = 77 degrees; FOV = 240 × 240 mm2; matrix = 64 × 64; slice thickness = 4 mm; slice gap = 0 mm; number of slices = 38 (axial); time points = 205. Structural T1-weighted images were acquired using a Fast Spoiled Gradient Recalled sequence: echo time = 3.036 ms; repetition time = 7.336 ms; inversion time = 450 ms; flip angle = 11 degrees; field of view = 260 × 260 mm2; matrix = 256 × 256; slice thickness = 1.2 mm; number of slices = 196 (sagittal). All the sequence FOVs covered the whole brain, including cerebrum, cerebellum and brain stem.
2.2 Image preprocessing
Rs-fMRI data processing was carried out using Statistical Parametric Mapping (SPM 12, https://www.fil.ion.ucl.ac.uk/spm/) and Data Processing Assistant for Resting State fMRI (DPABI_V3.1_180801, http://www.rfmri.org/) [23]. At the beginning, the first 10 volumes of functional time series were deleted letting MRI signal to reach the equilibrium. The remaining images underwent slice timing for interval scanning, realignment, and normalizing to the standard MNI space through T1 images. Next, spatial smoothing with a Gaussian kernel of 6 × 6 × 6 mm full-width-at-half-maximum, detrending, covariates regression (Friston 24-motion parameters) and band-pass temporal filtering (0.01–0.1 Hz) were subsequently applied to the remaining volumes.
2.3 Head-motion control
Considering the effect of head-motion on the rs-fMRI analysis, an additional 14 patients including 2 drug-naïve and 12 drug-managed patients with excessive head-motion ( 2 mm in displacement and 2 degrees in rotation and mean frame-wise displacement (FD) 0.2 mm) were excluded [24, 25]. After that, a total of 162 PD patients were enrolled in this study, including 47 drug-naïve and 115 drug-managed patients. To verify that neither observed nor predicted scores were correlated with head-motion, the correlation coefficients were calculated between the mean FD and observed and predicted score, respectively. To further control the possible head-motion effect, we also applied additional prediction analysis with the mean FD as an additional nuisance variable within the candidate connections selection process described in part 3.1 (candidate connections selection by using LOOCV procedure).
2.4 Functional network construction
Consistent with previous studies applying CPM, network nodes were defined using the 268-ROI functional brain atlas[26]. This atlas covers the whole brain, including cortical, subcortical, and brainstem structures. The whole-brain functional connection matrix was constructed for each patient using GRETNA [27]. The mean time series of each node was extracted by averaging the time series of all voxels in each defined node. And then the functional connection is calculated as the Pearson-correlation-coefficient (r) between the mean time series of each pair of nodes. A Fisher’s r-to-z transformation was then used to normalize the correlation coefficients, and the resulting 268 × 268 matrix for each participant was utilized for the following CPM analysis. Each element of the matrix represented the strength of connection between two nodes.
3. Connectome-based model construction and evaluation in drug-naïve patients
The flowchart of connectome-based model construction and evaluation was shown in figure 1.
3.1 Selection of candidate connections by using LOOCV procedure
Considering the relatively small number of drug-naïve patients (N=47), the leave-one-out cross-validation (LOOCV) procedure was used to select candidate connections [28, 29]. The LOOCV procedure was repeated iteratively. In each iteration, one patient was removed from the training set and data of the remaining N-1 patients was used for testing through the following steps. First, the correlation between the strength of each connection and the observed UPDRS III score was assessed. In this step, recommended spearman’s analysis was applied since the distribution of observed scores in this study did not follow the normal pattern (Kolmogorov-Smirnov test, p<0.05)[30]. Besides, partial correlation analysis was applied to minimize the possible confounding effects of nuisance variables, including age, sex, and disease duration. Next, an optimal threshold was applied to select candidate connections that were significantly correlated with UPDRS III scores (detailed in part 3.4 Optimal threshold exploration). Finally, all selected connections were split into positive connections (connections whose strength indexed higher UPDRS III score and severe motor impairment) and negative connections (connections whose strength indexed lower UPDRS III score and mild motor impairment) according to their correlation coefficients with observed scores. The above-mentioned steps would be repeated for N (N=47) times until all patients had been left out.
3.2 Model construction with consensus connections and predictability evaluation
After LOOCV procedure, forty-seven sets of candidate connections would be obtained. Because of the nature of cross-validation, a slightly different set of candidate connections may be selected in different iterations. To reduce potential variation, the connections that were finally enrolled for model construction should be selected in each iteration, termed consensus connections. These connections had the highest reliability among all candidate connections. Then, the strength of each positive and negative consensus connection was summed, respectively. And strength sums of positive and negative consensus connections were then fitting into general linear regression to build a relationship with the observed score. The predicted score of each patient could be calculated by applying the constructed linear model with the following formula:
𝑝𝑟𝑒𝑑𝑖𝑐𝑡𝑒𝑑 score= a1× x1 + a2× x2+𝑏
(x1 : sum of strength of consensus positive connections, x2 : sum of strength of consensus negative connections)
The performance of the constructed model was evaluated by calculating the Spearman correlation coefficient(rtrue) and the mean squared error (MSE) between observed and predicted UPDRS III scores. The value of correlation coefficient and MSE are usually dependent, higher correlation implies lower MSE and vice versa. A lower MSE value means a small difference between predicted and observed score [30] . The significance of the constructed model was further tested by applying the 1000-permutation-test [15]. This test was done by randomly shuffling the UPDRS III score along with repeating the above processes 1000 times. The significance of the permutation test was analyzed by calculating the percentage of sampled permutations that are greater or equal to the rtrue (ppermu), ppermu <0.05 was considered as statistically significant.
3.3 Predictability comparison among models constructed using different methods
Before external validation, it is necessary to ensure that the model (M) constructed within the above processes obtaining the best predictability in drug-naïve patients. Therefore, three models constructed with different methods were employed, including model M1 constructed with consensus positive connections, model M2 constructed with consensus negative connections, and model M3 generating predicted scores from each iteration of LOOCV. The construction processes of three models (M1, M2, and M3) were detailed in supplemental materials (Title: construction processes of M1, M2, and M3). To explore whether the predictability of these models was significantly different from model M, the Steiger’s Z test was applied to compare the rtrue of model M and the other three models [29]. The optimal threshold exploration and permutation test were also applied during the construction of M1, M2, and M3. Thus, the thresholds of these four models can be different.
3.4 Optimal threshold exploration
For candidate connections selection, rather than applying an arbitrary p-value threshold as previous study [29], predictability comparison was employed among different p-values cutoffs. These thresholds were explored by repeating the above process 50 times using p values ranging from 0.05 to 0.001, with an interval of 0.001 each time. The p-value that leads to the highest rtrue was selected and employed in the subsequent model construction.
4. Model validation in drug-managed patients
Finally, the model with the best predictability among the above four ones was further validated in drug-managed patients. Correlation coefficient (r) and MSE between observed and predicted score were also calculated. The significance of r was calculated using standard parametric conversion and p<0.05 was considered as statistically significant.
5. Functional networks anatomy of constructed connectome model
5.1 Grouping nodes into seven functional networks
These 268 nodes were divided into seven canonical functional networks according to the anatomical order and previous studies [26, 31], which include frontoparietal (63 nodes), default mode (20 nodes), motor (50 nodes), visual-related (45 nodes), limbic (30 nodes), basial-ganglia (29 nodes) and cerebellum (31 nodes) networks. The visual-related network includes visual I, visual II, and visual association networks. Map of these seven networks was shown in the supplementary material (Figure S1). Connections within each network (within-network connection) were calculated by summarizing the number of connections that connecting nodes among it, and connections between two distinct networks (between-network connection) were calculated by summarizing the number of connections that coupled nodes from them.
5.2 Contribution of each functional network
We weighed each functional network's contribution by calculating the sum of consensus positive and consensus negative connections that belong to it, and a higher number of connections indicates a greater contribution. To further access the importance of each functional network to the motor impairment prediction, we computationally “lesioned” the model to exclude connections from it. That is, in an iterative analysis, we masked the connection matrix to exclude connections that appeared in one of the seven functional networks. For example, after excluding connections in the frontoparietal networks, which contained 63 nodes, a 205 205 matrix rather than 268 268 matrix was submitted for analysis. Models with lesioned matrices, which are defined as lesioned models, were first constructed and evaluated among drug-naive patients, then they were validated among drug-managed patients. The predictive ability of each “lesioned” model was compared with the original one by using Steiger’s test and Bonferroni correction.
5.3 Two connection patterns divided from the connectome model
To better understand the relationships between PD motor impairment and various within- and between-functional connections, we divided the connectome model into two connection patterns. One pattern that contained all consensus negative connections was called the “negative motor-impairment-related network", while another pattern that contained all consensus positive connections was called the " positive motor-impairment-related network". Characteristics of within- and between-network connections were analyzed by summing the consensus connections using the above-mentioned seven canonical functional networks in the positive and negative motor-impairment-related networks. To control the effect of network size variations, the proportion of the connections within and between networks was also calculated by the actual number of connections / total number of possible connections [17].
6. Statistical analyses
Clinical characteristics of drug-naïve and drug-managed patients were analyzed by using SPSS software (Version 25). Kolmogorov-Smirnov test was applied to identify the normal distribution of continuous variables. The difference of normally distributed variables between two groups was analyzed with the two-sample t-test, otherwise, the Mann-Whitney test was conducted. And differences between qualitative variables were compared with the Chi-square test. Statistical analyses of model construction, validation, and evaluation were performed by using MATLAB (R2020b, MathWorks) and had been detailed in corresponding parts mentioned above. A two-sided p-value <0.05 was considered significant unless mentioned.