Quantifying rsfMRI complex dynamics and cognitive phenotypes. We used preprocessed rsfMRI data from 20,000 unrelated UK Biobank participants [54] (see Methods). We calculated four TC measures and five ROI-wise FC-derived measures (see Methods). As prediction targets, we chose the four most reliable cognitive phenotypes in the UK Biobank database, measuring fluid intelligence, processing speed, visual memory, and numerical memory [55]. See Table S1 in the Supplementary Materials for the full list of features and targets.
We used kernel ridge regression with l2-norm regularization, a widely used prediction method [5], [56], [57], to assess how well rsfMRI markers predict cognitive phenotypes. Model performance was measured through cross-validation using the Pearson correlation between the actual and predicted targets (kernel ridge regression) or the balanced accuracy for gender classification (ridge classification). Hyperparameters were tuned using nested cross-validation. Individual characteristics, i.e., age, gender, and TIV, were handled in four scenarios, outlined in Figs. 1-B.1 to B.4 (see also Methods).
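The modeling pipeline described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the linear kernel, the regularization grid, the fold counts, the pooling of out-of-fold predictions, and all function names (`pearson_r`, `krr_fit_predict`, `kfold`, `nested_cv_score`) are assumptions made for illustration.

```python
import numpy as np


def pearson_r(a, b):
    """Pearson correlation between two 1-D arrays."""
    a = a - a.mean()
    b = b - b.mean()
    return float(a @ b / np.sqrt((a @ a) * (b @ b)))


def krr_fit_predict(X_tr, y_tr, X_te, lam):
    """Kernel ridge regression with a linear kernel (an assumed kernel choice)."""
    K = X_tr @ X_tr.T                              # train-by-train Gram matrix
    y_mean = y_tr.mean()                           # centre the target (intercept)
    dual = np.linalg.solve(K + lam * np.eye(len(y_tr)), y_tr - y_mean)
    return (X_te @ X_tr.T) @ dual + y_mean         # predictions for the test rows


def kfold(n, k, rng):
    """Yield (train_idx, test_idx) pairs for a shuffled k-fold split."""
    folds = np.array_split(rng.permutation(n), k)
    for i in range(k):
        train = np.concatenate([f for j, f in enumerate(folds) if j != i])
        yield train, folds[i]


def nested_cv_score(X, y, lams=(0.1, 1.0, 10.0, 100.0),
                    k_outer=5, k_inner=3, seed=0):
    """Pearson r between actual targets and pooled out-of-fold predictions."""
    rng = np.random.default_rng(seed)
    y_pred = np.empty(len(y), dtype=float)
    for tr, te in kfold(len(y), k_outer, rng):
        # Inner loop: choose lambda using the outer-training data only.
        inner_scores = np.zeros(len(lams))
        for itr, ite in kfold(len(tr), k_inner, rng):
            for j, lam in enumerate(lams):
                pred = krr_fit_predict(X[tr][itr], y[tr][itr], X[tr][ite], lam)
                inner_scores[j] += pearson_r(y[tr][ite], pred)
        best = lams[int(np.argmax(inner_scores))]
        # Refit on the full outer-training set and predict the held-out fold.
        y_pred[te] = krr_fit_predict(X[tr], y[tr], X[te], best)
    return pearson_r(y, y_pred)


# Synthetic stand-in data: 200 "subjects", 20 "ROI features".
rng = np.random.default_rng(42)
X = rng.standard_normal((200, 20))
y = X @ rng.standard_normal(20) + rng.standard_normal(200)
score = nested_cv_score(X, y)
```

The key design point is that the regularization strength is selected inside each outer training fold, so the outer test fold never influences hyperparameter choice.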
Larger sample sizes increase accuracy but eventually reach a plateau. First, we examined whether increasing the sample size could improve the prediction accuracy of cognitive phenotypes in all four scenarios (Fig. 1). As illustrated in Fig. 2, increasing the number of subjects improved accuracy in most cases, but the performance curves reached a plateau once more than approximately 2,000 participants were used.
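A sample-size analysis of this kind can be sketched as follows. This is an illustration under stated assumptions, not the procedure from Methods: subjects are drawn as random subsamples without replacement, a closed-form linear ridge model stands in for the full pipeline, and the names `ridge_predict` and `learning_curve` as well as the 80/20 split are hypothetical.

```python
import numpy as np


def ridge_predict(X_tr, y_tr, X_te, lam=1.0):
    """Closed-form linear ridge regression (a stand-in for the full model)."""
    mu_x, mu_y = X_tr.mean(axis=0), y_tr.mean()
    Xc = X_tr - mu_x
    w = np.linalg.solve(Xc.T @ Xc + lam * np.eye(X_tr.shape[1]),
                        Xc.T @ (y_tr - mu_y))
    return (X_te - mu_x) @ w + mu_y


def learning_curve(X, y, sizes, n_repeats=5, test_frac=0.2, seed=0):
    """Mean held-out Pearson r for random subsamples of each size."""
    rng = np.random.default_rng(seed)
    curve = []
    for n in sizes:
        scores = []
        for _ in range(n_repeats):
            # Draw n subjects, then split them into test and training parts.
            idx = rng.choice(len(y), size=n, replace=False)
            n_te = max(2, int(round(test_frac * n)))
            te, tr = idx[:n_te], idx[n_te:]
            pred = ridge_predict(X[tr], y[tr], X[te])
            scores.append(np.corrcoef(y[te], pred)[0, 1])
        curve.append(float(np.mean(scores)))
    return curve


# Synthetic stand-in data with a fixed signal-to-noise ratio.
rng = np.random.default_rng(0)
X = rng.standard_normal((4000, 30))
y = X @ np.full(30, 0.3) + rng.standard_normal(4000)
curve = learning_curve(X, y, sizes=[50, 200, 1000, 4000])
```

In such a setup the curve flattens once the model's estimation error becomes small relative to the irreducible noise in the target, which is one way a plateau can arise.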
As a sanity check, we tested all the predictive modeling scenarios using fish consumption (the day prior to fMRI) as a target presumably unrelated to the rsfMRI features. The performance for all sample sizes, rsfMRI features, individual characteristics, and their combinations remained at chance level (Fig. 2).
Temporal complexity and FC features show comparable predictive capacities. Next, we investigated how TC and FC measures compare in cognitive phenotype prediction across different sample sizes. The average performance of the kernel ridge regression models suggested that certain features, specifically fALFF, LCOR, wPE, and RangeEnB, performed better than others in all contexts, regardless of the target. Both feature types included measures at the lower and at the upper end of the range of prediction accuracies. The correlation between actual and predicted targets did not exceed 0.35 even at the maximum sample size. The voxel-based local brain activity measures fALFF and LCOR showed the highest predictive capacity among the FC measures. Among the TC measures, wPE and RangeEnB yielded the highest accuracy, comparable to that of fALFF and LCOR.
Even with a sample size of 20,000 individuals, not all cognitive phenotypes could be predicted with equal accuracy (Fig. 2). Processing speed was predicted with the highest correlation coefficient of up to 0.35, followed by fluid intelligence with 0.25, when using fALFF together with the individual characteristics. For all combinations, predictions of visual memory and numerical memory scores were less accurate, with correlations below 0.2. The prediction accuracy of processing speed was again higher than that of the other three cognitive phenotypes when using only the individual characteristics (age, gender, and TIV) (Scenario 4, see Fig. 1). However, as shown by the black curves in Fig. 2, the predictability of fluid intelligence, visual memory, and numerical memory scores was similar. Notably, in all cases, removing the individual characteristics from cognitive phenotypes worsened the predictive performance.
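One common way to remove individual characteristics from a variable is linear residualization, sketched below. Whether this matches the procedure in Methods is an assumption; the function names `fit_confound_model` and `remove_confounds` are hypothetical, and the data are synthetic. The regression coefficients are estimated on training subjects only and then applied to test subjects, which avoids leaking test information into the model.

```python
import numpy as np


def fit_confound_model(C_train, Z_train):
    """Least-squares fit of Z ~ [1, C]; returns the coefficient array."""
    design = np.column_stack([np.ones(len(C_train)), C_train])
    beta, *_ = np.linalg.lstsq(design, Z_train, rcond=None)
    return beta


def remove_confounds(C, Z, beta):
    """Subtract the part of Z that the confounds C predict linearly."""
    design = np.column_stack([np.ones(len(C)), C])
    return Z - design @ beta


# Synthetic stand-ins: three confounds (think age, gender, TIV) and one target.
rng = np.random.default_rng(0)
C = rng.standard_normal((2000, 3))
z = 2.0 * C[:, 0] + rng.standard_normal(2000)   # target partly driven by column 0

train, test = np.arange(1500), np.arange(1500, 2000)
beta = fit_confound_model(C[train], z[train])
z_test_clean = remove_confounds(C[test], z[test], beta)

r_before = np.corrcoef(C[test, 0], z[test])[0, 1]
r_after = np.corrcoef(C[test, 0], z_test_clean)[0, 1]
```

The same two functions work unchanged when `Z` is a subjects-by-ROIs feature matrix instead of a single target vector.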
Age, gender, and TIV result in higher accuracy than rsfMRI features. Next, we tested how well age, gender, and TIV predict cognitive performance when used as the sole input features, without any rsfMRI data involved. To this end, we used these individual characteristics as input to the kernel ridge regression to predict cognitive phenotypes (Fig. 1-D.1). As shown in Fig. 2 and Supplementary Figure S1, this approach resulted in the highest correlation between actual and predicted targets across all sample sizes, outperforming all scenarios in which rsfMRI features were used (Fig. 1-D.1 to D.3). When individual characteristics served as input features, the sample size required to reach the plateau was substantially lower (fewer than 500 subjects; see Fig. 2). In other words, individual characteristics predicted cognitive phenotypes from a small sample better than rsfMRI features predicted the same targets, even when a larger sample was used.
Given that the individual characteristics outperformed rsfMRI features in predicting cognitive phenotypes, the next logical step was to combine the TC and FC features with individual characteristics and test whether this improved the prediction accuracy. For all rsfMRI features, this scenario produced the highest prediction accuracy of the first three analysis scenarios, i.e., those using rsfMRI features (Fig. 2C, see also Supplementary Figure S1). This result shows that including individual characteristics such as age, gender, and TIV may improve the performance of rsfMRI features, in particular for large sample sizes. The difference between combined rsfMRI features and individual characteristics (Scenario 3) and rsfMRI features only (Scenarios 1 and 2) was more pronounced when predicting processing speed than for the other three cognitive phenotypes. Additionally, when the rsfMRI features were combined with individual characteristics and larger sample sizes were used, the prediction accuracies of the different rsfMRI features were more similar to one another (Fig. 2, Scenario 1 versus Scenario 3).
The temporal signal-to-noise ratio (tSNR) plays no major role. We then asked whether background noise in rsfMRI data affects prediction performance. To this end, we investigated whether excluding brain regions with high noise levels would increase prediction accuracy. We used a group-level tSNR map to threshold the rsfMRI feature maps (see Methods). Figure 3 illustrates the prediction accuracies for fluid intelligence using rsfMRI features (Scenario 1), rsfMRI features after removing individual characteristics (Scenario 2), and rsfMRI features combined with individual characteristics (Scenario 3), following stepwise thresholding of the tSNR maps from 0% (no threshold, corresponding to the results in Fig. 2) to 60% in 5% increments. Prediction accuracies improved with increasing sample size and number of suprathreshold ROIs. The results were similar for the other cognitive phenotypes (see Supplementary Figures S3–S5). Prediction accuracy for fish consumption remained at chance level for all tSNR thresholds (Supplementary Figure S6).
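The thresholding step can be sketched as follows. Two things are assumed here because the exact definition is given in Methods and not in this section: tSNR is taken as the temporal mean divided by the temporal standard deviation of each ROI time series, and an "X% threshold" is interpreted as discarding ROIs below the X-th percentile of the group-level map. The names `tsnr` and `group_mask` are hypothetical, and the data are synthetic.

```python
import numpy as np


def tsnr(ts):
    """tSNR per ROI for one subject; ts has shape (timepoints, rois).

    Assumes no ROI time series is constant (which would give a zero std).
    """
    return ts.mean(axis=0) / ts.std(axis=0)


def group_mask(group_tsnr, pct):
    """Boolean mask of ROIs at or above the pct-th percentile of group tSNR."""
    return group_tsnr >= np.percentile(group_tsnr, pct)


# Synthetic stand-in data: 10 subjects, 50 timepoints, 100 ROIs,
# with ROI-specific noise levels around a baseline signal of 100.
rng = np.random.default_rng(0)
noise_sd = rng.uniform(1.0, 10.0, size=100)
subjects = [100.0 + noise_sd * rng.standard_normal((50, 100)) for _ in range(10)]

# Group-level map: average of the subject-level tSNR maps.
group_tsnr = np.mean([tsnr(ts) for ts in subjects], axis=0)

# Number of suprathreshold ROIs at each threshold from 0% to 60% in 5% steps.
kept = {pct: int(group_mask(group_tsnr, pct).sum()) for pct in range(0, 65, 5)}

# Applying a mask to a subjects-by-ROIs feature matrix keeps only those columns.
features = rng.standard_normal((10, 100))
features_60 = features[:, group_mask(group_tsnr, 60)]
```

Because each threshold reduces the number of input ROIs, any effect of noise removal is entangled with the effect of having fewer features, which is worth keeping in mind when reading Fig. 3.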
Age and gender are easier to predict than cognitive phenotypes. We investigated the capability of the rsfMRI features to predict individual characteristics. Compared to the prediction of the four cognitive phenotypes, the prediction accuracy of all rsfMRI features was higher for both age and gender prediction (Fig. 4). Two TC features (wPE and RangeEnB) and two FC features (fALFF and LCOR) performed best at large sample sizes, with correlation coefficients of up to 0.5. This accuracy was considerably better than the prediction accuracy of cognitive phenotypes, which was typically less than 0.25 (see Fig. 2). The result was noticeably different when individual characteristics were used as features for predictive modeling (gender and TIV for age prediction, and age and TIV for gender prediction). Gender could be classified using age and TIV with 88% accuracy. However, the individual characteristics did not perform well in age prediction, with a Pearson correlation of 0.2 between actual and predicted values.
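Balanced accuracy, the metric used here to score gender classification, is the mean of the per-class recalls. The short sketch below (function name `balanced_accuracy` and the toy labels are illustrative) shows why it is preferred over plain accuracy when classes are unevenly represented: a classifier that always predicts the majority class scores high on plain accuracy but only at chance on balanced accuracy.

```python
import numpy as np


def balanced_accuracy(y_true, y_pred):
    """Mean of the per-class recalls (fraction of each class labelled correctly)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    recalls = [np.mean(y_pred[y_true == c] == c) for c in np.unique(y_true)]
    return float(np.mean(recalls))


# Toy example: 90 subjects of class 0, 10 of class 1.
y_true = np.array([0] * 90 + [1] * 10)
y_pred = np.zeros(100, dtype=int)              # always predicts the majority class

plain = float(np.mean(y_true == y_pred))       # 0.9
balanced = balanced_accuracy(y_true, y_pred)   # 0.5
```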
Similar individual patterns across rsfMRI features. We looked into how much information rsfMRI features with comparable prediction capacity share with one another. Our findings show that some rsfMRI features have comparable predictive capacity, despite their mathematical definitions and interpretations being quite different. For instance, fALFF and wPE were frequently among the most predictive features across three analysis scenarios, despite describing different aspects of rsfMRI. To check how well different rsfMRI features match each other, we quantified the similarity between them using the identification accuracy score (see Methods). A number of rsfMRI feature pairs showed a high level of match (Fig. 5). The pairs wCC-EC, wPE-RangeEnB, fALFF-LCOR, and MSE-HE were among the most highly matched. The identification accuracy changed when individual characteristics were removed or when rsfMRI features were added to individual characteristics (Fig. 5, panels B, C, and D). Importantly, identification accuracy decreased as the number of subjects increased. This was in contrast to the increase in prediction accuracy (Fig. 2).
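The identification accuracy score is defined in Methods, which is not part of this section. The sketch below therefore assumes a fingerprinting-style definition: for each subject, the ROI map of one feature is correlated with the maps of a second feature from all subjects, and identification succeeds if the highest correlation belongs to the same subject. The function name `identification_accuracy` and the synthetic data are hypothetical.

```python
import numpy as np


def identification_accuracy(A, B):
    """Fraction of subjects whose map in A best matches their own map in B.

    A, B: arrays of shape (subjects, rois) with rows aligned by subject.
    """
    # Z-score each subject's map so a dot product gives a Pearson correlation.
    Az = (A - A.mean(axis=1, keepdims=True)) / A.std(axis=1, keepdims=True)
    Bz = (B - B.mean(axis=1, keepdims=True)) / B.std(axis=1, keepdims=True)
    corr = Az @ Bz.T / A.shape[1]          # corr[i, j] = r(A_i, B_j)
    return float(np.mean(corr.argmax(axis=1) == np.arange(len(A))))


# Synthetic stand-in: two "features" that share a subject-specific pattern.
rng = np.random.default_rng(0)
shared = rng.standard_normal((50, 200))
feat_a = shared + 0.5 * rng.standard_normal((50, 200))
feat_b = shared + 0.5 * rng.standard_normal((50, 200))
unrelated = rng.standard_normal((50, 200))

acc_related = identification_accuracy(feat_a, feat_b)
acc_unrelated = identification_accuracy(feat_a, unrelated)
```

Under this assumed definition, each subject has to be picked out of a larger pool of candidates as the sample grows, which is one plausible reason for the score to decrease with the number of subjects.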