Advancing Tau-PET quantification in Alzheimer’s disease with machine learning: introducing THETA, a novel tau summary measure

Alzheimer’s disease (AD) exhibits spatially heterogeneous 3R/4R tau pathology distributions across participants, making it a challenge to quantify the extent of tau deposition. Utilizing Tau-PET from three independent cohorts, we trained and validated a machine learning model to identify visually positive Tau-PET scans from regional SUVR values and developed a novel summary measure, THETA, that accounts for heterogeneity in tau deposition. The model for identification of tau positivity achieved a balanced test accuracy of 95% and an accuracy of ≥87% on the validation datasets. THETA captured heterogeneity of tau deposition, had a better association with clinical measures, and corresponded better with visual assessments than the temporal meta-region-of-interest Tau-PET quantification methods. Our novel approach aids in the identification of positive Tau-PET scans and provides a quantitative summary measure, THETA, that effectively captures the heterogeneous tau deposition seen in AD.


Introduction
Alzheimer's disease (AD) is characterized by the accumulation of β-amyloid (Aβ) plaques and neurofibrillary tangles (NFTs) in the brain. NFTs are composed of hyperphosphorylated tau proteins, and in a majority of individuals tau progresses along predictable patterns, originating in the transentorhinal cortex and spreading to the limbic system and eventually to the neocortex.
The spread of tau leads to cognitive impairment and dementia [1]. However, evidence from pathology and imaging has shed light on the heterogeneity of tau deposition in AD, suggesting that there could be distinct patterns of tau accumulation across individuals [2][3][4].
Current understanding of AD pathophysiology and neurodegeneration suggests that NFT accumulation is closely correlated with clinical disease progression and precedes clinical symptoms, making tau a promising biomarker for disease diagnosis and clinical trial design [5,6].
Positron emission tomography (PET) imaging is used to visualize and assess tau deposition using radioligands that bind specifically to the paired helical filaments of NFTs, allowing tau pathology to be detected and tracked in vivo [7]. Studies using PET have shown that in preclinical AD, tau deposition is spread throughout several cortical regions and then follows multiple trajectories [3]. The most common quantification methods for Tau-PET utilize meta-regions of interest (meta-ROIs), such as the temporal meta-ROI, or the more recent medial temporal lobe (MTL) and neocortical (NEO) meta-ROIs to stage disease severity [8,9]. These methods ignore the extent of tau outside these meta-ROIs and average the Tau-PET standardized uptake value ratios (SUVRs) over the entire meta-ROI, which underweights any focal deposition of tau in smaller regions within the meta-ROI. In addition to the meta-ROIs, less commonly used quantitative methods, such as the volumes-of-interest voxel-based multiblock barycentric discriminant analysis (MUBADA) [10], have also been used to assess clinical group separation. The visual rating method followed in this study was based on the density and distribution of tau identified by the radiotracer [18F]flortaucipir (Tauvid™), which was recently FDA-approved for AD tau pathology at B3-level (Braak stages V/VI) [11]. The visual assessment criteria consider the focal deposition of tau throughout the brain and could overcome the limitations of the meta-ROI methods.
In this work we set out to test the hypothesis that a machine learning (ML) model can be developed to identify positive Tau-PET scans based on the clinically accepted multirater visual ratings, and that improved quantification methods can be developed to incorporate the heterogeneity in the spatial distribution of tau tracer signals throughout the brain. We further hypothesized that these ML-based tau quantification methods could outperform the currently used meta-ROI quantification methods and provide a more accurate and sensitive quantification of tau deposition that would map better to disease severity.
To test our hypotheses, we designed our study with three aims: 1) develop a machine learning model on a large single-site dataset using regional SUVR values as inputs and visual ratings as targets, and validate the model's performance on two external independent cohorts, 2) compare the performance of our ML model to the temporal, MTL and NEO meta-ROI quantitative methods, and 3) develop a novel summary measure that is more sensitive to clinical disease severity by leveraging the regional heterogeneity captured by our ML model. This study aims to address the limitations of the current quantitative methods for tau deposition in AD by utilizing advanced ML approaches.
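The dilution of focal uptake by meta-ROI averaging can be made concrete with a small numerical sketch. It assumes a volume-weighted mean over the constituent regions; the region list, volumes, SUVR values, and the positivity cutoff are hypothetical and are not taken from this study.

```python
# Illustrative only: regions, volumes, SUVRs, and the cutoff are made-up values.

def meta_roi_suvr(suvr, volume):
    """Volume-weighted mean SUVR across the regions of a meta-ROI."""
    total_volume = sum(volume[r] for r in suvr)
    return sum(suvr[r] * volume[r] for r in suvr) / total_volume

# Hypothetical regional volumes (arbitrary units).
volume = {
    "entorhinal": 2.0,
    "amygdala": 1.5,
    "parahippocampal": 3.0,
    "fusiform": 9.0,
    "inferior_temporal": 11.0,
    "middle_temporal": 12.0,
}

# Background uptake everywhere except one small region with focal signal.
suvr = {region: 1.10 for region in volume}
suvr["entorhinal"] = 1.80

CUTOFF = 1.30  # hypothetical positivity threshold

meta = meta_roi_suvr(suvr, volume)
focal_regions = [r for r, value in suvr.items() if value > CUTOFF]

print(f"meta-ROI SUVR: {meta:.3f} (cutoff {CUTOFF})")
print("regions above cutoff:", focal_regions)
```

In this toy case the entorhinal SUVR is clearly elevated, yet the averaged value stays below the cutoff because the small region carries little weight in the mean.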

Model trained on visual ratings for predicting tau positivity
The regional SUVRs were the inputs to the ML model and the visual classifications were the predicted class (Fig. 1). The model was trained on the Mayo dataset and tested on ADNI and OASIS-3. To validate the model, we conducted multiple runs using different data splits (Fig. 2).
The models' performance was consistent, as indicated by a standard deviation of less than 5% for all metrics (Fig. 2). We then selected the model with the highest F1-score. This best model performed very well in predicting tau status on the Mayo dataset, achieving a balanced accuracy of 98.58% and 95.43% on the Mayo training and testing sets, respectively.
When evaluated on the external datasets, ADNI and OASIS-3, the model achieved a balanced accuracy of 87.74% and 87.03%, respectively. The model identified tau-positive and tau-negative participants with an AUC of 1.00 on the testing set. It also showed very good classification performance on the ADNI external dataset, with an AUC of 0.96. The AUC was slightly lower in the OASIS-3 dataset, at 0.94 (Fig. 2).
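For readers less familiar with the AUC, the following minimal sketch computes it as the probability that a randomly chosen positive scan receives a higher model score than a randomly chosen negative scan (the rank-based, Mann-Whitney formulation). The labels and scores are invented for illustration and do not come from any of the cohorts.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical visual labels (1 = tau-positive) and model scores.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
scores = [0.05, 0.20, 0.35, 0.60, 0.40, 0.70, 0.85, 0.95]

auc = roc_auc(labels, scores)
print(f"AUC = {auc:.4f}")
```

Here 15 of the 16 positive-negative pairs are ordered correctly, so the AUC is 15/16.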

Model performance in comparison to meta-ROI-based assessment for prediction of tau positivity
The meta-ROIs showed very similar performances in classifying tau positivity in the Mayo cohort, with an AUC of 0.99 on the test-set (20%) and 0.94 on the whole dataset (Fig. 2B and Fig. 2C).
The model outperformed all three meta-ROIs when evaluating classification performance on the Mayo dataset, misclassifying 3.67% of tau-positive and 0.48% of tau-negative cases. On the ADNI dataset, the model misclassified 22.17% of the tau-positive and 2.33% of the tau-negative cases and was largely outperformed on tau-positive cases by the temporal meta-ROI, which had a misclassification rate of 6.96% (Table 2). On the OASIS-3 dataset, the model performed best in classifying tau-negative cases, with a misclassification rate of 1.36%, and had the second-best tau-positive misclassification rate of 24.59%, outperformed by the temporal meta-ROI at 18.03%.
Supplementary Tables 1 and 2 provide similar analyses for participants with CI and CU clinical diagnosis.
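The per-class misclassification rates and the balanced accuracies reported for the external cohorts are linked by simple arithmetic: balanced accuracy is the mean of sensitivity and specificity, each of which is 100% minus the corresponding misclassification rate. The short check below uses only the figures quoted in the text; the function name is ours.

```python
def balanced_accuracy_from_error_rates(pos_error_pct, neg_error_pct):
    """Balanced accuracy (%) from tau-positive and tau-negative error rates (%)."""
    sensitivity = 100.0 - pos_error_pct
    specificity = 100.0 - neg_error_pct
    return (sensitivity + specificity) / 2.0

# (tau-positive error %, tau-negative error %, reported balanced accuracy %)
reported = {
    "ADNI": (22.17, 2.33, 87.74),
    "OASIS-3": (24.59, 1.36, 87.03),
}

derived = {}
for cohort, (pos_err, neg_err, stated) in reported.items():
    derived[cohort] = balanced_accuracy_from_error_rates(pos_err, neg_err)
    print(f"{cohort}: derived {derived[cohort]:.3f}%, reported {stated:.2f}%")
```

The derived values agree with the reported balanced accuracies to within rounding of the published percentages.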

Spatial heterogeneity captured by the machine learning model
To assess the spatial heterogeneity captured by the model, we analyzed the SHAP (SHapley Additive exPlanations) [12] summary plots for tau in the different regions of the brain. In participants with tau positivity in the NEO region, the inferior temporal cortex region was the top predictor (Fig. 3). Conversely, in participants with tau positivity in the MTL region (the region well known to be affected by tau deposition), the entorhinal cortex region emerged as a crucial predictor (Fig. 3). The THETA score considers the contribution of all the regional tau SUVRs used to determine a tau-positive or tau-negative scan. Here we illustrate THETA in two sub-populations that highlight tau heterogeneity: discordant and concordant groups. The discordant group consists of cases where the visual rating disagrees with one or more of the meta-ROI classifications, while the concordant group consists of cases where the visual rating and all the meta-ROI classifications agree (Fig. 4).
The THETA score, as described in Equation 2 (section 4.6), was developed to combine different regions based on their contribution to both classification and disease severity, as indicated by the SUVRs. In the tau-positive, meta-ROI-negative discordant cases, where the model contribution is distributed amongst different regions and not focused specifically on meta-ROI regions, the THETA formulation successfully captures the heterogeneous contributions of all the regions, including those with relatively mild signals and similar contributions (Fig. 5A). On the other hand, in tau-positive concordant cases, the hotspot regions that constitute the meta-ROIs are the top predictors in our ML model. In these cases, the THETA formulation maintains the importance of the top regions, thereby preserving the spatial heterogeneity (Fig. 5B).
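The idea behind region-level SHAP attributions can be illustrated with a minimal sketch that enumerates exact Shapley values for a tiny scoring function. This is not the SHAP library and not the model used in this study: the toy scoring function, the three region names, the SUVR values, and the single reference baseline are all assumptions chosen so that every feature ordering can be enumerated.

```python
from itertools import permutations

def shapley_values(predict, x, baseline):
    """Exact Shapley values: average marginal contribution over all orderings.

    Features not yet 'switched on' keep their baseline (reference) value.
    """
    names = list(x)
    phi = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)
        previous = predict(current)
        for name in order:
            current[name] = x[name]
            updated = predict(current)
            phi[name] += updated - previous
            previous = updated
    return {name: total / len(orderings) for name, total in phi.items()}

def toy_model(f):
    # Hypothetical score: two linear terms plus one interaction term.
    return (2.0 * f["entorhinal"]
            + 1.0 * f["inferior_temporal"]
            + 0.5 * f["entorhinal"] * f["precuneus"])

x = {"entorhinal": 1.6, "inferior_temporal": 1.4, "precuneus": 1.2}
baseline = {"entorhinal": 1.0, "inferior_temporal": 1.0, "precuneus": 1.0}

phi = shapley_values(toy_model, x, baseline)
for region, value in sorted(phi.items(), key=lambda kv: -kv[1]):
    print(f"{region}: {value:+.3f}")
print("sum of attributions:", round(sum(phi.values()), 6))
print("score minus baseline score:", round(toy_model(x) - toy_model(baseline), 6))
```

The attributions sum to the difference between the score for the scan and the score at the baseline, which is the additivity property that makes per-region contributions comparable across participants.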
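The concordant and discordant groups used throughout these analyses follow directly from their definitions, as the sketch below shows. The `Scan` record and its field names are hypothetical; each field holds a binary tau status (True = positive) from the visual rating or from one of the three meta-ROI classifications.

```python
from dataclasses import dataclass

@dataclass
class Scan:
    visual: bool    # multirater visual rating (TV)
    temporal: bool  # temporal meta-ROI status
    mtl: bool       # MTL meta-ROI status
    neo: bool       # NEO meta-ROI status

def agreement_group(scan):
    """'concordant' if all three meta-ROIs match the visual rating, else 'discordant'."""
    meta_roi_status = (scan.temporal, scan.mtl, scan.neo)
    if all(status == scan.visual for status in meta_roi_status):
        return "concordant"
    return "discordant"

examples = {
    "TV+, all meta-ROIs positive": Scan(True, True, True, True),
    "TV+, temporal meta-ROI negative": Scan(True, False, True, True),
    "TV-, all meta-ROIs negative": Scan(False, False, False, False),
}
for label, scan in examples.items():
    print(f"{label}: {agreement_group(scan)}")
```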

Figure 5. The average regional THETA scores ranked in ascending order by median value for (A) Mayo discordant cases (TV+; left) and (B) Mayo concordant cases (TV+, M+; right). The discordant cases were visually positive (TV+) and negative on one or more meta-ROIs; the concordant cases were tau-positive (TV+ M+) both visually and on all three meta-ROIs.

Performance of THETA for assessing disease severity
The performance of the tau summary score THETA for disease severity was assessed using two clinical disease severity measures, Mini-Mental State Examination (MMSE) and CDR sum of boxes (CDR-SB).
When correlations were computed across all participants in each cohort, the performance of the THETA score and the meta-ROIs was similar (Fig. 6, OASIS-3 shown in Supplementary Fig. 2).
When looking at the relationship of MMSE to the meta-ROIs and THETA, there was a similar trend of decreasing slope from tau-negative to tau-positive (Fig. 6). However, the THETA score provided a clearer and more distinct separation between tau-positive and tau-negative participants (Fig. 6). This pattern was also observed in the concordant groups (Fig. 7). In contrast, for the discordant groups, THETA demonstrated a negative and significant association with MMSE and a strong positive association with CDR-SB, but the meta-ROIs were not significantly associated with MMSE (Fig. 7). A similar analysis with possible outliers excluded is shown in Supplementary Figure 3.
Furthermore, we compared THETA to the temporal meta-ROI for different clinical diagnostic outcomes and calculated the mean differences between tau-positive and tau-negative cases (Fig. 8). We found that for the AD dementia participants the separation between the tau-positive and tau-negative cases created by the temporal meta-ROI and by THETA was similar in terms of statistical significance across the disease groups. However, for CU and MCI participants there was a clear overlap in tau status for the temporal meta-ROI, whereas the THETA score showed better separation between tau-positive and tau-negative cases (Fig. 8). For instance, in the ADNI cohort, the difference between the tau-positive and tau-negative temporal meta-ROI values for CU and MCI participants had an effect size of 3.08 (t-statistic = 16.50, p < 0.001) and 2.23 (t-statistic = 16.76, p < 0.001), respectively. In contrast, the THETA score showed a much larger effect size of 10.09 (t-statistic = 54.09, p < 0.001) and 6.83 (t-statistic = 51.36, p < 0.001), respectively (Fig.
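The two quantities behind these comparisons, a Spearman rank correlation and an ordinary least squares slope and intercept, can be sketched in a few lines. The paired values below are invented for illustration; in practice a statistics package would also supply p-values, and the fits would be run separately for tau-positive and tau-negative participants.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def spearman_rho(x, y):
    """Spearman's rho is the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def ols_fit(x, y):
    """Slope and intercept of the least-squares line y = slope * x + intercept."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx

# Hypothetical tau summary scores and MMSE values for five participants.
summary_score = [1.0, 1.2, 1.5, 1.9, 2.4]
mmse = [30, 29, 27, 24, 20]

rho = spearman_rho(summary_score, mmse)
slope, intercept = ols_fit(summary_score, mmse)
print(f"Spearman rho = {rho:.3f}, slope = {slope:.2f}, intercept = {intercept:.2f}")
```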

Panel A (Mayo). All correlations are significant (p < 0.05).

Discussion
The progression of tau pathology, as captured by Tau-PET scans, has become a key indicator of disease severity in AD. However, current methods have limitations in addressing the heterogeneity of tau deposition. They focus on a limited number of regions with typically high tau uptake while ignoring the spatial variance of tau burden within these regions. These two limitations hamper the performance of meta-ROI-based methods for accurate detection and quantification of the Tau-PET signal. Using visual assessment by three raters as the gold standard in a large single-site dataset (Mayo), we developed an ML model to accurately classify the status of Tau-PET scans and validated it in two independent datasets (ADNI and OASIS-3). We then utilized the model to develop a novel tau summary measure that considers tau SUVRs across the brain and provides a metric that maps extremely well to disease progression compared to current methods.

Identification of positive Tau-PET scans
The application of deep learning and ML using Tau-PET has become common in recent years, either to improve PET image acquisition [13], to classify spatial patterns [14,15], to study the association between Aβ and Tau-PET scans [16], or to predict pathological tau accumulation from clinical measures [17,18]. ML-based indices have also been introduced, such as Spatial Pattern of Abnormality for Recognition of Early Tauopathy (SPARE-Tau) [19] and Alzheimer's disease resemblance atrophy index (AD-RAI) [20]. SPARE-Tau was trained on tau SUVRs to predict clinical status (CU vs MCI/AD), while AD-RAI was trained on T1-weighted MRI volumetric measures, also to predict clinical status and quantify brain atrophy. Nonetheless, our work is the first to develop and validate an ML model to identify positive Tau-PET scans using regional SUVRs from the entire brain. We validated our ML model with entirely independent datasets comprising different population demographics and data sources. More importantly, our model was able to generalize throughout the entire brain, we incorporated the THETAi values to formulate our summary measure. This measure was derived using SHAP values, which indicated the importance of each individual region. Thus, by utilizing the heterogeneity captured by the ML model, we were able to ensure that THETA captured pattern-based information. This is illustrated by the regional THETA scores for the concordant and discordant subgroups (Fig. 3 and Fig. 5). Furthermore, since the ranges of THETA scores were distinct for the tau-positive and tau-negative cases, we obtained a clearer separation than the temporal meta-ROI between tau-positive and tau-negative participants for the MMSE clinical score (Fig. 6 and Fig. 7) and for the diagnostic groups (Fig. 8).

THETA score for assessing disease severity
Tau is a proximal surrogate of clinical disease severity and Tau-PET has tremendous potential to
THETA can be utilized with ease across multiple clinical studies. The calculation of THETA in a clinical or research setting is similar to the meta-ROI calculation. Once an ML model is trained on the regional SUVRs and is interpreted using the SHAP AI explainer, THETA scores can be generated automatically using our formula. This process can be done for a single participant or a list of participants. The training of an ML model need only be done once and the trained model can be used multiple times, and the training set can constitute cohorts of different demographics as
classifying tau positivity on the Mayo test set and was comparable in ADNI and OASIS-3.
Additionally, the novel summary measure, THETA, was able to better quantify the spatial heterogeneity of tau deposition and provide a more sensitive measure of clinical disease severity.
Overall, the study provides promising results for the use of ML models in improving the detection and quantification of tau pathology in Alzheimer's disease.
We evaluated the performance of THETA on the clinical disease severity measures by calculating correlations using Spearman's ρ and a linear estimation of slope and intercept using ordinary least squares. Lastly, we evaluated the separation between tau-positive and tau-negative cases for the different clinical diagnosis groups using Cohen's d for effect size and performed mean comparisons using two-tailed independent samples t-tests with Bonferroni correction for multiple comparisons.
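The group-separation statistics named above can be sketched as follows. This is a minimal illustration that assumes the pooled-standard-deviation form of Cohen's d and the equal-variance form of the independent samples t-statistic; the sample values are invented, and the p-values passed to the Bonferroni step are placeholders that would normally come from a statistics package.

```python
from math import sqrt
from statistics import mean, variance

def pooled_sd(a, b):
    """Pooled standard deviation of two independent samples."""
    na, nb = len(a), len(b)
    return sqrt(((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2))

def cohens_d(a, b):
    """Standardized mean difference using the pooled standard deviation."""
    return (mean(a) - mean(b)) / pooled_sd(a, b)

def t_statistic(a, b):
    """Equal-variance independent samples t-statistic."""
    na, nb = len(a), len(b)
    return (mean(a) - mean(b)) / (pooled_sd(a, b) * sqrt(1.0 / na + 1.0 / nb))

def bonferroni(p_values):
    """Multiply each p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

# Hypothetical summary scores for tau-positive and tau-negative participants.
tau_pos = [1.9, 2.1, 2.4, 2.0]
tau_neg = [1.1, 1.2, 1.0, 1.3, 1.15]

d = cohens_d(tau_pos, tau_neg)
t = t_statistic(tau_pos, tau_neg)
adjusted = bonferroni([0.01, 0.04, 0.5])  # placeholder p-values
print(f"Cohen's d = {d:.2f}, t = {t:.2f}")
print("Bonferroni-adjusted p-values:", adjusted)
```

Under these assumptions the two statistics differ only by a sample-size factor, t = d / sqrt(1/n1 + 1/n2), which is why a large effect size in a large group yields a very large t-statistic.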
The funders had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.

Figure 1. Study design. First, we trained a machine learning (ML) model using a library of visually assessed scans where the visual rating was used as the ground truth and the SUVRs were the inputs. Second, after training the model we applied the SHAP AI explainer to determine each region's contribution to the predicted visual rating. Lastly, we derived a summary measure, which we call the tau heterogeneity evaluation in Alzheimer's disease (THETA) score, using each participant's SUVR values and corresponding SHAP values.

Figure 2. Model performance for binary classification of tau status based on the visual assessment from the three raters. The model was trained on the Mayo dataset and validated on the external validation sets, ADNI and OASIS-3. The top table shows a summary of the multiple runs conducted using different random splits of the training (80%) and testing (20%) sets. The metrics in the table show the mean (standard deviation). The area under the receiver operating characteristic curve (AUC) of the model in (A) compares its performance in Mayo, ADNI, and OASIS-3, while (B) and (C) illustrate the comparison of the model's performance to the meta-ROI classification schemes in the Mayo testing set and whole dataset, respectively.

Figure 3. Feature importances for cases where tau was positive in the MTL meta-ROI only and in the NEO meta-ROI only. The arrow indicates the importance of the entorhinal region changing its rank depending on the regionality, for TMTL+, TNEO- cases (left) and for TMTL-, TNEO+ cases (right).

Figure 6. Comparison of the meta-ROIs and THETA score to the clinical measures MMSE and CDR-SB. The correlation coefficients are Spearman's ρ and the scatter plot shows the ordinary least squares regression. Similar results for the OASIS-3 cohort are included in Supplementary Figure 2. Tau- and Tau+ labels indicate visual assessment status.

Figure 7. Comparison of the meta-ROIs and THETA to clinical scores MMSE and CDR-SB for the Mayo cohort in the discordant and concordant groups. The discordant group consisted of participants with disagreement between the visual rating and one or more meta-ROIs on the tau status, and the concordant group consisted of participants whose tau status agreed between the visual rating and all three meta-ROI methods. A similar analysis with outliers removed is included in Supplementary Figure 3. Tau- and Tau+ labels indicate visual assessment status.
significantly impact clinical practice and clinical trials. The FDA approved [18F]flortaucipir PET imaging for detecting NFT B3, corresponding to Braak stages V or VI. Hence, effectively quantifying the Tau-PET signal has important implications because it provides a more accurate and sensitive assessment of disease severity. Given that multirater visual assessment is the clinically accepted standard in the field, developing a highly accurate model using this gold standard and utilizing the model characteristics for quantification of the Tau-PET signal has several advantages. This is reflected in the THETA score outperforming the current methods, as observed in Figures 6-8. Additionally, the THETA scores mapped onto cognitive indices comparably to or better than meta-ROI-based methods.

Table 1. Characteristics summary of the study population.

Table 2. Comparison of meta-ROI-based assessments and the machine learning model predictions to the visual ratings when predicting tau positivity.