Prediction of H3K27M-mutant brainstem glioma by amide proton transfer–weighted imaging and its derived radiomics
Zhizheng Zhuo1 · Liying Qu1 · Peng Zhang2 · Yunyun Duan1 · Dan Cheng1 · Xiaolu Xu1 · Ting Sun1 · Jinli Ding1 · Cong Xie1 · Xing Liu3 · Sven Haller4 · Frederik Barkhof5,6 · Liwei Zhang2 · Yaou Liu1

Purpose H3K27M-mutant brainstem glioma (BSG) carries a very poor prognosis. We aimed to predict H3K27M mutation status using amide proton transfer–weighted (APTw) imaging and radiomic features.
Methods Eighty-one BSG patients with APTw imaging at 3T MR and known H3K27M status were retrospectively studied. APTw values (mean, median, and max) and radiomic features within manually delineated 3D tumor masks were extracted. APTw measures were compared between H3K27M-mutant and wildtype groups using the two-sample Student’s t test or Mann–Whitney U test and receiver operating characteristic curve (ROC) analysis. H3K27M-mutant prediction using APTw-derived radiomics was performed with a machine learning algorithm (support vector machine) in randomly selected train (n = 64) and test (n = 17) sets. Sensitivity analyses with additional random splits of train and test sets, 2D tumor masks, and other classifiers were conducted. Finally, a prospective cohort of 29 BSG patients was acquired for validation of the radiomics algorithm.
Results BSG patients with H3K27M-mutant tumors were younger and had higher max APTw values than those with wildtype tumors. APTw-derived radiomic measures reflecting tumor heterogeneity could predict H3K27M mutation status with an accuracy of 0.88, sensitivity of 0.92, and specificity of 0.80 in the test set. Sensitivity analysis confirmed the predictive ability (accuracy range: 0.71–0.94). In the independent prospective validation cohort, the algorithm reached an accuracy of 0.86, sensitivity of 0.88, and specificity of 0.85 for predicting H3K27M mutation status.
Conclusion BSG patients with H3K27M-mutant tumors had higher max APTw values than those with wildtype tumors. APTw-derived radiomics could accurately predict H3K27M-mutant status in BSG patients.


Introduction
Brainstem gliomas (BSGs) are a heterogeneous group of tumors involving the midbrain, pons, and medulla. Genetic characterization can identify BSG patients with a poorer prognosis, namely those harboring a lysine-to-methionine substitution at position 27 of histone H3 (H3K27M) [1], which drives oncogene expression through global depletion of the repressive modification H3 lysine 27 trimethylation (H3K27me3) [2]. Accurate identification of H3K27M status contributes to diagnostic accuracy, improves determination of prognosis and treatment response, and allows identification of potential therapeutic targets [1][2][3].
Invasive stereotactic biopsy can identify H3K27M-mutant status with high sensitivity and specificity, but carries a significant risk of complications [4]. Additionally, the inherent heterogeneity of BSG poses a risk of biopsy sampling bias, while the alternative approach, analysis of cerebrospinal fluid (CSF)-derived tumor DNA, might be inconclusive if the tumor is not directly adjacent to the CSF [3][4][5]. Non-invasive conventional magnetic resonance imaging (MRI) can depict BSG location, morphology, diffusion, perfusion, and metabolic characteristics, but predicts genetic mutation status with insufficient specificity (< 75%) [6][7][8][9]. Therefore, accurate non-invasive determination of H3K27M mutation status in BSG remains a challenge in clinical practice.
Amide proton transfer-weighted (APTw) imaging is a novel MRI technique that generates image contrast based on the endogenous amide protons in mobile cellular proteins and peptides [10], providing a promising avenue to explore the molecular metabolism associated with cellular proliferation and gene expression. To capture tumor heterogeneity, radiomic features from conventional MRI have been successfully used to predict genetic alterations (e.g., isocitrate dehydrogenase [IDH] mutation, H3K27M) in brain tumors [11][12][13][14]. So far, no studies have used APTw and its derived radiomics to predict H3K27M mutation status in BSG.
In this study, we examined whether APTw and APTw-derived radiomic features could characterize BSG metabolic heterogeneity and predict H3K27M mutation status. To this end, a retrospective study was conducted to build and test a model predicting H3K27M mutation status, which was then validated in an independent prospective cohort.

Ethics approval and study design
This study was approved by the Animal and Human Ethics Committee of Beijing Tiantan Hospital, Capital Medical University. All patients or their legal guardians provided written informed consent.
In this study, data from a retrospective cohort were examined to build and test the H3K27M-mutant prediction model, and an independent prospective cohort was then studied to validate the prediction model (Fig. 1).

Retrospective cohort
From December 2018 to June 2020, a total of 87 patients diagnosed with BSG (aged 2-65 years) who had both APTw imaging and available H3K27M status were retrospectively studied (representative patients are shown in Fig. 2A). Exclusion criteria were (1) APTw data acquired after treatment (surgery, radiotherapy, or chemotherapy); (2) poor image quality (Fig. 1).
Fig. 1 Flowchart of the retrospective cohort used to train and test an H3K27M-mutant prediction model and of the prospective cohort used for independent validation in this study

Prospective cohort
From June 2020 to December 2020, 30 patients with clinically and radiologically suspected BSG were prospectively recruited for APTw imaging. Inclusion criteria were (1) no prior treatment history; (2) planned biopsy or surgical resection; (3) planned H3K27M status testing. Exclusion criteria were (1) a pathological diagnosis other than glioma; (2) poor image quality (Fig. 1).

MRI acquisition
MRI acquisition was performed on a 3 T MR scanner (Ingenia CX, Philips Healthcare, Best, the Netherlands) with a 32-channel head receiver coil. The MR protocol included T1w, T2w, T2-FLAIR, diffusion-weighted imaging, and contrast-enhanced T1w. For APTw, a 3D turbo spin echo sequence was used with repetition time = 5864 ms, echo time = 8.8 ms, flip angle = 90°, acquired voxel size = 2 mm × 2 mm × 6 mm, and 7 slices. A saturation pulse was applied at +3.5 ppm (relative to the water resonance frequency) to saturate amide protons, with saturation B1 amplitude = 2 μT and saturation duration = 2 s. Six additional image volumes at critical frequencies (±3.1 ppm, −3.5 ppm, ±3.9 ppm, and −1560 ppm) were acquired for Z-spectrum normalization and interpolation, and three acquisitions were performed at +3.5 ppm with different echo time shifts to obtain a Dixon-type B0 field map to correct for B0 inhomogeneities in the Z-spectrum frequency domain [15]. The parallel acceleration factor (SENSE) was 1.6 and the acquisition time was 1 min 54 s. Details of the conventional pulse sequences can be found in Supplemental Table S1.

APTw image calculation
APTw images were automatically calculated using the embedded post-processing pipeline on the MR console.
An MTRasym analysis (asymmetry with respect to the water frequency) was conducted by a voxel-by-voxel analysis to distinguish the APT signal from background effects (e.g., direct water saturation and magnetization transfer contributions from semi-solid tissue components) [16].
First, the Z-spectrum was aligned per voxel using the B0 field map to correct for B0 inhomogeneity artifacts. Next, asymmetry was calculated by subtracting the positive frequency side signal S[+Δ] from the negative side signal S[−Δ] and normalizing to the unsaturated image signal S0 (see Eq. 1). The resulting MTRasym value at +3.5 ppm is presented as a percentage (relative to S0) in the final APTw images (Eq. 2).
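Eq. 1 and Eq. 2 are referenced above but are not reproduced in this text. From the description (negative-side signal minus positive-side signal, normalized to S0, reported in percent at +3.5 ppm), they presumably take the standard form below. This is a reconstruction under that assumption, not a copy of the original equations.

```latex
\begin{align}
\mathrm{MTR}_{\mathrm{asym}}(\Delta) &= \frac{S[-\Delta] - S[+\Delta]}{S_0} \tag{1} \\
\mathrm{APTw}\ (\%) &= \mathrm{MTR}_{\mathrm{asym}}(+3.5\ \mathrm{ppm}) \times 100\% \tag{2}
\end{align}
```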

Tumor segmentation and radiological analysis of tumor
The solid tumor boundaries (excluding cystic, necrotic, and hemorrhagic areas) were manually delineated on T2w images independently by two radiologists (L.Q and D.C, both with 3 years' experience in neuroradiology), who were blinded to the H3K27M status, with contrast-enhanced T1w and T2-FLAIR as reference, using 3D Slicer software (v4.10.2, https://www.slicer.org/). In case of discrepancy between the readers, the 3D tumor mask was reviewed, modified (if necessary), and confirmed by a senior radiologist (Y.D, with 12 years' experience in neuroradiology). Finally, the overlapping regions of the tumor masks drawn by the two radiologists (average Dice score = 0.87) were used to determine the tumor volume. Using this mask, voxel-wise lesion frequency mapping was performed by normalizing the tumor mask to Montreal Neurological Institute (MNI) coordinates using Statistical Parametric Mapping (SPM12, https://www.fil.ion.ucl.ac.uk/spm/) and dividing the summed binary tumor masks by the total number of enrolled patients (Fig. 2B).
Based on conventional MRI (T1w, T2w, and contrast-enhanced T1w), the tumor location (midbrain, pons, and medulla) and the presence of dorsal exophytic components, hydrocephalus, cystic components, contrast enhancement, and engulfment of the basilar artery were assessed independently by two radiologists (L.Q and D.C); to resolve inconsistencies, a senior radiologist (Y.D) reviewed and determined the final assessments.
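The two mask operations described above (taking the overlap of the two readers' masks, then averaging binary masks across patients) can be sketched as follows. This is a minimal illustration on flattened 0/1 lists, assuming the masks are already in a common space; the function names are hypothetical and it does not reproduce the 3D Slicer or SPM12 processing used in the study.

```python
def overlap_mask(mask_a, mask_b):
    """Voxel-wise intersection of two flattened binary masks (values 0 or 1)."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same number of voxels")
    return [a & b for a, b in zip(mask_a, mask_b)]


def lesion_frequency_map(masks):
    """Fraction of patients whose tumor mask covers each voxel.

    `masks` holds one flattened binary mask per patient, all assumed to be
    normalized to the same template space beforehand.
    """
    if not masks:
        raise ValueError("at least one mask is required")
    n_patients = len(masks)
    n_voxels = len(masks[0])
    if any(len(m) != n_voxels for m in masks):
        raise ValueError("all masks must have the same number of voxels")
    return [sum(m[i] for m in masks) / n_patients for i in range(n_voxels)]


# Toy example: two readers for one patient, then three patients.
reader_1 = [1, 1, 0, 1]
reader_2 = [0, 1, 0, 1]
consensus = overlap_mask(reader_1, reader_2)

cohort = [[1, 1, 0, 0], [0, 1, 0, 0], [0, 1, 1, 0]]
frequency = lesion_frequency_map(cohort)
```

In the toy cohort, the second voxel is covered in every patient and therefore has a frequency of 1.0.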

APTw feature extraction
To extract the APTw measures and radiomic features of BSG, the 3D tumor masks were warped to the APTw image space using the forward transformation parameters derived from the affine registration of the T2w images to the unsaturated images (S0 image). Mean, median, and max APTw values within the 3D tumor masks were extracted using an in-house script in MATLAB 2019a (MathWorks, Natick, MA, USA).
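The summary measures named above can be illustrated with a short sketch. The study used an in-house MATLAB script that is not shown here; the Python function below is a hypothetical stand-in that assumes the APTw image and the tumor mask have already been resampled to the same grid and flattened to lists.

```python
import statistics


def aptw_summary(aptw_values, mask):
    """Mean, median, and max of APTw values (in %) inside a binary mask."""
    if len(aptw_values) != len(mask):
        raise ValueError("image and mask must have the same number of voxels")
    inside = [v for v, m in zip(aptw_values, mask) if m]
    if not inside:
        raise ValueError("mask is empty")
    return {
        "mean": statistics.fmean(inside),
        "median": statistics.median(inside),
        "max": max(inside),
    }


# Toy example: five voxels, three of them inside the tumor mask.
summary = aptw_summary([1.0, 2.0, 3.0, 7.0, 9.0], [0, 1, 1, 1, 0])
```

Only the three masked voxels (2.0, 3.0, 7.0) contribute, so the 9.0 outside the mask does not affect the maximum.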

Histopathological analysis
Tumor grading was determined according to the 2016 World Health Organization Classification of Tumors of the Central Nervous System [17]. H3K27M status (H3K27M-mutant or wildtype) was determined by immunohistochemical staining using a mutation-specific antibody [18].

Statistical analysis
The statistical analysis was performed using SPSS (SPSS for Windows, version 25.0; IBM, Armonk, NY, USA) and the statistics toolbox in MATLAB 2019a (MathWorks, Natick, MA, USA).
Categorical variables are displayed as frequencies and tested using the Chi-square test. Continuous variables are presented as median and interquartile range (IQR) and tested by Student's t test or the Mann-Whitney U test. A p value < 0.05 was deemed statistically significant.
The spatial concordance of the 3D tumor segmentation masks between radiologists was evaluated using the Dice score.
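For reference, the Dice score between two binary masks A and B is 2|A ∩ B| / (|A| + |B|). A minimal sketch on flattened masks follows; the function name and the convention of returning 1.0 for two empty masks are assumptions made for illustration.

```python
def dice_score(mask_a, mask_b):
    """Dice similarity coefficient between two flattened binary masks."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    size_a = sum(1 for a in mask_a if a)
    size_b = sum(1 for b in mask_b if b)
    if size_a + size_b == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / (size_a + size_b)


# Toy example: two 3-voxel masks sharing 2 voxels give 2*2 / (3+3).
example = dice_score([1, 1, 1, 0, 0], [0, 1, 1, 1, 0])
```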
Univariable and multivariable logistic regression models with receiver operating characteristic curve (ROC) analysis were used to evaluate the ability of clinical and APTw measures to discriminate H3K27M-mutant from wildtype status.

Radiomics prediction model development
A support vector machine (SVM) approach embedded in FAE (https://github.com/salan668/FAE) was adopted for the identification of H3K27M-mutant tumors based on radiomic features.
The retrospective cohort was used to build and test an SVM prediction model. Patients were first randomly separated into train (80%, n = 64 [46 H3K27M-mutant and 18 wildtype]) and test (20%, n = 17 [12 H3K27M-mutant and 5 wildtype]) datasets (Table 1). An up-sampling strategy was used to balance the patients with H3K27M-mutant and wildtype tumors in the training set. Z-score transformation was used for feature normalization. Dimensionality was reduced by randomly removing one feature from each pair with high similarity (Pearson's correlation coefficient [PCC] > 0.9) [19], followed by relief-based feature selection [20]. Leave-one-out cross-validation (LOOCV) was used for model parameter optimization based on the training set. The test data were used to evaluate model performance using area under the curve (AUC), accuracy, sensitivity, specificity, positive predictive value, and negative predictive value.
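The normalization and redundancy-removal steps can be sketched in plain Python as below. This is an illustrative approximation, not the FAE implementation: the function names are hypothetical, the absolute value of the PCC is compared with the threshold (an assumption), and the fixed random seed is added only for reproducibility.

```python
import math
import random


def zscore(column):
    """Standardize one feature column to zero mean and unit variance."""
    n = len(column)
    mean = sum(column) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in column) / n)
    if std == 0.0:
        return [0.0] * n  # constant feature carries no information
    return [(x - mean) / std for x in column]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0.0 or sy == 0.0:
        return 0.0
    return cov / (sx * sy)


def drop_redundant(features, threshold=0.9, seed=0):
    """Randomly drop one feature of every pair with |PCC| above threshold.

    `features` maps a feature name to its list of per-patient values.
    Returns the names of the features that remain.
    """
    rng = random.Random(seed)
    kept = list(features)
    changed = True
    while changed:
        changed = False
        for i in range(len(kept)):
            for j in range(i + 1, len(kept)):
                r = pearson(features[kept[i]], features[kept[j]])
                if abs(r) > threshold:
                    kept.remove(rng.choice([kept[i], kept[j]]))
                    changed = True
                    break
            if changed:
                break
    return kept


# Toy example: "a" and "b" are nearly collinear, "c" is only weakly related.
toy = {"a": [1, 2, 3, 4], "b": [2, 4, 6, 8.1], "c": [4, 1, 3, 2]}
kept = drop_redundant(toy)
z = zscore([1.0, 2.0, 3.0])
```

In practice the normalization statistics and the set of retained features would be derived from the training set only and then applied unchanged to the test set, so that no information leaks from the test data.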

Sensitivity analysis and independent validation
Sensitivity analyses were performed to exclude potential effects of (1) unbalanced separation of train and test datasets, by performing 9 additional repeats of the SVM classification procedure; (2) dependency on 3D tumor masks, by repeating the radiomics extraction and SVM classification using a 2D tumor mask from the axial slice with the maximum tumor area; (3) spurious selection of classifiers, by comparing the current SVM findings with other popular classifiers including AdaBoost, autoencoder, logistic regression, least absolute shrinkage and selection operator (lasso) regression, linear discriminant analysis, Gaussian process, naïve Bayes, decision tree, and random forest.
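Repeated random train/test separation, as in item (1), can be sketched as follows. The text states only that patients were randomly separated; the per-class (stratified) splitting used here is an assumption, chosen because an 80% split within each class reproduces the reported 46/18 train and 12/5 test counts. The class totals (58 mutant, 23 wildtype) are the sums of the reported train and test counts.

```python
import random


def stratified_split(labels, train_fraction=0.8, seed=0):
    """Split sample indices into train and test sets, class by class."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_train = int(train_fraction * len(idx))
        train_idx.extend(idx[:n_train])
        test_idx.extend(idx[n_train:])
    return sorted(train_idx), sorted(test_idx)


# 1 = H3K27M-mutant (46 + 12 patients), 0 = wildtype (18 + 5 patients).
labels = [1] * 58 + [0] * 23

# One primary split plus 9 repeats, each with a different seed.
splits = [stratified_split(labels, seed=s) for s in range(10)]
```

Each split would then be passed through the full pipeline (up-sampling, normalization, feature selection, SVM training), and the spread of test accuracies across splits indicates how sensitive the result is to the particular separation.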
The prospective cohort was used as an independent validation dataset to validate the built prediction model.

Demographic and clinical characteristics
For the retrospective cohort, 87 BSG patients were eligible. Two patients were excluded due to imaging acquisition after No difference was observed between H3K27M-mutant and wildtype patients for gender, lesion volume, and other histopathological features and conventional MRI characteristics (e.g., hydrocephalus, cystic component, contrast enhancement) (Table 1).
For the prospective validation cohort, 30 BSG patients were enrolled. One patient was excluded due to poor image quality, leaving 29 patients available for analysis, including 16 H3K27M-mutant (median age 8 years, female/male = 11/5) and 13 wildtype (median age 23 years, female/male = 7/6) cases. Demographics, histopathology, WHO grades, and conventional MRI presentations were largely similar to those in the retrospective cohort (Table 1).

APTw measures in H3K27M-mutant and wildtype patients
Higher max APTw values were observed in H3K27M-mutant tumors (median value = 7.60%) than in wildtype tumors (median value = 6.14%, p = 0.02), while no significant difference in mean or median APTw values was observed between groups (Fig. 3A). However, the max APTw values were insufficient to discriminate H3K27M-mutant from wildtype cases, even when combined with clinical information (accuracy = 0.63) (Fig. 3A).

Prediction of H3K27M status using APTw-derived radiomics
Based on the APTw-derived radiomic features identified using SVM in the retrospective cohort, the prediction of H3K27M-mutant status reached an accuracy of 0.88, sensitivity of 0.92, and specificity of 0.80 in the test set (Fig. 3B and Table 2). The radiomic features with the most significant contributions in the final classifier included 18 features (GLSZM (n = 11), first-order (n = 4), GLDM (n = 1), GLRLM (n = 1), and GLCM (n = 1)) based on wavelet-transformed images, 2 shape features (elongation and flatness) based on original images, 2 features (NGTDM and GLSZM) based on logarithm-transformed images, 2 features (GLDM and first-order) based on square-transformed images, and 1 feature (GLDM) based on exponentially transformed images. When combined with clinical information (e.g., gender and age) and conventional MRI presentations (e.g., tumor location), the prediction ability did not improve further (accuracy of 0.88 in the test dataset).

Sensitivity analysis
Nine additional random separations of the train and test sets demonstrated a comparable prediction ability (accuracy range of 0.71-0.94) with an average accuracy of 0.87. Radiomic features extracted from 2D tumor masks demonstrated a comparable predictive ability with an average accuracy of 0.84. Non-SVM classifiers achieved a comparable ability to predict H3K27M-mutant status, with accuracies ranging from 0.77 to 0.87.

Prospective independent validation
Validation of the prediction model in the independent prospective cohort achieved an accuracy of 0.86, sensitivity of 0.88, specificity of 0.85, and AUC of 0.93 (Fig. 3C).
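As a plausibility check, the reported metrics can be related to confusion-matrix counts. The counts below are not reported in the text; they are inferred as the only integer counts consistent with the cohort sizes (16 mutant and 13 wildtype in the prospective cohort, 12 mutant and 5 wildtype in the retrospective test set) and the rounded metrics, with H3K27M-mutant taken as the positive class.

```python
def classification_metrics(tp, fn, tn, fp):
    """Standard metrics from confusion-matrix counts (positive = mutant)."""
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Inferred counts (assumption): prospective cohort, 16 mutant / 13 wildtype.
prospective = classification_metrics(tp=14, fn=2, tn=11, fp=2)

# Inferred counts (assumption): retrospective test set, 12 mutant / 5 wildtype.
retrospective_test = classification_metrics(tp=11, fn=1, tn=4, fp=1)
```

Under these counts, the prospective cohort gives 25/29 correct overall, 14/16 mutant cases detected, and 11/13 wildtype cases correctly identified, which round to the reported 0.86, 0.88, and 0.85; the test set gives 15/17, 11/12, and 4/5, matching 0.88, 0.92, and 0.80.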

Discussion
In this study, we used APTw imaging and derived radiomic features to predict H3K27M-mutant status among BSG patients. The primary findings were as follows: (1) BSG patients with an H3K27M-mutant tumor presented at a younger age and with a higher max APTw value than those with a wildtype tumor, but the max APTw values showed an insufficient ability (accuracy of 0.63) to predict H3K27M status; (2) APTw-derived radiomic features showed a good ability to predict H3K27M-mutant status, with an accuracy of 0.88, sensitivity of 0.92, and specificity of 0.80 in the test dataset; validation in an independent prospective cohort confirmed the findings with an accuracy of 0.86, sensitivity of 0.88, and specificity of 0.85.
H3K27M mutations have been widely reported in diffuse midline glioma (DMG), especially in pediatric diffuse intrinsic pontine glioma (DIPG) [21,22], and were integrated into the 2016 World Health Organization Classification of Tumors of the Central Nervous System [17]. The younger age of H3K27M-mutant BSG patients in our cohort is consistent with previous reports [6]. Similarly, our findings that H3K27M-mutant tumors were associated with a higher WHO grade, were more frequently located in the pons and presented with engulfment of the basilar artery, and less frequently presented with a dorsal exophytic component are consistent with previous findings [23][24][25].
APTw imaging, as a novel MRI technique, provides semiquantitative amide proton mapping of the brain tumor, which characterizes the heterogeneous metabolism of proteins and peptides and likely reflects histopathological and genetic alterations in glioma [26][27][28][29]. Previous studies on supratentorial gliomas demonstrated that higher APTw values indicated a high level of protein and peptide metabolism and allowed semi-quantification of cellular proliferation, associated with a higher tumor grade and genetic mutation (e.g., IDH mutation) [9,[26][27][28][29]. The higher max APTw values in BSG patients with H3K27M-mutant tumors observed in the current study most likely indicate active tumor cell proliferation associated with the H3K27M mutation, as the global loss of H3K27 methylation results in increased cell proliferation potential [1,7,[27][28][29][30]. In addition, an alkaline intracellular microenvironment may also increase the APTw signal [31][32][33].
BSG patients with either H3K27M-mutant or wildtype tumors presented with a heterogeneous APTw intensity distribution, probably reflecting the known inherent pathological heterogeneity of BSG [2,5]. Therefore, a single APTw value (mean, median, or max) was unable to characterize such heterogeneous tumors. Our study shows that APTw-derived radiomic features (e.g., GLSZM based on wavelet-transformed images) were better able to quantitatively capture the heterogeneity of image intensities across multiple image scales and to provide more accurate imaging biomarkers for predicting H3K27M-mutant status.
The prediction accuracy of the single APTw MRI modality (acquisition time under 2 min) was 0.88 in the retrospective test set and 0.86 in the prospective cohort, which was comparable or superior to that of previous radiomic studies in diffuse midline glioma and BSG based on conventional multimodal MRI (accuracy < 0.85 with FLAIR, T2w, T1w, and/or contrast-enhanced T1w) [11,12]. These findings support the hypothesis that APTw-derived radiomic features provide novel radiological biomarkers to help BSG staging and might contribute to the improvement of patient management and prognosis prediction. Their potential clinical value in accurate diagnosis, treatment decisions, prognosis prediction, and evaluation of new therapeutic targets warrants further clinical validation.
The results of the sensitivity analysis support the robustness of the SVM-based prediction models we developed. Random separation of train and test datasets, different schemes of feature extraction (3D vs. 2D mask), and different classifiers achieved prediction performances comparable to the primary results, which indicates that our model robustly reflects intrinsic metabolic characteristics (as represented by APTw) in BSG and is predictive of H3K27M mutation status.
The strength of this study was the use of a single APTw pulse sequence and its derived radiomics to predict H3K27M-mutant status in BSG patients in both retrospective and prospective cohorts with a large sample size (110 BSG patients), but the current study still has some limitations. First, our study was a cross-sectional single-center study using APTw images from one MRI scanner. Further studies using data from other scanners in a multicenter setting are required to validate the current findings and confirm their generalizability. Second, overall survival and treatment effect data for the BSG patients were incompletely available; further study is needed to evaluate the ability of APTw imaging to predict BSG prognosis and treatment effect. Lastly, conventional machine learning methods were used, while deep learning (requiring a larger sample) may be able to achieve better performance.

Conclusion
BSG patients with H3K27M-mutant tumors present with higher max APTw values than those with wildtype tumors. APTw-derived radiomic features show a good ability to predict H3K27M-mutant status, providing novel MRI markers for non-invasive evaluation of genetic alterations in BSG.