Qualitative MR features to identify non-enhancing tumors within glioblastoma’s T2-FLAIR hyperintense lesions

To identify qualitative MRI features of non-(contrast)-enhancing tumor (nCET) in glioblastoma’s T2-FLAIR hyperintense lesion. Thirty-three histologically confirmed glioblastoma patients for whom T1-, T2- and contrast-enhanced T1-weighted MRI and 11C-methionine positron emission tomography (Met-PET) were available were included in this study. Met-PET was utilized as a surrogate for tumor burden. Imaging features for identifying nCET were searched for by qualitative examination of 156 targets. A new scoring system to identify nCET was established and validated by two independent observers. Three imaging features were found helpful for identifying nCET: “Bulky gray matter involvement,” “Around the rim of contrast-enhancement (Around-rim),” and “High-intensity on T1WI and low-intensity on T2WI (HighT1LowT2),” resulting in an nCET score = 2 × Bulky gray matter involvement – 2 × Around-rim + HighT1LowT2 + 2. The nCET score’s classification performances for the two independent observers, measured by AUC, were 0.78 and 0.80, with sensitivities and specificities using a threshold of four being 0.443 and 0.771, and 0.916 and 0.768, respectively. The weighted kappa coefficient for the nCET score was 0.946. The current investigation demonstrated that qualitative assessments of glioblastoma’s MRI might help identify nCET in T2/FLAIR high-intensity lesions. The novel nCET score is expected to aid in expanding treatment targets within the T2/FLAIR high-intensity lesions.


Introduction
The extent of tumor resection is one of the most crucial factors determining a glioblastoma patient's prognosis [1-3]. While the contrast-enhanced lesion on magnetic resonance imaging (MRI) has long been the primary surgical treatment target, it is also known that tumor cells extend beyond the contrast-enhanced lesion into the surrounding brain tissue and that some high-intensity regions on T2-weighted or fluid-attenuated inversion recovery images (T2/FLAIR) contain as many tumor cells as the contrast-enhanced lesion [4]. These lesions are often referred to as "non-enhancing lesions" or "non-(contrast)-enhancing tumor (NET or nCET)," and they are gaining attention as a way to improve treatment outcomes, including attempts to include them in surgical targets [2,5]. Thus, developing a method that can reliably identify nCET within the high-intensity lesion on T2/FLAIR is crucial. However, various attempts, including those using machine learning [6], have not succeeded in developing such technology. This challenge can be attributed to difficulties in generalizing diagnostic models built on in-house datasets to real-world data [7], and more research is necessary to bring machine learning-based tumor segmentation into bedside clinics. On the other hand, several qualitative MRI features are reported to help localize nCET [8-10]. However, those reports only examined the correlation between MRI features and patient prognosis, with little histological evidence. One of the radiological methods that enables direct tumor burden measurement in the brain is amino acid positron emission tomography (PET), such as that using 11C-methionine as the tracer. There is a fair amount of evidence in the literature that the 11C-methionine positron emission tomography (Met-PET) uptake ratio correlates with the magnitude of tumor burden within the brain for glioblastoma, verified by an ample number of image-guided tissue biopsies [4, 11-14]. The present research aimed to identify qualitative MRI features of nCET by comparing them with Met-PET, utilizing Met-PET as a surrogate imaging technique for tumor burden beyond contrast-enhanced lesions [11-13, 15-17]. The goal of the study is to establish a clinically feasible method that enables physicians to identify nCET using only conventional structural MRI.

Patient cohort
Thirty-three histologically confirmed glioblastoma patients were included in the primary analysis. All patients underwent T1-weighted (T1WI), T2-weighted (T2WI), and gadolinium-enhanced T1-weighted imaging (Gd-T1WI) and Met-PET as presurgical image workup. Furthermore, another four histologically confirmed glioblastoma patients who did not undergo Met-PET were used for image reading practice purposes. The pathological diagnosis was based on the 2016 World Health Organization classification for central nervous system tumors. The local institutional review board waived written informed consent from patients for using clinical data for the present research (Study number: 21041).

11C-Methionine positron emission tomography (Met-PET) acquisition
PET studies were performed using an Eminence-G system (Shimadzu, Kyoto, Japan), with 11C-methionine synthesized according to the method described by Hatakeyama et al. [18] and injected intravenously at a dose of 3 MBq/kg body weight. Tracer accumulation was recorded in trans-axial sections over the entire brain, and the summed activity (the standard uptake value: SUV) from 20 to 32 min after tracer injection was used for image reconstruction. Images were stored in 256 by 256 by 59 or 99 anisotropic voxels, with each voxel being 1 by 1 by 2.6 mm. The entire SUV image was then divided by the SUV of the normal-appearing gray matter contralateral to the lesion to obtain a tumor-to-normal tissue ratio (T/N ratio) map. An area of high cell density was defined as those voxels presenting a T/N ratio > 1.5. This cut-off was derived from previous publications showing that a T/N ratio of 1.5 was roughly equivalent to tissues with a cell density of 2000 cells/mm². As the cell density of healthy brain tissue ranges from 382 to 1106 cells/mm², this cut-off was considered the most appropriate for defining regions carrying a high tumor load with confidence [4,12,13,15].
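The normalization and thresholding described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the use of the mean SUV inside a reference mask, and the tiny toy volume are all assumptions made for the example.

```python
import numpy as np

def tn_ratio_map(suv, reference_mask):
    """Divide an SUV volume by a reference SUV to get a T/N ratio map.

    Assumption: the reference SUV is the mean SUV inside a mask drawn on
    normal-appearing gray matter contralateral to the lesion.
    """
    reference_suv = float(suv[reference_mask].mean())
    if reference_suv <= 0:
        raise ValueError("reference SUV must be positive")
    return suv / reference_suv

def high_cell_density_mask(tn_map, cutoff=1.5):
    """Voxels whose T/N ratio exceeds the cut-off (1.5 in the text)."""
    return tn_map > cutoff

# Toy 2 x 2 x 1 volume (made-up numbers, not patient data).
suv = np.array([[[1.0], [2.0]], [[3.0], [4.0]]])
reference_mask = np.zeros(suv.shape, dtype=bool)
reference_mask[0, 0, 0] = True
reference_mask[0, 1, 0] = True  # reference mean SUV = (1.0 + 2.0) / 2 = 1.5

tn_map = tn_ratio_map(suv, reference_mask)
high_mask = high_cell_density_mask(tn_map)
```

In this toy volume the two voxels with SUV 3.0 and 4.0 have T/N ratios of 2.0 and about 2.67, so they are the only ones flagged as high cell density.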

Image co-registration and target selection
T1WI, Gd-T1WI, and Met-PET were co-registered to T2WI by Vinci image-analyzing software (Max Planck Institute for Neurological Research Cologne; http://www.nf.mpg.de/vinci/, accessed on January 4, 2022) using a normalized mutual information algorithm [19]. We did not use a "mask-based region-of-interest" but instead focused on the diagnostic accuracy of randomly selected targets for assessing nCET, which allowed us to evaluate pinpoint diagnostic accuracy. Investigator 1 (SY) randomly selected the targets within the T2 high-intensity area outside the contrast-enhanced lesion (Fig. 1a). Each target consisted of one voxel, and 156 targets were selected.
Fig. 1 The analytical scheme of this study is presented. Targets were randomly selected by investigator 1 from the area depicted by subtracting the contrast-enhancing region from the T2-FLAIR hyperintense lesion (a). Three types of qualitative image characteristics were evaluated. Detailed definitions are described in the "Materials and methods" section

Qualitative evaluation of targets' imaging features on MRI
In this study, we focused on three imaging features, respecting but expanding the original concepts of previous reports [10,20]. They are defined as "Bulky gray matter involvement," "Around the rim of contrast-enhancement (Around-rim)," and "High-intensity on T1WI and low-intensity on T2WI (HighT1LowT2)" (Fig. 1b), with several examples demonstrated in Fig. 2 and Supplementary Figs. 1-3.
Fig. 2 These figures demonstrate positive and negative findings of the "Bulky gray matter involvement," "Around-rim" and "HighT1LowT2" features. Larger images are provided in the supplementary figures. Of note, a semi-quantitative approach was introduced in this research to make the HighT1LowT2 feature more reproducible; the image voxel values of the "target" and "reference" are displayed for this feature

Bulky gray matter involvement
This feature was evaluated on T2WI and was considered positive if the target of interest involved gray matter or had a convex shape pushing out surrounding tissues. This imaging feature derives from and expands the nCET MRI features reported as "gray matter involvement" and "focal parenchymal expansion" by Lasocki et al. [10]. "Bulky gray matter involvement positive" lesions in Fig. 2 and Supplementary Fig. 1 all exhibit either a focal expansion or involvement of the gray matter, such as the basal ganglia or the mesial temporal cortex, while "Bulky gray matter involvement negative" lesions are T2WI high-intensity lesions expanding into the white matter with little gray matter involvement.

Around the rim of contrast-enhancement (Around-rim)
This feature was evaluated on Gd-T1WI and was considered positive if the target of interest was located immediately outside the rim of contrast-enhancement that demarcates the contrast-enhancing lesion. This feature was incorporated into our scoring system with reference to a report showing that glioblastomas with a relatively narrow pericystic rim show limited infiltration into the surrounding neuropil [20]. Several examples are demonstrated in Fig. 2 and Supplementary Fig. 2.

High-intensity on T1WI and low-intensity on T2WI (HighT1LowT2)
This imaging feature derives from "the mild T2/FLAIR hyperintensity" documented by Lasocki et al. [10]. A semi-quantitative approach was introduced in this research to make this imaging feature more reproducible. Two representative reference points were selected on the MRI for each patient. The first point was selected within the contrast-enhanced lesion with reference to Gd-T1WI (Ref_1 in Fig. 1b), and the second point within the T2 high-intensity area without the "Bulky gray matter involvement" feature (Ref_2 in Fig. 1b). If the T2WI value of the second point was lower than that of the first, reselecting a new first or second point was allowed until the T2WI value of the second point was higher than that of the first. The "High-intensity on T1WI and low-intensity on T2WI (HighT1LowT2)" feature was considered positive if the target's value on T1WI was higher than the mean value of the two reference points on T1WI and the target's value on T2WI was lower than the mean value of the two reference points on T2WI (Fig. 1b). Several examples are demonstrated in Fig. 2 and Supplementary Fig. 3.
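The semi-quantitative HighT1LowT2 rule can be written as a short decision function. The sketch below is illustrative only: the function name, the (T1WI, T2WI) tuple layout, and the intensity values are assumptions, and treating equal T2WI values at the two reference points as a case for reselection is also an assumption, since the text only covers the "lower" and "higher" cases.

```python
def high_t1_low_t2(target, ref_1, ref_2):
    """Return True when the HighT1LowT2 feature is positive.

    Each argument is a (T1WI value, T2WI value) pair read at one voxel:
    target = voxel of interest,
    ref_1  = point inside the contrast-enhanced lesion,
    ref_2  = point in the T2 high-intensity area without bulky gray
             matter involvement.
    """
    target_t1, target_t2 = target
    ref_1_t1, ref_1_t2 = ref_1
    ref_2_t1, ref_2_t2 = ref_2

    # Ref_2 must be brighter than Ref_1 on T2WI; otherwise the reader
    # is expected to reselect the reference points.
    if ref_2_t2 <= ref_1_t2:
        raise ValueError("reselect reference points: Ref_2 must exceed Ref_1 on T2WI")

    mean_t1 = (ref_1_t1 + ref_2_t1) / 2
    mean_t2 = (ref_1_t2 + ref_2_t2) / 2
    return target_t1 > mean_t1 and target_t2 < mean_t2

# Made-up intensities: reference means are 350 (T1WI) and 750 (T2WI).
ref_1 = (400, 600)
ref_2 = (300, 900)
positive_example = high_t1_low_t2((380, 700), ref_1, ref_2)
negative_example = high_t1_low_t2((320, 800), ref_1, ref_2)
```

With these toy values the first target is brighter than the T1WI mean and darker than the T2WI mean, so it is positive; the second target fails both comparisons and is negative.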
The three imaging features mentioned above were evaluated by investigator 1 (SY), followed by a multiple linear regression analysis using each imaging feature as an independent variable and the T/N ratio of Met-PET as the dependent variable. This process allowed us to determine the coefficient of each imaging feature that correlated with the T/N ratio of Met-PET. The obtained coefficients were rounded to integers, which were used to construct a scoring system to identify nCET by qualitative assessment of MRI (Fig. 1d). The scoring system was named "nCET score," which will be used further throughout the manuscript. The classification performance of the scoring system was evaluated by the area under the receiver operating characteristic curve (AUC), and the best cut-off value was calculated using the Youden index.
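The "fit, then round" step can be illustrated with ordinary least squares. This is a sketch under stated assumptions: the helper name is hypothetical, NumPy's `lstsq` stands in for whatever statistics package was actually used, and the response values are generated from arbitrary made-up coefficients (they are not Met-PET measurements and not the coefficients obtained in the study).

```python
import itertools
import numpy as np

def fit_rounded_coefficients(features, response):
    """Least-squares fit with an intercept, plus integer-rounded coefficients.

    features: (n_targets, n_features) array of 0/1 feature ratings.
    response: (n_targets,) array of the dependent variable.
    Returns (raw, rounded); index 0 of each is the intercept.
    """
    design = np.column_stack([np.ones(len(features)), features])
    raw, *_ = np.linalg.lstsq(design, response, rcond=None)
    return raw, np.rint(raw).astype(int)

# Toy design: every combination of three binary features.
features = np.array(list(itertools.product([0, 1], repeat=3)), dtype=float)

# Arbitrary illustrative coefficients (intercept first); NOT study values.
made_up = np.array([0.9, 1.4, -0.6, 2.7])
response = made_up[0] + features @ made_up[1:]

raw, rounded = fit_rounded_coefficients(features, response)
```

Because the toy response is noise-free, the fit recovers the made-up coefficients, and rounding turns them into the integer weights 1, 1, -1 and 3. With real ratings the rounded weights would define the score in the same way.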

Assessment of classification accuracy using the new score
Next, two experienced surgeons (investigator 2: YO and investigator 3: HA) were instructed by investigator 1 (SY) on the three imaging features mentioned above and trained themselves using MRI from four additional patients who were not included in the primary analysis. Subsequently, investigators 2 and 3 independently assessed all the prespecified targets on MRI, blinded to the magnitude of the Met-PET T/N ratio and the scoring system. The assessment results from investigators 2 and 3 were used to validate the diagnostic performance of the nCET score, along with the evaluation of interrater agreement by Cohen's kappa coefficient.
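For readers unfamiliar with the agreement statistic, the following sketch computes Cohen's kappa and a weighted variant from two raters' labels. The function is a generic textbook formulation, not the authors' code; the weighting scheme used for the nCET score is not stated in the text, so the "linear" and "quadratic" options here are assumptions, and the ratings are made-up.

```python
import numpy as np

def cohen_kappa(rater_a, rater_b, categories, weights=None):
    """Cohen's kappa for two raters.

    categories: ordered list of all possible labels.
    weights: None (unweighted), "linear", or "quadratic"; the weighted
    forms penalize disagreements by their distance in category order.
    """
    index = {c: i for i, c in enumerate(categories)}
    k = len(categories)
    observed = np.zeros((k, k))
    for a, b in zip(rater_a, rater_b):
        observed[index[a], index[b]] += 1
    observed /= observed.sum()
    # Agreement expected by chance from the two raters' marginals.
    expected = np.outer(observed.sum(axis=1), observed.sum(axis=0))

    i, j = np.indices((k, k))
    if weights is None:
        disagreement = (i != j).astype(float)
    elif weights == "linear":
        disagreement = np.abs(i - j).astype(float)
    elif weights == "quadratic":
        disagreement = ((i - j) ** 2).astype(float)
    else:
        raise ValueError("weights must be None, 'linear' or 'quadratic'")

    return 1.0 - (disagreement * observed).sum() / (disagreement * expected).sum()

# Made-up binary feature ratings from two readers.
kappa_feature = cohen_kappa([1, 1, 0, 0], [1, 0, 0, 0], categories=[0, 1])

# Made-up, identical score ratings on the 0-5 scale.
kappa_score = cohen_kappa([0, 2, 4, 5], [0, 2, 4, 5],
                          categories=[0, 1, 2, 3, 4, 5], weights="linear")
```

In the binary toy case the readers agree on three of four targets while chance agreement is 0.5, giving a kappa of 0.5; identical ratings give a weighted kappa of 1.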

Statistical analysis
The Pearson correlation coefficient was calculated between the predicted Met-PET T/N ratio or nCET score and the actual Met-PET T/N ratio. Furthermore, differences in the actual Met-PET T/N ratio as a function of nCET score were statistically assessed by one-way analysis of variance (ANOVA) followed by a post hoc analysis using Tukey's honest significant difference (HSD) test.
The area under the curve (AUC) for predicting nCET using the nCET scoring system was 0.79 (Fig. 3b). The Youden index revealed the best threshold of the scoring system to be 4. Thus, regions with nCET scores ≥ 4 would suggest nCET, while regions with nCET scores ≤ 3 would suggest non-nCET. The sensitivity and specificity were 0.771 and 0.747, respectively.
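The score and its threshold-based use can be sketched in a few lines. The score formula is the one stated in the abstract (nCET score = 2 × Bulky gray matter involvement – 2 × Around-rim + HighT1LowT2 + 2); everything else, including the function names and the toy ratings and labels, is an assumption made for illustration and does not reproduce the study data.

```python
def ncet_score(bulky_gm, around_rim, high_t1_low_t2):
    """nCET score from three binary features; ranges from 0 to 5."""
    return 2 * int(bulky_gm) - 2 * int(around_rim) + int(high_t1_low_t2) + 2

def sensitivity_specificity(scores, is_ncet, threshold):
    """Call a target nCET when its score is >= threshold."""
    tp = sum(s >= threshold and y for s, y in zip(scores, is_ncet))
    fn = sum(s < threshold and y for s, y in zip(scores, is_ncet))
    tn = sum(s < threshold and not y for s, y in zip(scores, is_ncet))
    fp = sum(s >= threshold and not y for s, y in zip(scores, is_ncet))
    return tp / (tp + fn), tn / (tn + fp)

def best_youden_threshold(scores, is_ncet):
    """Threshold maximizing Youden's J = sensitivity + specificity - 1."""
    def youden_j(threshold):
        sens, spec = sensitivity_specificity(scores, is_ncet, threshold)
        return sens + spec - 1
    return max(sorted(set(scores)), key=youden_j)

# Made-up scores and reference labels (True = T/N ratio > 1.5).
scores = [0, 1, 2, 2, 3, 3, 4, 4, 5, 5]
is_ncet = [False, False, False, False, False, True, True, False, True, True]

sens_at_4, spec_at_4 = sensitivity_specificity(scores, is_ncet, threshold=4)
toy_best = best_youden_threshold(scores, is_ncet)
```

On this toy set a threshold of 4 gives a sensitivity of 0.75 and a specificity of 5/6, while the Youden-optimal threshold happens to be 3; the threshold of 4 reported in the text comes from the study data, not from this example.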

Independent validation of the scoring system
The AUCs for discriminating nCET from non-nCET using the nCET scoring system by investigators 2 and 3 were 0.78 and 0.80, respectively (Fig. 4a, b), and the weighted kappa coefficient for the nCET score was 0.95 (Fig. 4c). The nCET score assessed by investigators 2 and 3 and the actual Met-PET T/N ratio correlated significantly (r = 0.52 and 0.59, p < 0.0001, Fig. 4d, e). Using a threshold of 4 to identify nCET defined by a Met-PET T/N ratio > 1.5, the sensitivities of investigators 2 and 3 were 0.44 and 0.77, and the specificities were 0.92 and 0.77, respectively. Regarding the interrater agreement, Cohen's kappa coefficients for "Bulky gray matter involvement," "Around-rim," and "HighT1LowT2" were 0.73, 0.96, and 0.80, respectively.

The correlation of Met-PET T/N ratio and nCET score
The nCET score and the Met-PET T/N ratio showed a positive correlation for nCET score ≥ 2 (Fig. 5). Nearly all VOIs with nCET score ≤ 1 showed a Met-PET T/N ratio of less than 1.5, and more than 75% of the VOIs with nCET score = 2 showed a Met-PET T/N ratio of less than 1.5. In contrast, the probability of the VOI having a Met-PET T/N ratio ≥ 1.5 was more than 50% when the nCET score of a VOI was ≥ 4. Notably, these results were consistent among expert neurosurgeons.

Discussion
The primary resection target was historically the contrast-enhanced lesion, and the relationship between the resection rate and prognosis was enthusiastically investigated [1,3]. However, neurosurgeons have been shifting their focus to extending the target for resection into the T2/FLAIR high-intensity lesions, referred to as "supratotal resection" [2,21]. The key difference between these two lesions is that the contrast-enhanced lesion consists almost entirely of tumor cells and necrosis. In contrast, the T2/FLAIR high-intensity lesions are composed of a mixture of tumor and normal brain tissue [20], requiring complicated and difficult decisions to achieve optimal supratotal resection.
This research demonstrated that qualitative assessments of glioblastoma's MRI might be valuable for estimating tumor burden in high-intensity lesions on T2/FLAIR. Moreover, the current research is unique in that Met-PET was used as a reliable tumor burden surrogate to validate the robustness of the tested qualitative MRI features for identifying nCET. This approach removes the need for biopsied tissue samples, which are usually used as ground truth in similar research. This finding should benefit glioblastoma patient care. While some qualitative features that suggest locations with high tumor burden were already reported, there was no information comparing the magnitude of tumor burden and qualitative imaging [8-10]. The newly developed nCET score is expected to help neuroradiologists, neurosurgeons and radiation oncologists determine surgical or radiation targets within T2/FLAIR high-intensity lesions, hopefully leading to better treatment outcomes for glioblastoma patients.
We and others have attempted to estimate the magnitude of tumor burden in T2/FLAIR high-intensity lesions using quantitative measurement and, further, using machine learning or deep learning [6, 11-13, 15-17, 22, 23]. However, these techniques have yet to gain popularity in clinical settings, primarily due to a lack of easy access. Some qualitative image features have been reported to overcome this issue by taking advantage of their easy application in clinical practice [8-10]. A review article by Lasocki et al. summarizes the potential qualitative MRI features useful for identifying nCET and cites the following four key features: gray matter involvement, eccentric shape, relatively mild T2/FLAIR hyperintensity, and focal parenchymal expansion [10]. Among these four features, we considered "the mild T2/FLAIR hyperintensity" and "the focal parenchymal expansion" the most important and incorporated them into the nCET score, as the former is histologically verified [24] and the latter is reported to correlate with better prognosis [25]. On the other hand, interrater agreement is a crucial factor in determining the reliability of qualitative indicators. The Cohen's kappa coefficients for each of the three features in this study were over 0.7, and the weighted kappa coefficient for the nCET score was 0.95. These results suggest that well-trained neurosurgeons could easily and reliably adopt our proposed MRI features. However, it should be noted that the sensitivity and specificity for identifying nCET differed between the two observers (Fig. 4a, b), with the threshold for the nCET score being set as 4 according to the ROC curves of the two independent observers. This observation implies that "hard to know" cases will remain as such even if our proposed qualitative image assessment is successfully incorporated.
While the "Bulky gray matter involvement" feature was already reported in a previous report [26], the "Around-rim" feature could also represent the magnitude of glioblastoma invasion of the brain. It is reported that glioblastomas with a relatively narrow pericystic rim show limited infiltration into the surrounding neuropil [20]. This observation is consistent with our result showing a lower Met-PET T/N ratio in the area with a positive "Around-rim" feature. We also included a new image feature named "HighT1LowT2", deriving from our previous observation that brain tissues with high tumor burden show a more normal-appearing MRI intensity than those mainly composed of edematous change [27,28]. Here, it is noteworthy that each feature correlated significantly with the Met-PET T/N ratio, and combining these features improved the prediction accuracy of T2/FLAIR high-intensity lesions with high tumor burden.
This study has several limitations and issues to be discussed. First, although the nCET score was assessed by two neurosurgeons in different institutions, an extensive amount of data from various institutions is required for further external validation of our findings. Second, Met-PET was used as the ground truth representing nCET. Although Met-PET is reported as one of the most accurate imaging modalities for this purpose [11-13, 15-17, 29], image-guided stereotactic tissue sampling is the ideal method to provide definitive evidence. Finally, this investigation consisted of one investigator exploring imaging features, accompanied by two different readers for validation. As with any kind of qualitative image assessment, a generalized protocol for reading the imaging features must be further established in a critical manner with external validation.

Fig. 3 The correlation between the predicted and actual methionine PET T/N ratio assessed by investigator 1 is presented. A linear regression model shows a significant correlation between the two (a). The linear regression model's performance for classifying high and low methionine PET T/N is shown (b)

Fig. 5 Differences in methionine PET T/N ratio among each nCET score are shown. Black and red asterisks indicate significant differences between nCET scores scored by investigators 2 and 3, respectively