The study of human brown adipose tissue has increased substantially over the last decade. There is great interest in assessing interventions intended to increase the magnitude or duration of BAT activity, which requires a standardized process for measuring BAT activity in humans. Several factors are highly important in measuring BAT activity in humans, including the choice of activation method, the consistency of ambient outdoor temperature, and the imaging modality used for BAT assessment22,23. A major step in the assessment of BAT activity is image segmentation, which includes selecting appropriate region thresholds. This is especially true for FDG PET/CT, currently the gold standard in human BAT research17. However, since optimal BAT image analysis thresholds are not known, currently available data on human BAT volume and activity are somewhat speculative, and comparability across studies is impeded by the use of different HU and SUV thresholds. In the present study, we compared the BARCIST 1.0 BAT threshold recommendations with several other published thresholds used for BAT volume segmentation, as well as with the PERCIST 1.0 methods. We assessed activated BAT volumes using these methods on 56 FDG PET/CT images of young, healthy participants. We also performed a repeatability analysis of each method and additionally evaluated the images using a visual score.
We found a high degree of variability among thresholding methods, with the highest and lowest mean BAT volumes (obtained using SUV1.0HU−180:−10 and SULPERBaHU−190:−10, respectively) differing by nearly 700%. PERCIST is not typically employed for BAT segmentation, but the variability is approximately the same between the 2 published BAT thresholds showing the highest and lowest BAT volumes, SUV1.0HU−180:−10 and SUV2.0HU−250:−50. Volumes of BAT reported in the literature vary greatly, even among studies enrolling similar populations and utilizing similar BAT activation protocols. For example, Hoeke et al and van der Lans et al24,25 both studied a population with a mean age of approximately 23 years and used an individualized cooling approach following a cold acclimation protocol, and yet the reported mean BAT volumes differed by a factor of 3. There are several methodological issues that can impact volume differences, but segmentation/thresholding technique is likely a contributing factor17.
SUV Threshold. A general trend was seen of a lower SUV cutoff resulting in higher BAT volumes. This effect was compounded by a lack of adjustment for lean body mass with SUV1.0HU−180:−10 and SUV1.0HU−100:−10. These methods likely incorporated normal tissue, which is evident from the high visual assessment scores indicating an overestimation of BAT volume. The BARCIST 1.0 threshold combination was the only strongly repeatable method that also showed a visual assessment score near 0. The PERCIST follow-up threshold, combined with the BARCIST HU range, also displayed a strong visual assessment score, but was found to be poorly repeatable and showed a high amount of variance. A major strength of the BARCIST 1.0 method is an adjustment for lean body mass, which helps to individualize the SUV cutoff. This was also shown by Martinez-Tellez et al, who compared the effect of four threshold combinations on BAT activity quantification18. They found that the relative changes in BAT volumes were primarily influenced by the use of an SUV cutoff corrected for lean body mass.
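The way a lean-body-mass adjustment individualizes the SUV cutoff can be sketched in a few lines. This is a minimal illustration, not the study's implementation: it assumes the standard relation SUL = SUV × (lean body mass / body weight), takes lean body mass as a given input rather than estimating it, and uses a hypothetical function name and illustrative numbers that are not drawn from the study data.

```python
def individualized_suv_cutoff(sul_cutoff, body_weight_kg, lean_body_mass_kg):
    """Convert a lean-body-mass-normalized cutoff (SUL) into the equivalent
    body-weight-normalized SUV cutoff for one participant."""
    if body_weight_kg <= 0 or lean_body_mass_kg <= 0:
        raise ValueError("body weight and lean body mass must be positive")
    if lean_body_mass_kg > body_weight_kg:
        raise ValueError("lean body mass cannot exceed body weight")
    # SUL = SUV * (LBM / BW), so the matching SUV cutoff is SUL * (BW / LBM).
    return sul_cutoff * body_weight_kg / lean_body_mass_kg


# Illustrative values only (assumptions, not taken from the study).
SUL_CUTOFF = 1.2
leaner = individualized_suv_cutoff(SUL_CUTOFF, body_weight_kg=70.0, lean_body_mass_kg=60.0)
heavier = individualized_suv_cutoff(SUL_CUTOFF, body_weight_kg=90.0, lean_body_mass_kg=60.0)
print(f"{leaner:.2f} {heavier:.2f}")
```

Under these assumptions, two participants with the same lean body mass but different body weights receive different SUV cutoffs, whereas a fixed cutoff such as SUV 1.0 applies the same value to everyone regardless of body composition.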
Hounsfield Unit Threshold. This work shows that the inclusion of CT data and the choice of HU range in BAT segmentation have a major impact on BAT volume. The PERCIST 1.0 cutoffs were tested, using the same images, with and without a HU threshold range applied. Without applying CT thresholds, the calculated BAT volumes were approximately 3-fold higher than BAT volumes calculated with applied CT thresholds. The only difference between SUV2.0HU−250:−10 and SUV2.0HU−250:−50 was the inclusion of voxels within the −50 to −10 range in the former, and yet BAT volumes segmented using SUV2.0HU−250:−10 were about twice as high. The same pattern was seen with SUV1.5HU−180:−10 and SUV1.5HU−200:−50, possibly implying that voxels included in the higher end of the HU range are more important than those included from the lower end. It has been shown that BAT radiodensity can change following cold exposure, especially in the higher end of the HU range, which may help explain these results26.
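The effect of the HU range on segmented volume can be illustrated with a toy example. The sketch below assumes co-registered PET and CT voxels given as flat lists, a uniform voxel volume, and a hypothetical helper name; the voxel values are invented for illustration and the code does not reproduce the segmentation software used in the study.

```python
def bat_volume_ml(suv, hu, voxel_volume_ml, suv_cutoff, hu_range=None):
    """Sum the volume of voxels that meet the SUV cutoff and, if an HU range
    (low, high) is given, also fall inside that range (bounds inclusive)."""
    if len(suv) != len(hu):
        raise ValueError("PET and CT voxel lists must have the same length")
    count = 0
    for suv_value, hu_value in zip(suv, hu):
        if suv_value < suv_cutoff:
            continue  # below the PET uptake cutoff
        if hu_range is not None:
            low, high = hu_range
            if not (low <= hu_value <= high):
                continue  # outside the CT density window
        count += 1
    return count * voxel_volume_ml


# Invented toy voxels (assumptions for illustration only).
suv = [2.5, 2.1, 1.0, 3.0, 2.2]
hu = [-30, -100, -80, 20, -45]

pet_only = bat_volume_ml(suv, hu, 0.5, suv_cutoff=2.0)
wide_hu = bat_volume_ml(suv, hu, 0.5, suv_cutoff=2.0, hu_range=(-250, -10))
narrow_hu = bat_volume_ml(suv, hu, 0.5, suv_cutoff=2.0, hu_range=(-250, -50))
print(pet_only, wide_hu, narrow_hu)
```

In this toy data, dropping the CT criterion admits a high-uptake voxel with a positive HU value, and lowering the upper HU bound from −10 to −50 removes the voxels lying between those two values, which mirrors the direction of the differences described above.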
Limitations. This study assessed BARCIST 1.0 thresholds along with several other published threshold combinations used for BAT quantification and PERCIST 1.0 cutoffs. Besides the selection of HU and SUV threshold combinations, quantification of human BAT volume and activity also depends on numerous other methodological issues, such as the cooling protocol, FDG PET/CT methodology, and segmentation software, as well as intrinsic participant factors such as age, sex, or body composition and extrinsic factors such as outdoor temperature or daylight27,28. Thus, while some threshold combinations in this study, including BARCIST 1.0, showed strong repeatability, this alone may not indicate a useful threshold combination since the true repeatability is not known. Several thresholding schemes in this study produced similar BAT volumes, but that may not indicate that the selected ROIs are reproducible. Future studies may benefit from using a voxel-by-voxel analysis to assess the degree of overlap among ROIs created using various thresholding methods. Future studies should also conduct sensitivity analyses with different thresholds in order to understand whether results are driven by the selected HU and/or SUV thresholds.
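One possible form of the voxel-by-voxel overlap analysis suggested above is the Dice similarity coefficient between two binary ROI masks. The choice of Dice is an assumption made here for illustration; the text does not name a specific overlap metric, and the function name and toy masks below are hypothetical.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice similarity between two binary masks given as equal-length
    sequences of 0/1 (or False/True) voxel labels."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must cover the same voxels")
    size_a = sum(1 for a in mask_a if a)
    size_b = sum(1 for b in mask_b if b)
    if size_a + size_b == 0:
        return 1.0  # convention chosen here: two empty masks agree perfectly
    intersection = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    return 2.0 * intersection / (size_a + size_b)


# Two toy ROIs with identical volume (2 voxels each) but only 1 shared voxel.
roi_method_1 = [1, 1, 0, 0]
roi_method_2 = [0, 1, 1, 0]
print(dice_coefficient(roi_method_1, roi_method_2))
```

The toy masks have equal volumes yet only partial overlap, which is the situation a volume-only comparison cannot detect.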