Background: Trismus is caused by impaired function of the masticatory muscles. Routine delineation of these muscles during radiotherapy planning may improve dose tracking and facilitate dose reduction, resulting in decreased radiation-related trismus. This study aimed to compare a deep learning model with a commercial atlas-based model for fast auto-segmentation of the masticatory muscles on head and neck computed tomography (CT) images.
Material and methods: Paired masseter (M), temporalis (T), and medial and lateral pterygoid (MP, LP) muscles were manually segmented on 56 CT images. The CT images were randomly divided into training (n=27) and validation (n=29) cohorts. Two methods were used for automatic delineation of the masticatory muscles (MMs): deep learning auto-segmentation (DLAS) and atlas-based auto-segmentation (ABAS). Quantitative assessment of automatically versus manually segmented contours was performed using the Dice similarity coefficient (DSC), recall, precision, Hausdorff distance (HD), 95th-percentile Hausdorff distance (HD95), and mean surface distance (MSD). The interobserver variability in manual segmentation of the MMs was also evaluated. Differences in dose (∆Dose) to the MMs for DLAS and ABAS segmentations were assessed. A paired t-test was used to compare the geometric and dosimetric differences between the DLAS and ABAS methods.
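As an illustration of the overlap metrics named above, the following is a minimal sketch of how DSC, precision, and recall could be computed from two binary masks, assuming the manual contour is treated as the reference. The function name `overlap_metrics` and the flattened 0/1 voxel representation are assumptions made for this example; the study's actual implementation is not described here, and the distance metrics (HD, HD95, MSD) are not covered.

```python
def overlap_metrics(auto_mask, manual_mask):
    """Compute DSC, precision, and recall for two binary masks.

    Both arguments are equal-length sequences of 0/1 voxel labels
    (a hypothetical flattened representation of a 3D mask).
    The manual mask is assumed to be the reference.
    """
    if len(auto_mask) != len(manual_mask):
        raise ValueError("masks must have the same number of voxels")

    # Count voxels in each agreement category.
    tp = sum(1 for a, m in zip(auto_mask, manual_mask) if a and m)
    fp = sum(1 for a, m in zip(auto_mask, manual_mask) if a and not m)
    fn = sum(1 for a, m in zip(auto_mask, manual_mask) if not a and m)

    # DSC = 2|A ∩ M| / (|A| + |M|); two empty masks are treated as a perfect match.
    denom = 2 * tp + fp + fn
    dsc = 2 * tp / denom if denom else 1.0
    # Precision: fraction of auto-segmented voxels that lie inside the reference.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: fraction of reference voxels recovered by the auto-segmentation.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"dsc": dsc, "precision": precision, "recall": recall}


# Toy example: the auto contour is the manual contour shifted by one voxel.
manual = [0, 1, 1, 1, 1, 0, 0, 0]
auto = [0, 0, 1, 1, 1, 1, 0, 0]
metrics = overlap_metrics(auto, manual)
print(metrics)
```

In this toy case three voxels overlap, one is a false positive, and one is a false negative, so all three metrics equal 0.75.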
Results: DLAS outperformed ABAS in delineating all MMs (p < 0.05). The DLAS mean DSC for M, T, MP, and LP ranged from 0.83±0.03 to 0.89±0.02, whereas the ABAS mean DSC ranged from 0.79±0.05 to 0.85±0.04. The mean values for recall, precision, HD, HD95, and MSD also improved with DLAS and were close to the mean interobserver variation. With few exceptions, ∆D99%, ∆D95%, ∆D50%, and ∆D1% for all structures were below 10% for both DLAS and ABAS, with no statistically significant difference between the two methods (p > 0.05). Dose endpoints for DLAS-based contours matched those of the manually segmented contours more closely than dose endpoints for ABAS-based contours did.
Conclusions: DLAS of the masticatory muscles for head and neck radiotherapy improved segmentation accuracy compared with ABAS, with no qualitative difference in dosimetric endpoints compared with manually segmented contours.