A novel TPS toolkit to assess correlation between transit fluence dosimetry and DVH metrics for adaptive head and neck radiotherapy

Inter-fractional anatomical variations in head and neck (H&N) cancer patients can lead to clinically significant dosimetric changes. Adaptive re-planning should thus commence to negate any potential over-dosage to organs-at-risk (OAR), as well as potential under-dosage to target lesions. The aim of this study is to explore the correlation between transit fluence, as measured at an electronic portal imaging device (EPID), and dose volume histogram (DVH) metrics to target and OAR structures in a simulated environment. Planning data of eight patients that have previously undergone adaptive radiotherapy for H&N cancer using volumetric modulated arc therapy (VMAT) at the Royal Adelaide Hospital were selected for this study. Through delivering the original treatment plan to both the planning and rescan CTs of these eight patients, predicted electronic portal images (EPIs) and DVH metrics corresponding to each data set were extracted using a novel RayStation script. A weighted projection mask was developed for target and OAR structures through considering the intra-angle overlap between fluence and structure contours projected onto the EPIs. The correlation between change in transit fluence and planning target volume (PTV) D98 and spinal cord D0.03cc with and without the weighting mask applied was investigated. PTV D98 was strongly correlated with mean fluence percentage difference both with and without the weighting mask applied (RMask = 0.69, RNo Mask = 0.79, N = 14, p < 0.05), where spinal cord D0.03cc exhibited a weak correlation (RMask = 0.35, RNo Mask = 0.53, N = 7, p > 0.05) however this result was not statistically significant. The simulation toolkit developed in this work provided a useful means to investigate the relationship between change in transit fluence and change in key dosimetric parameters for H&N cancer patients.


Introduction
Anatomical variations in head and neck (H&N) cancer patients can lead to clinically significant plan deterioration and adaptive radiotherapy (ART) may be required to restore an optimal dose distribution [1]. Past literature has explored the benefits of implementing ART workflows within a clinic for H&N cancer and has mostly been focused on improving dosimetry in parotid glands, which is crucial when lowering risk of xerostomia [1,2]. Monitoring the dose delivered to the spinal cord, via evaluation of kV cone-beam computed tomography (CBCT), has also shown to be beneficial in answering the golden question: when should we replan? [3] Traditional methods to implement ART into the clinic are associated with a significant increase in the clinical workload to the radiotherapy department as a whole. As a result, there has been a recent push to develop an accessible and automated quantitative trigger for ART, with one common method being transit dosimetry with an electronic portal imaging device, EPID [4]. Using relative EPID dosimetry, one can explore the gradual change in transit fluence over the course of treatment without necessarily needing to rely on the absolute precision and accuracy of the images themselves. Change in transit fluence is expected to be correlated with anatomical variations and patient setup errors, which 1 3 has thus far been the main focus of research on this topic. Several groups have tried to correlate dosimetric impacts with change in transit fluence, where 2D relative gamma analysis is commonly utilized [5][6][7][8]. Through conducting gamma analysis on the transit EPID images acquired over the course of treatment, a variety of parameters can be extracted to help quantify change in dose to the patient.
Through utilising the mean gamma value, mean , from a 2D 3%/3 mm relative gamma analysis test on whole electronic portal images (EPIs), Piron et al. [5] concluded that change in transit fluence could be used as a predictor for plan deterioration for H&N cancer patients as a result of anatomical variations. Utilising the whole EPI, however, could result in misleading mean gamma values. If only a small area of pixels included large deviations, the EPI as a whole may still be similar to the baseline EPI and thus result in a score that is below the trigger threshold, ultimately decreasing sensitivity. By considering projections of regions of interest onto the EPIs, the sensitivity of the analysis may be improved.
By projecting planning target volume's (PTV's) onto EPIs obtained every fraction and correlating mean with dosimetric changes (V95%), Piron et al. [6] found that projecting PTVs onto the EPIs and then conducting gamma analysis improved sensitivity to anatomical changes. However, projecting organs-at-risk (OARs) onto the EPIs were not considered.
The same group then went on to establish an action threshold for H&N ART, and proposed a threshold of mean > 0.42 , as evaluated using the whole EPI [7]. By considering a dosimetric threshold of V100% < 90% the group was able to analyse the sensitivity and specificity of the threshold proposed. Moreover, the group also explored the correlation between mean dosimetric differences of PTV and OARs with mean gamma values of the whole EPI for patients that did reach the action threshold, as well as patients that did not. A strong correlation between change in spinal cord dose and mean was not observed, likely due to the inherent nature of the gamma analysis test conducted on whole EPIs-being more sensitive in high dose regions, such as PTVs, than lower dose regions, such as OARs.
Lim et al. [9] explored the correlation between change in transit fluence, in a generalised rectangular region surrounding the neck, and volumetric change of a ROI ( ΔV ROI ) spanning from the Condyloid process (jaw) to C6 of the spinal cord. Volumetric change, which is a good predictor for grade 2 xerostomia [10], was found to be strongly correlated with change in transit fluence (R = −0.776, p < 0.001) . A 5% threshold in ΔV ROI could be used as a trigger for ART, where the area under the receiver-operating characteristic curve (ROC) was determined to be 0.88. This study did not investigate the potential improvement in sensitivity by projecting the ROI onto the EPID.
When utilising Linac-measured EPIs for relative gamma analysis, one of the largest sources of systematic error in these types of studies include the accuracy of the first fraction EPI. The results of these studies all rely on the assumption that the patient anatomy at the time of the first fraction EPI is representative of the patient anatomy at the time of the planning CT (pCT). A poor baseline could be misleading and yield results with mean values significantly lower than actually representative of the change since pCT. The research presented in this study differs from previous studies by predicting transit EPIs using an in-house script developed in the RayStation treatment planning system (TPS) by Ray-Search Laboratories, rather than analysing Linac-measured EPIs. The advantage of this approach is in the removal of any patient set-up errors, as well as any anatomical variation in the patient between obtaining the pCT and first fraction baseline EPI. Rather than using weekly CBCTs to obtain multiple EPIs over the course of treatment, this study will also only consider the pCT and rescan CT (rCT) to avoid any uncertainties associated with deformable image registration of the pCT's to CBCT's, or dose calculation uncertainty on CBCT. Considering these factors, the developed tool allows for the investigation of correlation between change in transit fluence and change in patient DVH metrics in a more controlled environment. Moreover, current triggers for ART at the Royal Adelaide Hospital are based on visual assessment of tightness of mask fit and contour change on set-up images by treating radiation therapists. This method is both highly subjective and has poor specificity. The developed tool is useful in assessing sensitivity and specificity of transit dosimetry as a quantitative trigger for ART.

Patient selection
Human research ethics and research governance approval was obtained for the study. The radiotherapy datasets (treatment plans, planning CTs and RT structure sets) of eight patients previously having undergone ART for H&N cancer at the RAH were collected and anonymized. Each patient consented to their data being used for research purposes and had at least 1 rescan CT acquired over the course of treatment.

EPID model
The EPID model developed in this work was generated to represent EPI's measured with a Varian TrueBeam aS1200 MV imager in Portal Dosimetry mode. Square field Portal Dosimetry images were collected to provide beam profiles and output factors with and without solid water in place and the EPID at 150 cm source to detector distance (SDD). A simple EPID model was constructed in RayStation (8B) to model the Varian TrueBeam aS1200 MV imager operating in Portal Dosimetry mode. The model consisted of a 40 cm × 40 cm × 5 cm water slab atop a 40 cm × 40 cm × 4 cm lead slab. Dose planes were extracted at 3.6 cm depth in the water slab. The selected water slab thickness and extraction depth were guided by Varian Portal Dosimetry calibration settings. The thickness of the lead slab was selected based on a comparison of measured and computed output factors. Calculated EPIs simulated in RayStation at 150 cm SDD were first downscaled to 100 cm SDD, where flood field and beam profile corrections were applied, and subsequently scaled back to 150 cm SDD. The flood field correction was obtained by dividing all calculated EPID dose planes by an open field calculation that covered the EPID dose plane area. This simulates the flood field correction of the actual EPID. The beam profile correction was obtained by multiplying the flood field corrected array with a 40 cm × 40 cm 2 radial beam profile measured at 3.6 cm in water. This again simulates the beam profile correction applied by the Varian Portal Dosimetry software. This was done to mimic the major corrections applied in forming an EPI in portal dosimetry mode on the Linac.

EPI extraction
The RayStation EPID template model was imported as regions of interest into the original treatment plan of a given patient within RayStation. By utilising a number of functions within the RayStation python scripting environment, an automated sequence of steps was programmed to rotate the EPID model around isocenter for each control point of a VMAT plan as outlined in Fig. 1. The EPID model was first positioned to have the 3.6 cm extraction depth positioned 50.0 cm below the origin. This was followed by rotation to a given control point, and subsequent translation to be centred at the isocenter. Once positioned appropriately, the dose for that VMAT control point was calculated. The dose delivered to the extraction plane within the EPID model was then stored into a 200 × 200 pixel array, with a pixel size of 2.0 mm, and subsequently integrated over every control point of the VMAT plan. To help optimise the time required to extract the EPIs from n beams, the dose from all beams at a given gantry angle was calculated via a collapsed cone algorithm, rather than rotating the model around the patient n times. Thus, the generated image is a sum of all beams in a fraction, rather than per-beam images. Once the original treatment plan had been simulated using the pCT, the process was repeated using the rCT to obtain a second EPI.

VMAT script validation
To validate our model, a 20.0 × 20.0 × 10.0 cm thick solid water slab was positioned at 90.0 cm SSD on a Varian True-Beam Linac, with the EPID positioned at 150.0 cm SDD. A VMAT treatment plan was then delivered, and a baseline EPI was extracted. To simulate change in patient anatomy, 2.0 cm of the solid water slab was removed anteriorly and the same VMAT treatment plan was delivered to extract a second fraction EPI. The change between the second EPI and the baseline was then calculated as a 2D percentage difference map, relative to dose maximum of the baseline. This process was then repeated using the simulation toolkit developed in RayStation to obtain a predicted 2D percentage difference map.

ROI projection mask
Past research has shown that projecting ROI's onto the EPIs can lead to improved sensitivity in using change in transit fluence as a quantitative trigger for ART. The contours of the spinal cord and PTV's for each patient included in the study, at time of planning, were thus extracted through utilising DICOM RT structure files. These contours could then be projected onto the extraction plane of the EPI with the use of modules within the dicompyler-core Python package [11]. A ROI projection mask was then created, taking the thickness of the structure in the beam's-eye-view into consideration to weight the mask by the path length through the structure as shown in Fig. 2a.
In this work, the spinal cord was chosen as a significant ROI alongside the PTV's as plan adaption for H&N patients at the Royal Adelaide Hospital is often triggered by the monitoring of spinal cord doses during on-course plan dosimetry assessment. However, the methods outlined in this work can still be applied to a variety of different structures, such as the parotid glands when considering toxicities such as xerostomia [9,10]. If a plan contained multiple PTV volumes with different dose levels, each was treated as an independent ROI in the correlation analysis. The evaluation PTV contours (PTV_EVAL), as calculated via taking the Boolean subtraction of sequentially higher dose PTV volumes from the current PTV volume, were utilised as these are the volumes that undergo DVH assessment during treatment planning.
To further evolve the ROI projection mask, one can consider the overlap between the open field of the MLC configuration and the projection of the structure of interest. Thus, the DICOM plan file was also used to extract these coordinates on the EPI to create a 'fluence projection' mask. EPI pixels that lay within the open field projection were given a value of 1, and those outside the open field were given a value of 0. The mask could then further be weighted by the monitor units (MU's) delivered to the open field region for a given control point through multiplying the mask through by the cumulative meterset value and beam MU's found in the DICOM RT plan file as shown in Fig. 2b. The resulting mask could then be integrated along each control point to obtain a final intra-angle projection mask as shown in Fig. 2e, which highlights the regions within the EPI for which dose was delivered to a particular structure of interest, weighted by the structure's volume and MU's delivered.

DVH parameters
Change in transit fluence alone cannot be used as a reliable trigger for ART without first understanding how this quantity relates to dosimetric differences within the patient. Current workflows at the Royal Adelaide Hospital involve an on-course dosimetry monitoring program making use of RayStation's CBCT dose calculation functionality. Evaluations are performed to ensure target coverage does not diminish by a certain amount and that serial organs at risk, such as the spinal cord, do not exceed tolerance values. Replanning thresholds are assessed on a case-by-case basis Fig. 2 Process used to derive an intra-angle ROI and fluence projection mask. For each control point, a the structure projection mask and b MU-weighted fluence projection from each beam is used to obtain d the overlap between the two, as visualised by c. This process is repeated for each control point to achieve e the final intra-angle ROI and fluence mask by the consulting Radiation Oncologists with a major consideration being the number of remaining fractions. In this study DVH metrics considered for the spinal cord and PTV_ EVAL structures relate to the near maximum and minimum doses received by the volumes, being the D0.03cc and D98 metric respectively. In this work, the change in these metrics over the two CT datasets (original planning CT and rescan CT) was considered and any correlation between them and change in transit fluence was explored.

Correlation
As previously mentioned, a correlation between change in transit fluence and change in D0.03cc or D98, for the spinal cord and PTV's respectively, would show that change in transit fluence can be used as a trigger for ART. There have been multiple approaches in quantifying this change in transit fluence to be used as a trigger for ART, where gamma parameters are commonly utilised such as max , mean , and 1% . One disadvantage of this approach is that gamma values are always positive and thus give no information regarding the direction of such change. A second fraction EPI that has received a greater amount of dose than the baseline may show similar corresponding gamma parameters to a second fraction EPI that has received less dose than the baseline. It was thus decided that the mean percentage difference between the two EPIs should be calculated, relative to the baseline, as this will also provide directional information. It should be noted, however, that signed percentage differences in a region of interest of an integrated image may cancel over multiple control points and lead to lower-than-expected mean values. Care should thus be taken when interpreting these values via first assessing the 2D percentage difference map.
A correlation between the change in D0.03cc and D98, for the spinal cord and PTV_EVAL's, and mean fluence percentage difference with and without the respective masks applied was then explored. Noting that with no mask applied, the entire 40 × 40 cm EPIs were utilised, whilst disregarding doses < 10% of local maximum so that summary statistics are not skewed by clinically irrelevant small doses. Statistical analysis was conducted through considering Pearson's correlation coefficient ( R ), Spearman's rank correlation coefficient ( ), and a linear regression with a 95% confidence interval. The statistical significance of the correlation coefficients was also calculated for the given sample size, where the difference between the correlation with and without the projection masks applied was also considered to explore any improvements in sensitivity that the mask may supply.

Non-transit verification
The EPID model used in this work was developed through consideration of beam profiles, output factors, and gamma analysis between predicted and measured EPIs.  Fig. 4, showing percentage differences all less than 0.8%.

EPID model validation
The EPID model developed was first validated through simulating change in patient anatomy via anteriorly removing 2.0 cm of a solid water slab between two deliveries of a VMAT plan. Changes in the Linac-measured EPIs were compared to changes in the TPS-calculated EPIs through comparing percentage differences,Δ % , in beam profiles In the x-profiles shown, a maximum percentage difference of 7.2% and 7.7% was observed for the predicted and true changes in EPIs, respectively.
Similarly, for the y-profiles, a maximum percentage difference of 8.7% and 9.0% were observed for the predicted and true change in EPIs. The mean percentage error between the true and predicted percentage differences across the whole image was 1.49%, ranging between a maximum of

No weighting mask applied
The correlation between change in transit fluence and change in DVH metrics for the eight patient datasets was explored. D98 PTVs and D0.03cc for the spinal cord was first explored with no weighting masks applied-utilising the entire 40 × 40 cm EPIs whilst disregarding doses < 10% of local maximum. Figure 7a shows the relationship between mean transit fluence percentage difference and change in D98 for the PTVs, with a linear regression band fitted to a 95% confidence interval. Pearson's and Spearman's correlation coefficients were determined to be 0.79 and 0.82 respectively, indicating a strong correlation exists. Similarly, Fig. 7b shows the relationship between change in D0.03cc for the spinal cord and mean transit fluence percentage difference, also with a linear regression band fitted to a 95% confidence interval. Pearson's correlation coefficient of 0.53 indicates a moderate correlation exists between the two parameters, as well as Spearman's correlation coefficient of 0.32.

Weighting masks applied
ROI projection mask To observe the individual benefits of various contributions to the overall weighting mask applied, the effects of projecting the ROIs onto the EPIs were first considered. Figure 7c d show the correlation between change in D98 and D0.03cc with mean weighted fluence percent-age difference, respectively. Pearson's and Spearman's correlation coefficients suggest that strong and moderate correlations exist for the PTVs and spinal cords, respectively, yielding values of 0.70 and 0.67 for the PTVs and 0.41 and 0.32 for the spinal cords. The slope of the linear regression fit for both the PTVs and spinal cords increase when utilising the ROI projection mask, in comparison to no weighting mask applied. This result suggests that the weighting mask applied increases the overall sensitivity of the fit, as expected.

Intra-angle ROI and fluence projection mask
The final level of complexity added to the weighting projection mask includes an intra-angle convolution between a fluence projection mask, through considering the MLC leaf configuration, and ROI projection mask, through considering the position of some ROI on the EPI. By considering the overlap between the fluence and ROI projections, one is able to consider the regions on the EPI for which some structure received dose at any point during the treatment delivery. Figure 7e and f show the correlation between change in D98 and D0.03cc with mean weighted fluence percentage difference for the PTVs and spinal cords, respectively. Pearson's and Spearman's correlation coefficients of 0.69 and 0.72 for the PTVs, and 0.35 and 0.21 for the spinal cords, suggest that a strong correlation exists for D98 and mean weighted percentage difference, and a weak correlation exists with D0.03cc and mean weighted percentage difference. In comparison to the correlation data with the ROI projection masks applied, Pearson's and Spearman's correlation coefficients increase for the cases of the PTV projections and decrease in the case of utilising the spinal cord projections. All correlation coefficients decreased relative to the correlation data with no mask applied, however the sensitivity of the correlation with the intra-angle

Discussion
Change in transit fluence, as measured via mean percentage difference, was found to be strongly correlated with change in PTV D98 and moderately correlated with change in spinal cord D0.03cc with no weighting mask applied. Pearson's and Spearman's correlation coefficients were found to decrease when applying a ROI projection mask to the predicted EPIs, however still yielding strong and moderate correlations, respectively. No statistically significant difference between the correlation coefficients was observed between the two sets ( p PTV = 0.32 and p SpCord = 0.41 ), however the slope of the regression was found to increase by 93% and 80% for the PTVs and spinal cords, respectively. Taking the weighting mask through another layer of complexity via an intra-angle convolution with the MLC fluence projections also resulted in no statistically significant difference in the correlation coefficients calculated, relative to no weighting mask applied. The slope of the linear regression fit, however, increased by 113% and 196% for the PTVs and spinal cords, respectively; greatly improving the sensitivity of change in transit fluence to a change in key DVH metrics. Future work should consider the specificity of the model, through evaluating some threshold value to trigger the need for plan adaptation, via a receiver operating characteristic (ROC) curve.
This study is consistent with past literature on improving sensitivity via projecting PTVs onto the EPI. For comparison, Piron et al. [6] explored V95% with mean gamma values across the entire EPI as well as by projecting PTV's. Via a meta-analysis of this study, it was found that the correlation between V95% and mean gamma increased from ~ 0.55 to ~ 0.70 when projecting the PTV, as well as the slope of the trend line increasing by ~ 270%. Overall, projecting the PTVs onto the EPI was found to be beneficial to both the correlation coefficient and sensitivity, which agrees with the results presented.
In this work, patient data was selected on the basis that plan adaptation occurred due to significant morphological changes within the patient, such as tissue shrinkage, resulting in some regions receiving greater than anticipated dose and others lesser. One of the eight patients used for this research, however, was replanned due to significant changes in neck tilt; resulting in poor patient alignment in the clinic. In this scenario, there was poor agreement in the position of PTV's and Spinal Cord between the pCT and rCT and as a result led to unreliable ΔD98 and ΔD0.03cc values being calculated for the PTV's and spinal cord, respectively. For this reason, the patient data was omitted from the correlation analysis as the change in datasets reflected an intentional change in patient positioning which would not be encountered when delivering the same plan at different fractions. The poor agreement of this dataset with the proposed correlation does however indicate that the model may also be useful in identifying patient setup errors.
When exploring the correlation between change in transit fluence and ΔD0.03cc for the spinal cords, it was observed that a much poorer correlation was found. The sensitivity, however, of the spinal cord DVH metric was drastically improved with the implementation of the weighting mask derived, as demonstrated by the 196% slope increase in Fig. 7f relative to Fig. 7b. The y-intercept of the linear regressions, however, pose another issue likely due to the small sample size used. y-Intercepts of up to 3.38% were observed for the spinal cord D0.03cc correlations, suggesting that patients showing a mean (weighted) fluence percentage difference of zero still have some change in dosimetry. Moreover, the 95% confidence interval of the linear regression fit for all three correlations explored do not include the origin of the graph-suggesting some systematic error is present within the data obtained. In the future, more datapoints, particularly for the spinal cord results, should be obtained to yield more reliable results. A forced y-intercept could also be explored, however addressing the systematic error would be the preferential approach to minimising this issue. Furthermore, analysing change in EPIs obtained in portal dosimetry mode has some inherent limitations in that the intra-angle differences may cancel out in the final integrated image. Conducting analysis in the continuous cine mode may thus improve correlation in the future, however it should be noted further work would be required to validate this acquisition mode. In relation to current limitations of the EPID model, it is hypothesised that discrepancies between change in measured and calculated EPIs lies in the beam hardening that occurs when a phantom is placed in the beam path in conjunction with the strong energy dependence of amorphous silicon detectors; this effect is not included in our current EPID model within the TPS.
It is important to note that the DVH metric considered for the spinal cord, D0.03cc, considers the near-maximum dose received by the spinal cord and is thus derived from very few voxels. The mean percentage difference, on the other hand, considers the entire volume. D0.03cc is thus much more sensitive to slight changes in the position of the structure, and as a result is unlikely to be well correlated with mean percentage difference. The pCT and rCT's utilised are registered as close as possible, however there are still some slight discrepancies in the anatomy between the two-giving rise to somewhat unreliable D0.03cc values. When considering DVH metrics that utilise the entire volume of the structure, such as mean dose and D98, it is reasonable to expect a good correlation. Future work should thus explore the correlation between mean weighted percentage difference and mean dose received by the parotid glands to explore side effects associated with high parotid gland dose from H&N radiotherapy such as xerostomia.
For the purposes of this study, pCT's were utilised to avoid any uncertainties associated with deformable image registration of the pCT to CBCT, and dose calculation uncertainty on CBCT. This, however, introduces a sample size limitation, as one patient plan will only yield one data point per adaptation to be used in the correlation study. In the future, the use of CBCT's would improve the overall sample size obtained, where the gradual change in patient anatomy could also be explored. This has the added benefit in that data points early into the treatment plan would likely include fractions which do not necessarily require adaptation, thus enabling us to explore various action threshold values for deciding when to replan. The sensitivity and specificity of each threshold value could be explored through consideration of a ROC curve and the area under the curve (AUC). The simulation toolkit developed would ensure that all changes in transit fluence are a direct result of change in patient anatomy, rather than incorrect positioning, and thus still be of use in assessing a critical threshold value. The toolkit can also be applied to multiple regions of interest within a treatment plan, and can be modified to the clinic's specific tolerance values of, for example, dose to the spinal cord or parotid glands for H&N cancer plans. The intraangle convolution mask between the ROI and fluence projections showed to drastically improve the sensitivity of any regression fits and is thus deemed to be useful in assessing organs at risk.

Conclusion
The simulation toolkit developed provides a useful method to explore the correlation between change in transit fluence and change in dosimetry for particular regions of interest. The toolkit was capable of predicting change in transit fluence accurately, where a weighting mask allows the user to consider particular regions on the EPID to improve sensitivity. Change in D98 was strongly correlated with change in mean weighted percentage fluence when considering the PTV's, however change in D0.03cc was only moderately correlated for the spinal cord OARs investigated in this study.
Funding No funding was received for conducting this study.

Data availability Not applicable.
Code availability Not applicable.

Conflict of interest
The authors have no relevant financial or non-financial interests to disclose.
Ethical approval Approval was obtained from the Central Adelaide Local Health Network Human Research Ethics Committee (CALHN HREC). The procedures used in this study adhere to the tenets of the Declaration of Helsinki.
Informed consent Each participant has given consent to participate, via the statement "I do approve the use of my radiation therapy records for retrospective medical auditing and/or research".