Standardization of knowledge-based volumetric modulated arc therapy planning with a multi-institution model (broad model) to improve prostate cancer treatment quality

Purpose: To evaluate whether knowledge-based volumetric modulated arc therapy (VMAT) plans for prostate cancer created with a multi-institution model (broad model) are clinically useful and effective as a standardization method. Methods: A knowledge-based planning (KBP) model was trained with 561 prostate VMAT plans from five institutions with different contouring and planning policies. Five clinical plans at each institution were reoptimized with the broad model and the single institution model, and the dosimetric parameters and the relationship between Dmean and the overlapping volume (rectum or bladder and target) were compared. Results: The differences between the broad and single institution models in V50, V80, V90, and Dmean were as follows: rectum, 9.5% ± 10.3%, 3.3% ± 1.5%, 1.7% ± 1.6%, and 3.6% ± 3.6% (p < 0.001); bladder, 8.7% ± 12.8%, 1.5% ± 2.6%, 0.7% ± 2.4%, and 2.7% ± 4.6% (p < 0.02), respectively. The differences between the broad model and clinical plans were as follows: rectum, 2.4% ± 4.6%, 1.7% ± 1.7%, 0.7% ± 2.4%, and 1.5% ± 2.0% (p = 0.004, 0.015, 0.112, and 0.009); bladder, 2.9% ± 5.8%, 1.6% ± 1.9%, 0.9% ± 1.7%, and 1.1% ± 4.8% (p < 0.018), respectively. Positive values indicate that the broad model has a lower value. Strong correlations were observed (p < 0.001) between Dmean and the rectal and bladder volumes overlapping with the target in the broad model (R = 0.815 and 0.891, respectively). The broad model had the largest R² of the three plans. Conclusions: KBP with the broad model is clinically effective and applicable as a standardization method at multiple institutions.


Introduction
Intensity modulated radiotherapy (IMRT) and volumetric modulated arc therapy (VMAT) can be used to create a steep dose gradient. Approaches that complement dose distribution have been proposed, such as replanning by sharing the dose volume histogram (DVH) information [6] and knowledge-based planning (KBP) [7][8][9][10][11]. The RapidPlan KBP system (Varian Medical Systems, Palo Alto, CA, USA) is an assistance system for optimization calculation that uses artificial intelligence technology. It learns clinical plans from more than 20 cases and automatically sets line objectives for new cases so that the organ doses decrease toward the lower limit predicted by the system. Therefore, RapidPlan has been used for standardization among institutions because it automatically produces line objectives for optimization without dependence on the planners' skill or experience. The system is less time consuming and gives a more stable plan quality owing to the automatic setting of the objectives for targets and organs at risk (OARs). The performance of OAR sparing for the KBP system depends on the library plans in the model; therefore, sharing single institution models is difficult, and standardization with the model is limited to related institutions [12,13]. In Australia, a multi-institution model trained with 110 clinical plans with different prescribed doses and contouring definitions collected from 10 institutions was developed and shared among the institutions to standardize the VMAT plan quality [7]. In that study, the library plans selected to train the multi-institution model were those that passed their constraints. Because different institutions usually have different planning policies, it is not realistic to use a multi-institution model that is trained with a single policy.
On the basis of the hypothesis that a KBP model can adapt to the clinical constraints of any institution by training with the clinical plans of various institutions that have different planning policies, a research group for KBP in Japan developed one broad model for prostate cancer that was trained with 561 clinical VMAT plans collected from five institutions with different structure definitions and planning policies [14]. There were no constraints other than the period for choosing the library plans in the broad model. The group compared the broad model and a single institution model with two cases that were used for clinical treatment at one institution. Because the broad model and the single institution model gave comparable results, the group indicated the possibility that the broad model could be used clinically at any institution. However, they did not evaluate whether the broad model is clinically acceptable at each institution.
The aim of this study was to evaluate whether the broad model is clinically useful and effective as a standardization method by comparing three plans calculated with the broad model, the single institution model, and manual optimization at five institutions. This study indicates that KBP with the broad model is better than manual optimization and the single institution models, and that it is an effective method for standardization among institutions with different planning policies.

KBP model preparations
In this study, five institutions (A-E) in Japan that treated patients with Stage I-III prostate cancer with VMAT and that have different contouring definitions and planning policies were identified. No cases after prostatectomy or cases with lymph nodes were enrolled in this study. Each institution developed its own KBP model (single institution model) using its clinical plans from April 2019 to April 2020. The numbers of library plans in the single institution models were 34, 50, 50, 50, and 60 at Institutions A, B, C, D, and E, respectively. The KBP plans with the single institution model were confirmed as clinically acceptable at each institution. Each institution has been using RapidPlan since 2017 and has improved its model. Each single institution model used in this study was created by learning the improved plans at each institution [15]. With each single institution model, it was clinically recognized that a plan equivalent to a clinical plan can be created at that institution.
The broad model was developed by training with 561 clinical plans, including 149 from Institution A, 150 from Institution B, 153 from Institution C, 49 from Institution D, and 60 from Institution E, that were used clinically from April 2017 to April 2019 [14]. The broad model was verified with two cases to work normally with each institution's calculation parameters. The goodness-of-fit for the regression models and the outlier indexes in the broad model were previously investigated [14].

Comparison of dose distribution using KBP models
To evaluate the clinical usefulness of the broad model, KBP with the single institution models and the broad model was performed at each institution on five clinical plans from April 2020. The median age and body mass index of the patients evaluated at the institutions ranged from 70 to 78 years and from 21.7 to 23.4 kg/m², respectively. The contouring definition, machines, beams, and calculation parameters used in the clinical plans for each institution are shown in Table 1. Three plans (broad model, single institution model, and clinical plan) were created for each institution's case and compared, so the definition of contouring did not affect the results. In the optimization with RapidPlan, all institutions, except for Institution B, added upper objectives in addition to the line objectives to avoid high doses to the organs. In Institutions A, B, C, D, and E, the mean volume of the planning target volume (PTV) was 89.9, 81.8, 117.0, 71.0, and 109.4 cm³; that of the rectum was 53.0, 37.3, 57.8, 36.5, and 34.7 cm³; and that of the bladder was 235.9, 100.9, 113.4, 124.0, and 73.2 cm³, respectively.
The institutions used different prescription methods, as follows. For Institution A, the mean dose (Dmean) for the PTV was 78 Gy / 39 fractions. For Institution B, the minimum dose to 50% of the reference volume (D50) for the PTV was 70 Gy / 28 fractions. For Institution C, the minimum dose to 95% of the reference volume (D95) for the PTV minus the rectum volume was 78 Gy / 39 fractions. For Institution D, Dmean for the PTV was 78 Gy / 39 fractions. For Institution E, D95 for the clinical target volume (CTV) was 76 Gy / 38 fractions.
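Because the prescriptions differ, a dose level expressed as a percentage of the prescribed dose corresponds to a different absolute dose at each institution. The following is a minimal sketch of that bookkeeping, not part of the study's workflow; the class name `Prescription` and its fields are hypothetical and introduced only for illustration, while the doses and fraction numbers are those listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prescription:
    """Hypothetical record of one institution's prescription method."""
    institution: str
    metric: str          # dose metric the prescription refers to
    structure: str       # structure on which the metric is evaluated
    total_dose_gy: float
    fractions: int

    @property
    def dose_per_fraction_gy(self) -> float:
        return self.total_dose_gy / self.fractions

    def absolute_dose_gy(self, percent: float) -> float:
        """Convert a relative dose level (% of prescribed dose) to Gy."""
        return self.total_dose_gy * percent / 100.0


# Values taken from the prescription methods described in the text.
PRESCRIPTIONS = [
    Prescription("A", "Dmean", "PTV", 78.0, 39),
    Prescription("B", "D50", "PTV", 70.0, 28),
    Prescription("C", "D95", "PTV minus rectum", 78.0, 39),
    Prescription("D", "Dmean", "PTV", 78.0, 39),
    Prescription("E", "D95", "CTV", 76.0, 38),
]

for p in PRESCRIPTIONS:
    print(
        f"Institution {p.institution}: {p.metric} ({p.structure}) = "
        f"{p.total_dose_gy} Gy, {p.dose_per_fraction_gy} Gy/fraction, "
        f"50% level = {p.absolute_dose_gy(50)} Gy"
    )
```

Under these prescriptions, the 50% dose level corresponds to 39 Gy at Institutions A, C, and D, 35 Gy at Institution B, and 38 Gy at Institution E, which is why relative rather than absolute dose levels are convenient for a cross-institution comparison.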
At each institution, a CT image slice thickness of 2.0 mm and a field of view of 50 cm were applied. The optimization and calculation algorithms were Photon Optimizer 13.6 or 15.6 and the Acuros XB or Anisotropic Analytical Algorithm, respectively. In the KBP calculation with the broad model and single institution models, the same contouring definitions and machine, beam, and calculation parameters as in the clinical plans were applied. To meet the clinical dose constraints of the prescribed volumes, lower and upper objectives for the target structures were adjusted, and the optimization was repeated several times at each institution.
From each DVH for the KBP plans and clinical plans, the minimum dose to 98% of the reference volume (D98), D95, the minimum dose to 2% of the reference volume (D2), and Dmean in % were extracted. The conformity index (CI) was calculated with the following formula [16]:
$$\mathrm{CI} = \frac{(\mathrm{TV}_{\mathrm{PIV}})^{2}}{\mathrm{TV} \times \mathrm{PIV}}$$
where PIV is the prescription isodose volume, TV is the target volume, and TV_PIV is the target volume covered by the prescription isodose volume. For the rectum and bladder, V50, V80, V90, and Dmean were extracted from each DVH. V50, V80, and V90 are the volume ratios receiving 50%, 80%, and 90% of the prescribed dose. The regression lines and Pearson's correlation coefficients between Dmean for the OARs and the ratio of the OAR volume overlapping with the PTV to the whole organ volume (Voverlap/Vwhole) were calculated for the clinical plans and KBP plans at all institutions.
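The quantities defined above can be illustrated with a short sketch. It assumes that the conformity index takes the form TV_PIV² / (TV × PIV), which is consistent with the three volumes defined in the text, and that an organ is represented by a list of per-voxel doses (in % of the prescribed dose) with equal voxel volumes. The function names and the toy numbers are hypothetical and are not taken from the study.

```python
def conformity_index(tv: float, piv: float, tv_piv: float) -> float:
    """CI = TV_PIV^2 / (TV * PIV); equals 1.0 for perfect conformity (assumed form)."""
    if tv <= 0 or piv <= 0:
        raise ValueError("TV and PIV must be positive")
    return tv_piv ** 2 / (tv * piv)


def volume_at_dose(voxel_doses_pct, level_pct: float) -> float:
    """V_x: % of the organ volume receiving at least level_pct % of the dose.

    Assumes every voxel has the same volume.
    """
    if not voxel_doses_pct:
        raise ValueError("no voxels")
    hit = sum(d >= level_pct for d in voxel_doses_pct)
    return 100.0 * hit / len(voxel_doses_pct)


def mean_dose(voxel_doses_pct) -> float:
    """Dmean in % of the prescribed dose (equal voxel volumes assumed)."""
    return sum(voxel_doses_pct) / len(voxel_doses_pct)


def overlap_ratio(v_overlap_cm3: float, v_whole_cm3: float) -> float:
    """Voverlap/Vwhole: fraction of the organ volume that overlaps the PTV."""
    if v_whole_cm3 <= 0:
        raise ValueError("organ volume must be positive")
    return v_overlap_cm3 / v_whole_cm3


# Toy example with made-up numbers (not study data).
rectum_doses = [10, 30, 55, 85, 95]
print("V50 =", volume_at_dose(rectum_doses, 50))
print("V80 =", volume_at_dose(rectum_doses, 80))
print("V90 =", volume_at_dose(rectum_doses, 90))
print("Dmean =", mean_dose(rectum_doses))
print("CI =", round(conformity_index(tv=100.0, piv=110.0, tv_piv=95.0), 3))
print("Voverlap/Vwhole =", round(overlap_ratio(5.3, 53.0), 3))
```

A clinical DVH would come from the treatment planning system rather than from a voxel list; the sketch only makes the definitions concrete.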

Statistical analysis
For statistical analyses, a paired Wilcoxon test was performed to identify differences between the plans created with the broad model and the clinical plans or the single institution model plans. Pearson's correlation was considered weak for r < 0.4, moderate for 0.4 ≤ r ≤ 0.7, and strong for r > 0.7. All statistical analyses were conducted with SPSS version 24 (SPSS Inc, Chicago, IL, USA). A value of p < 0.05 was considered statistically significant.
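As an illustration of the correlation analysis, the sketch below computes Pearson's r, classifies it with the thresholds stated above, and fits a least-squares regression line with its coefficient of determination (R²). It is a plain-Python sketch rather than the SPSS procedure used in the study; the function names and sample values are hypothetical, and applying the thresholds to the absolute value of r is an assumption.

```python
import math


def _sums(x, y):
    """Return means and centered sums of squares and cross-products."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return mx, my, sxx, syy, sxy


def pearson_r(x, y) -> float:
    _, _, sxx, syy, sxy = _sums(x, y)
    return sxy / math.sqrt(sxx * syy)


def correlation_strength(r: float) -> str:
    """Weak for |r| < 0.4, moderate for 0.4 <= |r| <= 0.7, strong for |r| > 0.7."""
    a = abs(r)  # assumption: thresholds apply to the magnitude of r
    if a < 0.4:
        return "weak"
    if a <= 0.7:
        return "moderate"
    return "strong"


def linear_fit(x, y):
    """Least-squares line y = slope * x + intercept, with its R^2."""
    mx, my, sxx, syy, sxy = _sums(x, y)
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((b - (slope * a + intercept)) ** 2 for a, b in zip(x, y))
    return slope, intercept, 1.0 - ss_res / syy


# Made-up example: overlap ratio versus organ Dmean (%), not study data.
overlap = [0.05, 0.10, 0.15, 0.20, 0.25]
dmean = [20.0, 28.0, 31.0, 40.0, 46.0]
r = pearson_r(overlap, dmean)
slope, intercept, r2 = linear_fit(overlap, dmean)
print(f"r = {r:.3f} ({correlation_strength(r)}), R^2 = {r2:.3f}")
print(f"Dmean = {slope:.1f} * overlap + {intercept:.1f}")
```

For a regression with a single predictor, R² equals the square of Pearson's r, so both quantities describe how closely the points follow the fitted line.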

Results
The dose distributions for the broad model, single institution model, and clinical plans for the case at Institution A are shown in Figure 1. The broad model plan was able to create a dose distribution comparable to those of the other plans. In each plan, the 95% and 90% isodose curves were comparable to the PTV, and the 50% isodose curves passed dorsal to the center of the rectum. For the 75% and 50% isodose curves, the broad model plan had a steeper dose gradient than the other plans. The DVH comparisons for the PTV (B), rectum (C), and bladder (D) are also shown in Figure 1. For the PTV, the lines of the three plans were almost the same. Compared with the other two lines, the line for the broad model ran lower in the rectum and slightly higher in the bladder. The CI for each plan at each institution is shown in Table 2. The CIs for the broad model at Institutions B and D were smaller than those for the clinical plans, whereas the means at Institutions A, C, and E were not significantly different. It was confirmed that this inferiority was clinically acceptable at each institution.
In the PTV, the differences between the broad model and the single institution model were as follows: D98, 0.7% ± 1.5%; D95, 0.2% ± 0.6%; D2, 0.0% ± 0.8%; Dmean, −0.3% ± 0.7%. Positive values indicate that the values for the broad model were less than those for the other plans. The differences between the broad model and clinical plans were as follows: D98, 0.0% ± 0.6%; D95, 0.1% ± 0.2%; D2, −0.7% ± 1.0%; Dmean, 0.0% ± 0.4%. For the PTV dose, the differences between the broad model and the other plans were small, although significant differences in D98 and D2 between the broad model and the single institution model were observed.
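The sign convention and the paired test behind these comparisons can be sketched as follows. The difference is taken as (other plan − broad model), so a positive value means the broad model value is lower. The signed-rank test below uses an exact null distribution, whereas a statistics package may use a normal approximation, so p-values may differ slightly from those of the study. The function names and sample values are hypothetical, and the use of the sample standard deviation is an assumption.

```python
from statistics import mean, stdev


def paired_differences(other, broad):
    """other - broad for each case; positive means the broad model is lower."""
    if len(other) != len(broad):
        raise ValueError("paired samples must have the same length")
    return [o - b for o, b in zip(other, broad)]


def wilcoxon_signed_rank(diffs):
    """Exact two-sided Wilcoxon signed-rank test.

    Zero differences are dropped and tied magnitudes share the average rank.
    Returns (W_plus, p_value).
    """
    d = [x for x in diffs if x != 0]
    n = len(d)
    if n == 0:
        raise ValueError("all differences are zero")
    order = sorted(range(n), key=lambda i: abs(d[i]))
    ranks2 = [0] * n  # doubled ranks, so tied (half-integer) ranks stay integers
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(d[order[j + 1]]) == abs(d[order[i]]):
            j += 1
        for k in range(i, j + 1):
            ranks2[order[k]] = i + j + 2  # 2 * average of ranks i+1 .. j+1
        i = j + 1
    w_plus2 = sum(r for r, x in zip(ranks2, d) if x > 0)
    total2 = sum(ranks2)
    # counts[s]: number of sign patterns whose doubled positive-rank sum is s
    counts = [0] * (total2 + 1)
    counts[0] = 1
    for r in ranks2:
        for s in range(total2, r - 1, -1):
            counts[s] += counts[s - r]
    center = total2 / 2  # null expectation of the doubled statistic
    dev = abs(w_plus2 - center)
    extreme = sum(c for s, c in enumerate(counts) if abs(s - center) >= dev)
    return w_plus2 / 2, extreme / 2 ** n


# Made-up rectal Dmean values (%) for six paired plans, not study data.
single_model = [31.2, 28.8, 35.5, 29.7, 33.9, 30.6]
broad_model = [30.0, 28.0, 33.0, 30.0, 32.0, 30.0]
diffs = paired_differences(single_model, broad_model)
w_plus, p_value = wilcoxon_signed_rank(diffs)
print(f"difference = {mean(diffs):.1f}% ± {stdev(diffs):.1f}% (sample SD)")
print(f"W+ = {w_plus}, two-sided exact p = {p_value}")
```

With only six pairs, the smallest attainable two-sided exact p-value is 2/64 ≈ 0.031, which illustrates why per-institution comparisons based on five plans have limited power compared with the pooled analysis.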
For the rectum dose, the differences between the broad model and the single institution model were as follows: V50, 9.5% ± 10.3%; V80, 3.3% ± 1.5%; V90, 1.7% ± 1.6%; Dmean, 3.6% ± 3.6%. Positive values indicate that the values for the broad model were less than those for the other plans. The differences between the broad model and clinical plans were as follows: V50, 8.7% ± 12.8%; V80, 1.5% ± 2.6%; V90, 0.7% ± 2.4%; Dmean, 2.7% ± 4.6%. Each dosimetric parameter, except V90 for the broad model, was significantly smaller than that for the single institution model and clinical plans.
For the bladder dose, the differences between the broad model and the single institution model were as follows: V50, 2.4% ± 4.6%; V80, 1.7% ± 1.7%; V90, 0.7% ± 2.4%; Dmean, 1.5% ± 2.0%. Positive values indicate that the values for the broad model were less than those for the other plans. V50, V80, and V90 for the broad model were significantly less than those for the clinical plans. The differences between the broad model and clinical plans were as follows: V50, 2.9% ± 5.8%; V80, 1.6% ± 1.9%; V90, 0.9% ± 1.7%; Dmean, 1.1% ± 4.8%. Each dosimetric parameter for the broad model was significantly less than that for the single institution model and clinical plans.
Figure 2 shows the box plots of the dosimetric parameters for the rectum, V80 (A) and Dmean (B), in each plan. At Institutions A, B, and C, the V80 values for the broad model were significantly smaller than those for the single institution model and clinical plans. At Institutions B and E, the Dmean values for the broad model were significantly smaller than those for the single institution model and clinical plans. Figure 3 shows the box plots of the dosimetric parameters for the bladder, V80 (A) and Dmean (B), in each plan. At Institutions C and E, the V80 values for the broad model were significantly smaller than those for the single institution model and clinical plans. At Institutions D and E, the Dmean values for the broad model were significantly smaller than those for the single institution model and clinical plans.
Figure 4 shows the relationship between Voverlap/Vwhole and Dmean for the rectum (A) and bladder (B), together with the coefficient of determination (R²), for each institutional case and plan. The correlation values between Dmean in % and the overlap volume for the rectum were as follows: broad model, 0.815; single institution model, 0.678; clinical plans, 0.780. The values for the bladder were as follows: broad model, 0.891; single institution model, 0.894; clinical plans, 0.706. All Dmean values in % for the rectum and bladder correlated significantly with the overlap volume (p < 0.001). For the broad model, the correlation between Dmean and Voverlap/Vwhole was stronger than 0.8 for both organs. The R² values of the regression lines for the rectum were as follows: broad model, 0.66; single institution model, 0.46; clinical plans, 0.61. The R² values for the bladder were as follows: broad model, 0.80; single institution model, 0.78; clinical plans, 0.51. The variations of Dmean from the regression line of the broad model were the smallest of the three plans.
Table 1 Contouring definitions, such as the gross tumor volume (GTV), clinical target volume (CTV), and planning target volume (PTV), and the rectum, machine, beam, and calculation parameters at each institution
Figure 1 Dose distributions (A) for the case at Institution A. Each curve indicates a structure or an isodose area. Red, cyan, brown, yellow, green, blue, light green, and pink refer to the PTV, bladder, rectum, 100% isodose curve, 95% isodose curve, 90% isodose curve, 75% isodose curve, and 50% isodose curve, respectively. Dose volume histogram (DVH) comparisons for each plan for the PTV (B), rectum (C), and bladder (D)

Discussion
In this study, we investigated whether the broad model trained on more than 500 plans can be used clinically at multiple institutions that have different planning policies and contouring designs. The average data of all institutions indicated that the broad model plan was able to reduce the organ dose the most. The clinical plan was able to reduce the organ dose more than the broad model plan at some institutions; however, the difference was small, and the broad model plan was effective at all institutions. The broad model can be used clinically at any institution that has different planning policies and contouring design to reduce the variations of organ doses among institutions.
Rectal and bladder doses for the broad model plan were significantly less than those for the single institution model, as shown in Figures 2 and 3. Fukunaga et al. previously reported that the broad model had better quality than the single institution model [14]. Better plans can be created by learning all plans rather than by learning only excellent plans at each institution. Although it is necessary to learn high quality plans to create high quality models, the broad model was trained with more than 500 plans regardless of the plans' quality. RapidPlan predicts dose levels that can be reduced as much as possible based on the learned plans. Line objectives are automatically set to achieve the lower prediction level [17]. Therefore, there are two ways to create a better model: extract superior plans from the clinical plans and learn them, or learn all plans, including the superior plans. In the former method, dose analysis is necessary to extract excellent plans from all clinical plans, and this step is difficult [18].
In the optimization with RapidPlan, four institutions used only upper objectives in addition to line objectives to avoid high doses to organs, rather than a complicated method of setting numerous objectives. Previous studies reported that upper objectives effectively reduce the maximum dose for the organs [13,19]. Setting many objectives complicates the optimization process and prevents standardization among planners and institutions. Optimization using the broad model is simple and effective for standardization in terms of computational methods.
To develop the broad model, clinical plans from five institutions that had different contouring definitions were chosen. The ranges of the mean volumes for the PTV, rectum, and bladder in this study were 71.0-117.0 cm³, 37.3-57.8 cm³, and 73.2-235.9 cm³, respectively. When RapidPlan is used for cases that are not covered by the plans trained in the model, such as cases in which the rectal volume or the PTV is too large for the model, the accuracy of the DVH prediction may decrease. Because the broad model can learn a variety of cases, RapidPlan can accurately predict the DVH for new cases. It is difficult to enroll a wide range of cases at affiliated institutions with the same single institution model and contouring definition. As a result, these models are difficult to share among institutions with different contouring definitions. The broad model that we developed can be shared among institutions, including those with differences in contouring design. However, the broad model did not yield doses lower than those of the clinical plans at every institution.
At Institution D, the rectal dose was significantly less with the clinical plans than with the broad model, and the bladder dose was significantly greater with the clinical plans than with the broad model. The opposite trend was observed at Institution B, with significantly lower rectal doses and higher bladder doses in the broad model. To reduce the organ dose in VMAT, it is necessary to keep the contribution from the beam passing through the organ low; however, the dose contribution from the other directions then increases. In VMAT for prostate cancer, the rectum and bladder are on opposite sides of the PTV, and if the dose to one is reduced, the dose to the other tends to increase. Therefore, the broad model may reduce the organ dose in a well-balanced manner. In the case of Institution B, the dose to the bladder was forcibly lowered, so the dose to the rectum was likely to increase in the broad model. Retaining the bladder in the broad model leads to a reduction in the rectal dose, resulting in the effective functioning of the broad model.
In this study, we calculated the regression lines between the overlapping volume and the average dose of the organ, and the differences from them, to evaluate the plan variability. Moore et al. also used this relationship to quantify the variability of plans, and the difference from the regression line was an indicator of the variability of the plan quality [20]. The broad model had the highest R² among the three plans. Therefore, the plans output with the broad model can reduce the variation between institutions. Panettieri et al. used a multi-institution model at multiple centers to assess whether clinical plans could be improved; however, that study could not assess whether inter-institution disparities could be reduced [7]. Figure 4 shows that the broad model is effective for standardization because the variability in organ doses at each institution can be suppressed by using the broad model.
The beam arrangements of the facilities participating in this study varied, as shown in Table 1. Because the broad model showed a sufficient effect at all institutions in this study, it is unlikely that the differences shown in this study will have a large effect on the broad model. However, it remains possible that the calculation cannot be performed well with some settings; this has not been sufficiently examined in previous research or in this study.
In Japan, small-scale institutions with fewer than 200 new patients annually accounted for nearly half of the total number of institutions according to a structural survey reported by Numasaki et al. [21]. This ratio is considerably greater than those in Australia, Korea, and the United States [22][23][24]. In Japan, where there are many small facilities, unification of plan design at each facility is not necessarily progressing, and there may be a broad difference in the quality of treatment plans. The broad model responded more consistently to overlapping volumes than the other planning methods. Therefore, in some countries and regions, a broad model obtained from multiple institutions may be useful for planning standardization.

Conclusion
KBP with the broad model may provide a lower organ dose while providing an equivalent dose to the target compared with KBP plans using a single institution model and clinical manual plans, and may provide lower dosimetric variations across multiple institutions. KBP with the broad model is clinically effective and could be used as a standardization method at multiple institutions with different contouring definitions and planning policies.