The Importance of the Change Point Recognition in the Landmark Analysis of Immuno-oncology Clinical Trial with the Delayed Treatment Effect

Background: The proportional hazards (PH) assumption is often not met in immuno-oncology (IO) clinical trials because of the delayed treatment effect. Landmark analysis can be performed as a sensitivity analysis to evaluate the efficacy of the treatment despite the violation of the PH assumption, but the results can vary with the choice of landmark. The goal of this paper was to raise awareness in both the statistical and clinical communities of the importance of the change point in IO clinical trials with a delayed effect, in order to improve the existing landmark analysis method. Methods: We recommend pre-defining the change point as an objective choice of landmark time for IO trials with a delayed effect, to avoid additional biases caused by the choice of landmark. We call this method change point landmark analysis (CPLA). Monte Carlo simulation was used to explore the power and type I error of CPLA. A simulated clinical trial was also conducted to evaluate the advantages of change point recognition for landmark analysis. Results: CPLA showed higher power than both earlier and later landmark choices, especially when the delayed effect occurred earlier than the median survival time. The type I error of CPLA was also well controlled. Conclusions: The change point is a good objective choice of landmark in landmark analysis, and it could resolve the challenge of objective pre-definition, which is the critical requirement of landmark analysis.


Background
In recent years, owing to innovative medical advances, new treatments such as immunotherapies have yielded long-term survivors. The development of new immuno-oncology (IO) drugs and the investigation of new indications for IO anti-cancer drugs already on the market have become hotspots of drug research.
A time-to-event endpoint is the primary outcome in many clinical trials, especially in oncology drug development. The most commonly used statistical methods for time-to-event analyses have been the log-rank test [1] and Cox regression [2]. In both cases, the performance of the analysis depends on the proportional hazards (PH) assumption. Generally speaking, the PH assumption is often not met in IO clinical trials. IO agents can affect both the human immune system and the tumor microenvironment. The effect of an IO agent is not typically directed at the tumor itself; instead, it boosts or releases the brake on the patient's immune system, and this positive effect may not be observed immediately. A delayed separation of the Kaplan-Meier curves [3] with a change point (CP) has therefore often been observed in trials of IO therapies, potentially resulting in a violation of the PH assumption [4][5][6]. A published simulation study showed that, in the presence of a delayed clinical effect, a conventional study design with an exponential assumption could lead to an underestimation of statistical power when designing randomized clinical studies with immunotherapies [4]. The delayed effect type of survival curve observed in IO clinical trials consequently poses unique challenges to the trial design and the analysis method. Can we recognize the change point when survival curves cross in IO clinical trials and specify the optimal development strategy? How can we make use of this kind of delayed effect with change point information in the statistical analysis?
In the survival analysis setting, landmark analysis refers to the practice of designating a time point during the follow-up period (known as the landmark time) and analyzing only those subjects who have survived until the landmark time [7]. That is to say, once the landmark has been chosen, ineligible subjects are excluded, the remaining subjects are classified according to their status at the landmark time, and the usual survival analysis methods are applied. The conclusions are only generalizable to subjects who have survived until the landmark time. Analysis results can vary with the choice of landmark, because the hazard ratio estimates are conditional and have a different target at each landmark. The choice of landmark is therefore a critically important consideration. To avoid additional biases caused by the choice of landmark, the landmark should be selected a priori, before the start of data analysis, based on some clinically significant natural time [8].
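The landmark restriction described above can be illustrated with a minimal sketch. The function name `landmark_subset` and the toy data are hypothetical, and resetting the clock at the landmark is one common convention rather than something stated in the text.

```python
import numpy as np

def landmark_subset(time, event, group, landmark):
    """Keep only subjects still under follow-up after the landmark time.

    Subjects whose event or censoring time is at or before the landmark
    are excluded; the remaining follow-up is measured from the landmark.
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=int)
    group = np.asarray(group, dtype=int)
    at_risk = time > landmark
    return time[at_risk] - landmark, event[at_risk], group[at_risk]

# Hypothetical toy data: follow-up in months, 1 = event, 0 = censored.
time = [1.0, 2.0, 3.0, 4.5, 6.0, 0.5]
event = [1, 0, 1, 1, 0, 1]
group = [0, 1, 0, 1, 1, 0]

t_lm, e_lm, g_lm = landmark_subset(time, event, group, landmark=2.5)
print(t_lm, e_lm, g_lm)
```

Any standard two-group survival method can then be applied to the returned arrays, with the caveat from the text that the result is conditional on survival to the landmark.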
The goal of this paper was to raise awareness in both the statistical and clinical communities of the importance of the change point in IO clinical trials with a delayed effect, in order to improve the existing landmark analysis method.

Methods
The Change Point of the Survival Curves
A change point is the location where the distribution abruptly changes in a data sequence [9]. Owing to the mechanism of IO anti-cancer drugs, a delayed separation of the Kaplan-Meier survival curves is usually observed: sometimes the curves converge before the change point (Figure 1A), and sometimes the curves are very close before the change point (Figure 1B). When a delayed separation of the survival curves is present, it violates the fundamental study design assumption of proportional hazards, and it also results in a potential loss of statistical power to demonstrate the difference between the two treatment groups because of the long period without a treatment effect before the change point.

Landmark analysis
With the exponential distribution assumption, the survival function is defined as: where λ_0, group and t_g are the baseline hazard, the treatment group and the time point indicator, and t_CP is the time of the change point. The hazard ratios before and after the change point under the exponential distribution in Figure 1A should be estimated separately as: For the landmark analysis, only the subjects who survived into the period t_g > t_CP are included in the analysis. We recommend pre-defining the change point as an objective choice of landmark time for IO trials with a delayed effect, to avoid additional biases caused by the choice of landmark. We call this method change point landmark analysis (CPLA) in our research, and the change point is referred to as the change point landmark in CPLA.
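As an illustration only, one piecewise exponential formulation that is consistent with the quantities named above can be written as follows. This is a sketch under stated assumptions and not necessarily the exact model used in this work: the log hazard ratios \beta_1 (before the change point) and \beta_2 (after the change point) are symbols introduced here, group is coded 0 or 1, and t denotes time.

```latex
h(t \mid \text{group}) =
\begin{cases}
\lambda_0 \exp(\beta_1\,\text{group}), & t \le t_{CP} \\
\lambda_0 \exp(\beta_2\,\text{group}), & t > t_{CP}
\end{cases}

S(t \mid \text{group}) =
\begin{cases}
\exp\{-\lambda_0 e^{\beta_1 \text{group}}\, t\}, & t \le t_{CP} \\
\exp\{-\lambda_0 e^{\beta_1 \text{group}}\, t_{CP}
      - \lambda_0 e^{\beta_2 \text{group}}\, (t - t_{CP})\}, & t > t_{CP}
\end{cases}

HR_{\text{before}} = e^{\beta_1}, \qquad HR_{\text{after}} = e^{\beta_2}
```

Under this sketch, a landmark analysis at t_{CP} targets HR_{\text{after}} alone, because every subject still at risk after t_{CP} contributes follow-up only from the second hazard regime.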

Simulation studies
Two simulation studies were conducted in our research.

Power and type I error evaluations
The first simulation study was performed to evaluate the power (scenarios A and B) and the type I error (scenario C) of the change point landmark analysis. We conducted Monte Carlo simulations for landmark analyses and for traditional full data Cox regression, using delayed effect type survival data with a change point at CP = 2.5 months. Three landmarks, 1, 2.5 and 3.5 months, were considered: one before the true change point, one at the true change point, and one after it. Simulated data were generated with equal sample sizes (N1 = N2 = 20, 30, 40). The time-to-event data were simulated based on a randomly censored model [10,11], with the individual lifetime X generated from the survival functions in Table 1. To test the power of the landmark analyses, scenarios A simulated survival curves as in Figure 1A, and scenarios B simulated survival curves as in Figure 1B. Different median survival times and different hazard ratios in the periods before and after the change point were also considered in the power tests. Additionally, scenario C used two overlapping survival curves to test the type I error. The censoring time S in the two samples was generated from the uniform distributions U(0, a) and U(0, b), where the values of a and b were varied to give censoring rates of approximately 10%, 20%, or 30% in the two samples. Because the lifetime X followed a different distribution in each group, the values of a and b had to be unequal to keep the average censoring rate in each group approximately equal to the given censoring rate. Each individual was assigned an observed survival time T = min(X, S) and an event indicator Δ = I[X ≤ S]. The empirical power and size of the test statistics were estimated as the proportion of samples for which the null hypothesis was rejected at the α = 0.05 significance level, based on 1000 simulations.
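The simulation procedure above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the code used for the reported results: the hazards (`LAM0`, `lam_after_trt`) and the censoring bounds (30 and 40 months) are hypothetical values rather than the entries of Table 1, and a hand-written two-sample log-rank test is swapped in for the Cox regression used in the study.

```python
import numpy as np

CP = 2.5      # change point in months (from the text)
LAM0 = 0.25   # hypothetical monthly hazard shared by both arms before CP

def simulate_arm(n, lam_before, lam_after, cens_max, rng):
    """Piecewise exponential lifetimes X, censored by S ~ U(0, cens_max)."""
    e = rng.exponential(1.0, size=n)   # cumulative hazard at failure
    h_cp = lam_before * CP             # cumulative hazard reached at CP
    x = np.where(e <= h_cp, e / lam_before, CP + (e - h_cp) / lam_after)
    s = rng.uniform(0.0, cens_max, size=n)
    return np.minimum(x, s), (x <= s).astype(int)   # T = min(X, S), delta

def logrank_chi2(time, event, group):
    """Two-sample log-rank chi-square statistic with 1 degree of freedom."""
    o_minus_e, var = 0.0, 0.0
    for t in np.unique(time[event == 1]):
        at_risk = time >= t
        n = at_risk.sum()
        n1 = (at_risk & (group == 1)).sum()
        died = (time == t) & (event == 1)
        d = died.sum()
        d1 = (died & (group == 1)).sum()
        o_minus_e += d1 - d * n1 / n
        if n > 1:
            var += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return o_minus_e ** 2 / var if var > 0 else 0.0

def rejection_rate(landmark, lam_after_trt, n_per_arm=40, n_sim=1000, seed=0):
    """Proportion of simulated trials rejecting H0 at the 0.05 level."""
    rng = np.random.default_rng(seed)
    crit = 3.841   # approximate 95th percentile of chi-square, 1 df
    group = np.repeat([0, 1], n_per_arm)
    rejected = 0
    for _ in range(n_sim):
        t0, d0 = simulate_arm(n_per_arm, LAM0, LAM0, 30.0, rng)
        t1, d1 = simulate_arm(n_per_arm, LAM0, lam_after_trt, 40.0, rng)
        time = np.concatenate([t0, t1])
        event = np.concatenate([d0, d1])
        keep = time > landmark   # landmark analysis: still at risk after landmark
        rejected += logrank_chi2(time[keep], event[keep], group[keep]) > crit
    return rejected / n_sim

for lm in (1.0, 2.5, 3.5):
    print(lm, rejection_rate(lm, lam_after_trt=0.10, n_sim=200))
```

Setting `lam_after_trt` equal to `LAM0` gives two overlapping survival curves, so the same function returns an empirical type I error, in the spirit of scenario C.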

Simulated clinical trial
We performed simulations of trial designs to evaluate the advantages of change point recognition for landmark analysis. Our initial simulations use a scenario similar to those of Chen [4] and Korn [12]: 680 patients are accrued and randomly assigned 1:1 over 34 months, and the final analysis occurs 48 months after the first patient is randomly assigned. If there is no delay in the treatment effect and no loss to follow-up, this design has 90% power to detect a hazard ratio (HR) of 0.75 (using a two-sided 0.05-level log-rank test) with 512 events in the final analysis. Because the sample size and study duration are typically fixed during study design according to the sponsor's budget, we fixed the total duration at 48 months. Because it is also hard to know the actual length of the treatment effect delay in advance, we assessed the impact of the delayed clinical effect under 3 scenarios, with delays of 1/12, 1/8 and 1/6 of the total study duration. The change points were therefore at the 4th, 6th and 8th months (Figure 2), with a hazard ratio of 1.0 before the change point and 0.75 thereafter. As the whole study duration was pre-defined as 48 months, the number of observed events decreased because of the delayed effect. No interim analysis was considered in the simulated trials. Each scenario was evaluated based on 1000 simulations, and the power and the mean of the conditional HRs with 95% confidence intervals (CI) were calculated for the change point landmark, the other landmarks and full data Cox regression.
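The stated design power can be checked with a quick calculation. The sketch below assumes the Schoenfeld approximation for the log-rank test under proportional hazards with 1:1 allocation; the text does not say which formula was used for the design, so this is only one plausible way to arrive at the figure, and the function name is hypothetical.

```python
from math import log, sqrt
from statistics import NormalDist

def schoenfeld_power(n_events, hazard_ratio, alpha=0.05, alloc=0.5):
    """Approximate power of a two-sided log-rank test (Schoenfeld formula).

    alloc is the fraction of patients randomized to one arm (0.5 for 1:1).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    drift = sqrt(n_events * alloc * (1 - alloc)) * abs(log(hazard_ratio))
    return NormalDist().cdf(drift - z_alpha)

# Design values from the text: 512 events, HR = 0.75, two-sided alpha = 0.05.
print(round(schoenfeld_power(512, 0.75), 3))
```

With these inputs the approximation gives a power of roughly 90%, in line with the design. It assumes the hazard ratio applies from randomization, which is exactly what a delayed effect breaks.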

Results
In the first simulation study, we compared the power of landmark analyses with predefined landmarks at 2.5 months, 1 month and 3.5 months. The CPLAs (2.5 months) in scenarios A have the highest power regardless of sample size and censoring rate (Table 2), even when the change point is close to the median survival time of 2.8 months in scenario A1, or when the HR after the change point is 0.8 in scenario A3. The distribution of P values at the change point landmark (2.5 months) is also distinctly lower than at the other landmarks between 0 and 6 months (Figure 3). In scenario B1, where the change point is close to the median survival time, the CPLAs do not consistently show the highest power. In scenarios B2 and B3, however, where the change points are at some distance from the median survival time, the CPLAs keep the highest power compared with the other landmarks, even when the HR is a little larger in scenario B3 (Table 3).
Meanwhile, the power of Cox regression is clearly lower, because the PH assumption is violated by the delayed effect (Table 2 and Table 3). On the other hand, the type I errors of the change point landmark analyses are well controlled, as are those of the other landmarks (Table 4).
In the second study, the simulated clinical trial, the sample size and the number of events in the final analyses are lower than in the designed setting because of the delayed effect under the fixed study duration. Nevertheless, the power at the true change point landmarks is still around 90%, although the estimates of the conditional HR are slightly higher than the pre-specified value of 0.75. The earlier landmarks (e.g. the 4-month landmark in the 6-month delay setting, and the 4-month and 6-month landmarks in the 8-month delay setting) have clearly higher power than the later landmarks, but still lower power than the true change point landmarks (Table 5).

Discussion
The possibility of a delayed treatment effect has become more relevant to IO cancer trials, and several trials have demonstrated it [13,14]. In this article, we recommend applying landmark analysis in IO cancer trials with delayed effects and crossed survival curves. The choice of landmark time is critical, however, because this method omits the time-to-event distribution before the landmark time: different landmark times estimate different conditional hazard ratios, and the analysis may suffer from a loss of power with misclassification [8]. An arbitrary choice of landmark may therefore be questioned [22]. We recommend using the change point of the survival curves as the objective choice of landmark for IO cancer trials.
In the simulation studies, CPLA showed the highest power compared with the earlier and the later landmark choices, especially when the delayed effect occurred earlier than the median survival time. Interestingly, it has been reported that the landmark method is most powerful when the "risk altering intervening event" occurs comparatively early and the outcomes of interest are not particularly common at this early study point [8]. Recognizing the change point of the delayed effect in IO cancer trials is a useful way to identify the earliest time point of the "risk altering intervening event" from a statistical point of view. Furthermore, the landmark should be selected in advance to safeguard against the danger of a data-driven decision, and the change point can be objectively pre-defined as the landmark in the statistical analysis plan. It is therefore worthwhile to identify the change point objectively from the crossed, delayed effect survival data.
Statistical methods such as the weighted log-rank (WLR) test and the weighted Kaplan-Meier test, among others [23,24], also exist to account for the delayed separation of Kaplan-Meier curves. However, landmark analysis is easy to execute and easy for clinicians to understand, so the method retains its advantages and practical value.
It is worth noting that the CPLA may not show the treatment advantage if the change point is close to the median survival time, but this kind of longer delayed effect is not common in IO clinical trials. If there is evidence supporting a very long delayed effect, it is better to consider that as the part of the
Landmark analyses P values for scenario A2 (sample size N1=N2=40, censor rate 10%)