Participants and Data Acquisition
De-identified symptom rating data were collected from a naturalistic sample of 97 patients (mean age 53.76, SD = 13.20; 14% female) treated with clinical TMS at the VA Providence Healthcare System (Providence, RI, USA). Stimulation protocols followed clinical practice, with most patients receiving 10 Hz TMS delivered to the left dorsolateral prefrontal cortex. TMS was administered every weekday for up to 30 sessions, followed by six taper sessions. Symptom ratings were assessed using the Patient Health Questionnaire 9 (PHQ-9)  at baseline and after every five TMS treatment sessions. The PHQ-9 is a standard clinical rating scale for depression that provides adequate information about symptom severity and minimizes rater burden, with established psychometrics that facilitate its comparison with other depression rating scales. Scores range from 0 to 27, with higher scores indicative of more severe depression.
Group-level symptom rating data were also collected from published clinical trials investigating various TMS protocols. These studies included 1) a randomized noninferiority study comparing Theta Burst Stimulation (TBS) (n = 193) to standard 10 Hz stimulation (n = 192) , 2) a randomized controlled trial comparing standard and high-dose protocols of left-sided 10 Hz stimulation (standard n = 59, high-dose n = 91) and right-sided 1 Hz stimulation (standard n = 57, high-dose n = 93) , 3) a randomized controlled trial comparing supra- (120% motor threshold (MT)) and sub-threshold (80% MT) accelerated bilateral TBS (suprathreshold n = 87, subthreshold n = 93) to standard left-sided 10 Hz stimulation (n = 72) , 4) and an unblinded trial of accelerated, high-dose TBS using functional MRI guided targeting (n = 20) . These studies contribute heterogenous data incorporating a variety of TMS paradigms, supporting the generalizability and relevance of our analysis. Additional group-level symptom rating data were collected from a secondary analysis of the THREE-D Study  in which unique group-level treatment response trajectories were identified in subjects receiving either left-sided 10 Hz TMS or left-sided TBS (n = 388) . Primary outcome measures in these studies were the Hamilton Depression Rating Scale (HAM-D)  (17 item and 6 item), and the Quick Inventory of Depressive Symptoms Clinician-Rated (QIDS-C) . Longitudinal group-level mean symptom rating data were obtained from the included tables [9,27] or extracted from figures using http://www.graphreader.com/ [7,10,26].
The TMS treatment response was modeled using the exponential decay function in Eq. 1:
in which symptom ratings (D) at time t are described using the magnitude of total response (A), decaying at a time constant (B), and approaching a minimum value after treatment completion (C) as shown in Figure 1. In this model, parameters A and C are in the units of the symptom rating scale (i.e., score) and parameter B is time in weeks.
NLME models were constructed for our naturalistic sample receiving clinical TMS using Eq. 1. Model parameters were treated as random effects at the patient level and as fixed effects at the group level. A model treating all parameters (A, B, and C) as random effects was compared to the simpler model fitting only parameters A and C as random effects. The simpler model provided an equivalent fit as assessed by the likelihood ratio test (LRT) and lower Akaike information criterion (AIC) and Bayesian information criterion (BIC) values, and this simpler model was chosen. This NLME model was compared to a corresponding linear mixed-effect (LME) model in which slope and intercept were treated as random effects at the patient level using the AIC, BIC, and LRT. The predictive utility of this model was assessed using leave-one-out cross-validation (LOOCV), estimating the time constant (B) on a subset of the sample and using this estimate to calculate predicted treatment outcomes (C) for the left-out individual based on symptom rating scores at baseline and after five or ten TMS sessions.
Similar NLME models were constructed using the exponential decay function in Eq. 1 for three published group-level studies comparing various TMS protocols [9,26,27] with parameters A and C treated as random effects at the treatment group level. As above, these simpler models provided equivalent fits when compared to models using all parameters as random effects. These NLME models were subsequently compared to corresponding LME models in which the slope and intercept were treated as random effects at the group level using AIC, BIC, and LRT.
The exponential decay model was also fit at the group level using nonlinear least squares for each treatment group in the above studies [9,26,27] as well as the unblinded trial of accelerated, high-dose TBS . Corresponding linear models were compared to nonlinear exponential decay models using AIC and BIC.
Finally, a NLME model was constructed using the exponential decay function for the unique group-level symptom trajectories previously identified in 388 subjects receiving either left-sided 10 Hz TMS or left-sided TBS . Here, a model incorporating all parameters (A, B, C) as random effects yielded significant LRT and lower AIC and BIC values compared with a simpler NLME using only parameters A and C as random effects. This NLME model was then compared to a corresponding LME model in which slope and intercept were considered random effects at the group level using AIC, BIC, and LRT.
All mixed effect models were fit using maximum likelihood as implemented in the Nonlinear Mixed-Effects Models library in R .
This study was performed in accordance with the Declaration of Helsinki. The procedures for gathering and analyzing de-identified clinical data were approved as quality improvement by the VA Palo Alto/Stanford Institutional Review Board . Additional group-level analyses were based on publicly available data from previously published studies [7,9,10,26,27].