Study design
We performed a longitudinal RCT measuring the effectiveness of a three-component multidisciplinary 1-year LS in women with PCOS and overweight or obesity. This study was approved by the Medical Research Ethics Committee of the Erasmus MC in Rotterdam (reference number MEC 2008-337) and registered in the Dutch trial register (reference number NTR2450). The current study on eating behavior represents an analysis of a secondary outcome. The results of the primary outcome and the design of the intervention have been described previously (34, 35).
Participants
We conducted this randomized controlled trial at the Division of Reproductive Endocrinology and Infertility, Department of Obstetrics and Gynecology of the Erasmus MC, Rotterdam, the Netherlands. Women were eligible if they were diagnosed with PCOS according to the Rotterdam 2003 consensus criteria, had a BMI above 25 kg/m², were between 18 and 38 years old and wished to become pregnant. Women were not eligible if they had an inadequate command of the Dutch language, severe mental illness, obesity with another somatic cause, ovarian tumors leading to androgen excess, adrenal diseases or other malformations of the internal genitalia, or if they were pregnant.
At baseline and at 3, 6, 9 and 12 months, all participants attended the outpatient clinic for standardized screening, at which all outcome measures were assessed. This screening included a family and reproductive history and anthropometric and ultrasonographic assessments. Participants also completed the DEBQ, EDEQ and BDI-II questionnaires at all these time points.
Lifestyle intervention (LS)
The lifestyle treatment aimed at 1) changing cognitions by cognitive behavioral therapy (CBT); 2) developing healthy dietary habits; 3) encouraging and promoting daily physical activity; and 4) activating social support. The intervention consisted of 20 group sessions of 2.5 hours carried out by a multidisciplinary team. The first 1.5 hours of every group session were supervised by a basic psychologist/CBT trainer and a dietician. The last hour of each session was supervised by two physical therapists. The Dutch Food Guide was used as a guideline for a healthy diet and for daily amounts of the different food groups (36). Participants were advised to make small changes in their daily life according to this guideline. No caloric restriction was advised. More information about which CBT techniques were used at each session and about the daily amounts according to the Dutch Food Guide is given in the study protocol (34). Drop-out is a well-known problem in lifestyle programs; we therefore used an outreach approach to motivate participants to come to the group meetings, unless the participant had indicated a wish to withdraw from the study. Participants who missed a group meeting were called or emailed several times to motivate them to come to the next meeting.
Lifestyle intervention with additional Short Message Service (LS with SMS)
After 3 months of LS, half of the participants in the LS received additional support by tailored SMS via their mobile phone. Participants sent weekly self-monitored information regarding their diet, physical activity and emotions by SMS to the psychologist. Participants received feedback on their messages to provide social support, encourage positive behavior and empower behavioral strategies. In addition, participants received two messages per week addressing eating behavior.
Care as usual (CAU, control group)
The CAU group had 4 short, unstructured consultations with their treating physician during the standardized screenings at our outpatient clinic at 3, 6, 9 and 12 months. Participants in the CAU group were encouraged to lose weight through publicly available services (e.g. diets, visiting a dietician, going to the gym or participating in public programs such as Weight Watchers®). The physician also mentioned the risks of overweight for both mother and child, and the relation between overweight and fertility.
Randomization
Participants were assigned to one of three groups: 1) 20 CBT lifestyle group sessions including 9 months of electronic feedback through Short Message Service (SMS) via their mobile phone (LS with SMS); 2) 20 CBT lifestyle group sessions (LS without SMS); or 3) the control group, which received usual care (CAU). Written informed consent was obtained from all participants before the study. At baseline, a research nurse randomized participants at a 1:1:1 ratio using a computer-generated random numbers table.
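As an illustration only, a 1:1:1 allocation sequence can be generated from computer-generated random numbers as sketched below. The use of shuffled blocks, the block size, the seed and the function name are assumptions made for this example; the allocation procedure actually used in the trial is described only as a computer-generated random numbers table.

```python
import random

# Arm labels as used in this study.
ARMS = ("LS with SMS", "LS without SMS", "CAU")


def allocation_sequence(n_participants, seed=0, block_size=3):
    """Return a 1:1:1 allocation list built from shuffled blocks.

    Hypothetical helper: blocking and the seed are illustrative assumptions.
    """
    if block_size % len(ARMS) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)  # a fixed seed makes the sequence reproducible
    sequence = []
    while len(sequence) < n_participants:
        # Each block contains every arm equally often, in random order.
        block = list(ARMS) * (block_size // len(ARMS))
        rng.shuffle(block)
        sequence.extend(block)
    return sequence[:n_participants]


if __name__ == "__main__":
    for number, arm in enumerate(allocation_sequence(9), start=1):
        print(number, arm)
```

With a block size equal to the number of arms, the three groups are balanced after every third participant.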
Outcomes
The DEBQ (15) was used to assess eating in response to diffuse emotions (diffuse), eating in response to clearly labeled emotions (emotional eating), eating in response to the sight or smell of food (external eating), and eating less than desired to lose or maintain body weight (dietary restraint). This questionnaire consists of 33 items measuring 4 subscales. The subscale scores range between 1 and 5, with a higher score reflecting a higher degree of the relevant eating behavior.
The EDEQ (37, 38) was used to measure specific eating disorders. This questionnaire consists of 36 items measuring 4 subscales (restraint, shape concerns, weight concerns and eating concerns) and a global score. The subscale scores range between 0 and 6. A higher score indicates more severe eating psychopathology. A global score or subscale score of 4 or higher is considered clinically significant. In women with PCOS, a mean EDEQ score of 2.38 has been reported compared to 1.29 in the general population (28).
The Beck Depression Inventory (BDI-II) is a validated and widely used questionnaire in depression trials assessing the severity of depressive symptoms over the previous 2 weeks, according to the DSM-5 criteria. The BDI-II is a 21-item self-report questionnaire; items are rated on a 4-point scale (0–3) and summed to give a total score (range 0–63). A higher score on the BDI-II denotes more severe depression. Scores of 0–13 indicate minimal depression, 14–19 mild depression, 20–28 moderate depression and 29–63 severe depression (39).
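The BDI-II scoring rule above can be sketched in a few lines of Python. The function names are hypothetical and the input validation is an assumption added for this example; the cut-offs are those listed in the text.

```python
def bdi_ii_total(item_scores):
    """Sum 21 item scores, each rated 0-3, into a total score (0-63)."""
    if len(item_scores) != 21:
        raise ValueError("the BDI-II has 21 items")
    if any(score not in (0, 1, 2, 3) for score in item_scores):
        raise ValueError("each item is rated on a 0-3 scale")
    return sum(item_scores)


def bdi_ii_severity(total):
    """Map a total score to the severity category given in the text."""
    if not 0 <= total <= 63:
        raise ValueError("total score must lie between 0 and 63")
    if total <= 13:
        return "minimal"
    if total <= 19:
        return "mild"
    if total <= 28:
        return "moderate"
    return "severe"


if __name__ == "__main__":
    # Made-up responses: 15 items scored 1 and 6 items scored 0.
    responses = [1] * 15 + [0] * 6
    total = bdi_ii_total(responses)
    print(total, bdi_ii_severity(total))
```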
Statistical considerations
The power calculation was based on the primary outcome of the LS intervention: weight (kg). The method described by Aberson (25) was applied, with a power of 0.90, a 2-sided alpha of 0.025 (corrected for the interim analysis as described in the study protocol) and 5 repeated measures linearly decreasing. All variables were analyzed based on the intention-to-treat population, defined as all allocated participants. Multilevel or mixed regression modeling was applied for longitudinal outcomes. Mixed modeling can efficiently deal with missing data and unbalanced time-points (40, 41). This means that participants without complete follow-up could also be included in the analyses, without imputation. Study group, linear and logarithmic time and interactions were included as independent variables. The deviance statistic (42) using restricted maximum likelihood (43) was applied to determine the covariance structure, thereby taking into account situations in which, for example, the deviation at baseline differs from the deviations at follow-up. In the case of a non-normal distribution, a bootstrap procedure with 10,000 samples was performed to obtain a more reliable outcome. The bootstrap mixed model analyses were performed using IBM SPSS Statistics for Windows, Version 25.0 (IBM Corp., Armonk, NY; released 2017).
To test whether weight, depression, androgens, insulin, the homeostatic model assessment for insulin resistance (HOMA-IR) and cortisol mediated the effect of LS on eating behavior, we used multilevel longitudinal mediation or indirect effect analyses. Paths a, b, τ and τ′ were estimated employing multilevel regression analyses. Firstly, we determined whether paths b were significant. When path b was not significant, mediation was considered unlikely. We adjusted the Sobel-Goodman test for the indirect effect of the independent variable on the dependent variable as reported by MacKinnon and Dwyer (44) following the recommendations by Krull and MacKinnon (45) for multilevel mediation analyses. The significance of the mediated effect is given by:

(46).
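For orientation, the standard first-order Sobel statistic divides the product of paths a and b by its approximate standard error, z = ab / sqrt(b²·SE_a² + a²·SE_b²). The sketch below computes this unadjusted form with a two-sided p-value from the normal distribution. It is an assumption that this resembles the adjusted test used in the present analyses. The function names and input values are hypothetical.

```python
import math


def sobel_z(a, se_a, b, se_b):
    """First-order Sobel z statistic for the indirect effect a*b.

    This is the standard unadjusted form, not necessarily the
    adjusted version applied in this study.
    """
    standard_error = math.sqrt(b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2)
    if standard_error == 0:
        raise ValueError("standard error of the indirect effect is zero")
    return (a * b) / standard_error


def two_sided_p(z):
    """Two-sided p-value of z under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


if __name__ == "__main__":
    # Made-up path estimates and standard errors, for illustration only.
    z = sobel_z(a=0.5, se_a=0.1, b=0.4, se_b=0.1)
    print(round(z, 3), two_sided_p(z))
```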
Cohen's d effect sizes were calculated by dividing the differences between time-point and baseline estimations by the estimated baseline standard deviation. The guidelines of Cohen were used: effect sizes of 0.20 were considered as small, 0.50 as medium and 0.80 as large (47). p-values < 0.05 were considered significant.
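The effect size calculation described above can be written out as follows. This is a minimal sketch: the function names and example values are hypothetical, and treating 0.20, 0.50 and 0.80 as lower bounds of the small, medium and large categories is an assumption about how the guideline values are applied.

```python
def cohens_d(timepoint_estimate, baseline_estimate, baseline_sd):
    """Difference from baseline divided by the estimated baseline SD."""
    if baseline_sd <= 0:
        raise ValueError("baseline_sd must be positive")
    return (timepoint_estimate - baseline_estimate) / baseline_sd


def effect_size_label(d):
    """Label the magnitude of d using Cohen's guideline values."""
    size = abs(d)  # the sign only indicates the direction of change
    if size >= 0.80:
        return "large"
    if size >= 0.50:
        return "medium"
    if size >= 0.20:
        return "small"
    return "below small"


if __name__ == "__main__":
    # Made-up estimates: a subscale score falling from 3.0 to 2.5, SD 1.0.
    d = cohens_d(2.5, 3.0, 1.0)
    print(d, effect_size_label(d))
```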