Our study was a two-arm, superiority, assessor-blinded, cluster randomized trial. The study lasted for 48 weeks, with an intervention time of 0 to 24 weeks and a follow-up period of 24 to 48 weeks. Participant characteristics were collected at baseline only. Exercise adherence of participants was collected at 4, 12, 24, 36, and 48 weeks. Secondary outcomes (KOA symptoms and knee function) were collected at 0, 24, and 48 weeks.
To avoid contamination within a community, randomization was performed at the community level instead of at the individual level. An independent researcher used the random number function in Excel to generate the randomization sequence. Study staff opened opaque envelopes with random numbers to obtain the community allocation.
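Cluster-level allocation of this kind can be illustrated with a minimal sketch. The trial itself used the random number function in Excel; the Python version below is only an analogue, and the community labels, the seed, and the equal split between arms are assumptions made for illustration.

```python
import random


def allocate_clusters(cluster_ids, seed):
    """Randomly split clusters into two arms of equal size (assumed design)."""
    rng = random.Random(seed)      # seeded so the sequence is reproducible
    shuffled = list(cluster_ids)
    rng.shuffle(shuffled)          # random order of whole communities
    half = len(shuffled) // 2
    return {
        "intervention": sorted(shuffled[:half]),
        "control": sorted(shuffled[half:]),
    }


# Hypothetical labels for the 14 community centers.
communities = [f"community_{i:02d}" for i in range(1, 15)]
allocation = allocate_clusters(communities, seed=2018)
```

Because the unit of randomization is the community, every participant recruited from a community inherits that community's arm.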
Participants signed the informed consent and were informed of their assigned group and specific exercise intervention strategies. Therefore, participants were not blinded to the allocation of groups. Moreover, study staff were unmasked to the allocation of participants after community recruitment due to the differences in the exercise intervention programs. However, the assessor and the statistician were masked to the allocation of the participants.
Sample and setting
Community-dwelling older adults with KOA were recruited from 14 community centers in Beijing via print and social media advertisements from April to October 2018. The inclusion criteria were as follows: age ≥60 years, had experienced knee pain on most days within the past month, scored their average knee pain over the past week between 3 and 7 on an 11-point numeric rating scale, and showed intact cognitive functioning, as indicated by a score of 8–10 on the 0–10 point Short Portable Mental Status Questionnaire. The exclusion criteria were as follows: had undergone either a joint replacement or arthroscopic surgery on the affected side of the knee; had other lower-limb surgery within the past six months; showed evidence of severe deformity of the lower limbs (e.g., knee varus or valgus); exhibited other health issues that could induce adverse events during home exercise (e.g., uncontrolled high blood pressure, myocardial infarction, cerebral infarction, unstable angina, arrhythmia, severe vision problems, or neurological dysfunction); or had other regular exercise habits (at least three days a week of no less than 30 min of exercise per day).
General stage (weeks 0–2)
Participants in the intervention group entered the general stage after their baseline data were collected. The goals of this period for participants were to (i) correctly learn to perform home exercise; (ii) fully understand the basic knowledge of KOA and the benefits of exercise; (iii) advance from a stage of pre-action (pre-contemplation, contemplation, and preparation) to a stage of action. Participants attended three two-hour group activities carried out by physiotherapists over two weeks. Each activity included an hour for group health education and another hour for exercise. The educational materials distributed to the participants included home exercise manuals and a printed version of the health education slides.
The exercise program was created based on a literature review, clinical practice, and expert consultation. It had previously been shown to improve both symptoms and function in older adults with KOA. It involved ten movements, and participants were advised to practice for 30–40 minutes per day on at least three days per week [Additional file 1].
Group health education was conducted by physiotherapists and was designed to increase participants’ awareness of exercise by explaining and discussing the severity of KOA and the benefits of exercise. It comprised three parts: 1) clinical signs, risk factors, treatment, and nursing care for KOA; 2) the advantages and principles of exercise; and 3) routine daily care for KOA.
Stage-specific period (weeks 3–24)
In the stage-specific period, participants of each community were divided into two subgroups: the pre-action stage subgroup and the action stage subgroup. Each subgroup had different intervention goals, and group activities were conducted separately. During this period, six group activities were held at weeks 4, 8, 12, 16, 20, and 24 (i.e., every four weeks), and each activity lasted about 2 h. The participants were required to participate in all six group activities. If a participant missed a group activity, the missed content was covered for that participant during the next group activity. Prior to every group activity, participants were re-assessed regarding their stage of change via phone or WeChat by research assistants and assigned to different subgroups, as necessary. The stage of change of the participants was assessed by the Questionnaire for Stage of Exercise Change. This 5-point scale, developed by Marcus et al., places individuals in one of the following stages of change: pre-contemplation, contemplation, preparation, action, or maintenance. Therefore, members of each subgroup were assigned or reassigned based on the participants’ exercise conditions over the past four weeks, rather than being fixed into subgroups. Physiotherapists delivered TTM-based stage-matched interventions to the participants in each subgroup. Our study included a total of five physiotherapists who were the main interveners. They were responsible for exercise guidance and TTM interventions. They each had ≥5 years of musculoskeletal clinical experience and were given at least 2 h of training on home-based exercise programs. They also completed 6 h of training on intervention strategies and techniques based on TTM. A description of the core objectives, TTM-based strategies (mainly based on the ten processes of change), and the recommended form of interventions at each stage is shown in [Additional file 2].
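The reassignment rule described above can be sketched as a simple mapping from the assessed stage of change to a subgroup. This is a minimal illustration, not the study's procedure: the function name is hypothetical, and placing the maintenance stage in the action subgroup is an assumption, since the text only defines the pre-action stages explicitly.

```python
# Pre-action stages as defined in the general stage description.
PRE_ACTION_STAGES = {"pre-contemplation", "contemplation", "preparation"}
# Assumption: maintenance is grouped with action.
ACTION_STAGES = {"action", "maintenance"}


def assign_subgroup(stage):
    """Map an assessed stage of change to a subgroup label."""
    normalized = stage.strip().lower()
    if normalized in PRE_ACTION_STAGES:
        return "pre-action"
    if normalized in ACTION_STAGES:
        return "action"
    raise ValueError(f"unknown stage of change: {stage!r}")


# Re-assessment before each group activity may move a participant.
week4 = assign_subgroup("Preparation")
week8 = assign_subgroup("Action")
```

Running the mapping before each of the six group activities reproduces the idea that subgroup membership follows the participant's most recent stage rather than being fixed.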
At weeks 4 and 12, physiotherapists conducted two review sessions to ensure that participants continued to correctly perform home-exercises.
Participants in the control group received usual exercise guidance without any exercise adherence interventions. At baseline and at weeks 1 and 2, physiotherapists carried out a total of three home-exercise guidance sessions to ensure that participants were able to exercise at home correctly and safely, and to teach exercise precautions. The exercise review sessions at weeks 4 and 12 and the assessments were the same as in the intervention group. The content of the exercise guidance and the prescribed exercise type and intensity were exactly the same as in the intervention group.
All study outcomes were collected during group activities at the corresponding time points. Participant characteristics (at baseline), exercise adherence (at weeks 4, 12, 24, 36, and 48), and KOA symptoms (pain intensity and joint stiffness at baseline and at weeks 24 and 48) were collected through paper questionnaires. Knee function (lower limb muscle strength and balance at baseline and at weeks 24 and 48) was assessed using knee function tests.
Baseline participant characteristics were obtained using a demographics questionnaire developed specifically for this study. The questionnaire included questions on age, sex, height and weight, marital status, educational level, occupation before retirement, residence, disease duration, comorbidities, and current drug use.
Primary outcome measure
The primary outcome of the present study was exercise adherence of the participants. Exercise adherence was measured using an 11-point numeric rating scale (from 0, indicating "not at all", to 10, indicating "completely as instructed") at weeks 4, 12, 24, 36, and 48. The scale contains only one item: “Please rate your exercise adherence according to your performance with respect to the number of times of practice, quality of actions, and duration of each practice in the recent period.” To rate his/her exercise adherence as 10 points, a participant was required to exercise 3–5 times a week for at least 30 min each time. The scale’s intra-class correlation coefficient was 0.77 when assessing exercise adherence among other populations with musculoskeletal disorders, indicating acceptable reliability.
Secondary outcome measures
The secondary outcomes of our study were KOA symptoms (pain intensity and joint stiffness) and knee function (lower limb muscle strength and balance), which were collected at baseline and at weeks 24 and 48.
KOA-related pain intensity and joint stiffness were measured by the Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC). It includes seven items related to pain and joint stiffness rated on a 0–4 Likert scale, where higher scores indicate greater pain and stiffness. The internal reliability of the Chinese version of the WOMAC, as measured by Cronbach’s α, is 0.67–0.82 across its two subscales. In addition, its test-retest reliability, based on the intra-class correlation coefficient, is 0.82–0.88 for its two subscales.
The adjusted total scores of pain intensity and joint stiffness ranged from 0 to 100, which were calculated from the raw ratings of the total scores as follows:
$$\text{Adjusted Score (AS)} = \frac{\text{Raw Rating (RR)}}{\text{total score}} \times 100$$
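The rescaling above can be sketched in a few lines of Python. This is a minimal illustration under two assumptions: that "the total scores" in the denominator means the maximum possible raw score of the subscale, and that the seven items split into five pain items and two stiffness items; the function name and the example ratings are hypothetical.

```python
MAX_ITEM_SCORE = 4  # each item is rated on a 0-4 Likert scale


def adjusted_score(item_ratings):
    """Rescale a subscale's raw rating to the 0-100 range."""
    if not item_ratings:
        raise ValueError("at least one item rating is required")
    if any(not 0 <= r <= MAX_ITEM_SCORE for r in item_ratings):
        raise ValueError("item ratings must lie between 0 and 4")
    raw_rating = sum(item_ratings)
    # Assumption: the denominator is the maximum possible subscale total.
    max_total = MAX_ITEM_SCORE * len(item_ratings)
    return raw_rating / max_total * 100


pain = adjusted_score([2, 1, 3, 2, 2])   # five pain items (assumed split)
stiffness = adjusted_score([1, 2])       # two stiffness items (assumed split)
```

With these example ratings the pain subscale has a raw rating of 10 out of 20 and the stiffness subscale 3 out of 8, so the two subscales become directly comparable on the same 0 to 100 range.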
The muscle strength of the lower limbs was determined by the Five-Times-Sit-to-Stand Test (FTSST), which requires participants to rise from a chair and return to a seated position five times as quickly as possible with their arms folded across their chests. Participants completed this test twice, with a 1 min rest period between trials, and the mean value of the two trials was used. Participants’ balance was measured via the Timed Up and Go Test (TUG), which measures the time it takes participants to rise from a standard height chair, walk 3 m, turn around, return to the chair, and sit down.
Community nurses recruited older adults diagnosed with KOA from 14 community centers. Doctors screened the participants according to the inclusion and exclusion criteria to determine the eligibility for participation. Next, participants signed the informed consent forms and completed the baseline assessments. Data were collected by three assessors and the assessors were blinded to the group assignments.
Ethical approval was obtained from the Peking University Biomedical Ethics Committee (IRB00001052-17066) in July 2017. All participants voluntarily participated, and could withdraw at any time without negative consequences. Each participant completed written informed consent. The data collected were anonymized and kept confidential and were used exclusively for the present study.
We used a two-sample t-test power analysis for the sample size calculation. The primary outcome was the difference in the exercise adherence score between the intervention group and the control group at 24 weeks. The mean difference between the groups (1.1) and the SD (2) were based on the results of our pilot study and a search of relevant exercise intervention studies. The power analysis was carried out with α = 0.05 and β = 0.2, with the intervention and control groups having the same sample size. According to the Power Analysis and Sample Size software (PASS 2008, NCSS Corporation), 50 participants were required per group. Because the design was a cluster randomized controlled trial (RCT), the correlation among individuals within a community was taken into account using the formula N = [1 + (m − 1)ρ]n, where N is the sample size of the cluster RCT, n is the sample size of the individual RCT, m is the expected number of individuals per community, and ρ is the intra-cluster correlation coefficient. In this study, m = 15 was expected. Based on the literature [43, 44], we set ρ = 0.03, which gave N = 142; taking into account a probable 15% loss to follow-up, the total sample size was calculated as 168 cases, with 84 cases in each group.
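The cluster adjustment can be reproduced with a short calculation. The sketch below starts from the reported individual-level requirement of 50 per group (n = 100) rather than recomputing the PASS result. The function names are hypothetical, and dividing by (1 − dropout rate) is an assumption about how the 15% loss was applied, chosen because it matches the reported totals.

```python
import math


def design_effect(m, rho):
    """Inflation factor 1 + (m - 1) * rho for cluster randomization."""
    return 1 + (m - 1) * rho


def cluster_sample_size(n_individual, m, rho):
    """N = [1 + (m - 1) * rho] * n, rounded up to whole participants."""
    # round() guards against floating-point noise before taking the ceiling
    return math.ceil(round(design_effect(m, rho) * n_individual, 9))


def inflate_for_dropout(n_total, dropout):
    """Assumed rule: per-group size divided by (1 - dropout), rounded up."""
    per_group = math.ceil(round(n_total / 2 / (1 - dropout), 9))
    return 2 * per_group


n_individual = 2 * 50                                      # 50 per group from PASS
n_cluster = cluster_sample_size(n_individual, m=15, rho=0.03)
n_final = inflate_for_dropout(n_cluster, dropout=0.15)
```

Here the design effect is 1 + 14 × 0.03 = 1.42, so 100 becomes 142, and 71 per group divided by 0.85 rounds up to 84 per group, or 168 in total.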
We used the intention-to-treat analysis method. Data were analyzed using SPSS version 25.0 (IBM Corporation, Armonk, NY, USA). We considered a p-value of ≤0.05 (two-sided) to indicate statistical significance. Descriptive statistics such as means and SDs, medians, interquartile ranges (IQRs), frequencies, and percentages were calculated to describe demographic and disease characteristics and outcome scores. Inferential statistics, including the independent t-test and repeated measures ANOVA, were also used to analyze the data.
The repeated measures ANOVA was implemented with a general linear model. We used Mauchly’s test of sphericity to check whether the data met the assumptions of repeated measures ANOVA. When the data did not satisfy the sphericity assumption (i.e., p < 0.05), the epsilon correction coefficient was used to correct the degrees of freedom. When the repeated measures ANOVA showed no group × time interaction, the group main effect was used to judge the difference between groups. A group × time interaction meant that the two groups followed different trends over time; in that case, the difference between groups could not be judged by repeated measures ANOVA, so we used an independent t-test to compare the groups at each time point. In addition, we used one-way repeated measures ANOVA to test for differences among time points within each group.
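The decision rules in this paragraph can be summarized as a small sketch. It does not run any statistical test; it only encodes which analysis follows from two p-values that would come from the statistics software. The function name and return format are hypothetical, and applying the ≤0.05 threshold to the interaction test is an assumption based on the stated significance level.

```python
ALPHA = 0.05


def plan_between_group_analysis(p_mauchly, p_interaction, alpha=ALPHA):
    """Return the analysis steps implied by the sphericity and interaction tests."""
    return {
        # Sphericity violated when Mauchly's test gives p < alpha.
        "apply_epsilon_correction": p_mauchly < alpha,
        # Assumption: the interaction is judged at p <= alpha.
        "between_group_test": (
            "independent t-test at each time point"
            if p_interaction <= alpha
            else "group main effect from repeated measures ANOVA"
        ),
    }


# Hypothetical p-values, for illustration only.
plan = plan_between_group_analysis(p_mauchly=0.01, p_interaction=0.20)
```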
In addition, for the primary outcome of exercise adherence, we combined repeated measures ANOVA with a latent growth model (LGM) for a comprehensive analysis. Repeated measures ANOVA was used to analyze changes and differences in mean exercise adherence between the two groups during the intervention and follow-up. The LGM was used to further analyze the rate of change in exercise adherence over time and to quantify the difference in growth rate between the two groups while accounting for interindividual variation, thereby better elucidating longitudinal stability and change and evaluating the efficacy of TTM-HEI. This is because an effective intervention program can not only increase the population mean but also reduce interindividual variation. Thus, most participants in the intervention group could progress in a concentrated manner according to the expected stage of behavior change, thereby reducing the chance of regression and stagnation. The specific method is as follows:
An LGM has two latent variables, the intercept and the slope. The intercept reflects the initial level of the variable, and its loadings are usually fixed to the same constant at every time point. The slope reflects the rate of change of the variable over time, and its loadings are commonly fixed to a series of constants representing linear or nonlinear change, or are freely estimated. In a freely estimated model, the slope loading is fixed to zero at the first time point and to one at the second or the last time point, and the remaining loadings are estimated. In an LGM, the variability of the two latent variables is described by their variances or residual variances, which represent the interindividual variation in the initial level and the growth rate.
To better analyze the differences in exercise adherence between the two groups over time, we constructed the models in two steps. First, we analyzed the effect of group on the initial level (intercept) and rate of change (slope) using Model 1, with group included as a covariate (0: control group, 1: intervention group). Second, we conducted a multiple-group analysis to describe the characteristics of each group using Model 2 (control group) and Model 3 (intervention group). In all three models, the intercept loadings were fixed to one, and the slope loadings were fixed to zero at the first time point (week 4) and to one at the last time point (week 48), because the shape of the change in exercise adherence was unknown.
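The model specification described above can be written out as a worked set of equations. This is a sketch in notation introduced here for illustration (the symbols are assumptions, not the authors' own): $y_{it}$ is the adherence score of participant $i$ at occasion $t$ (weeks 4, 12, 24, 36, and 48), $\eta_{0i}$ and $\eta_{1i}$ are the latent intercept and slope, $\lambda_t$ are the slope loadings, and $G_i$ is the group indicator used in Model 1.

```latex
\begin{aligned}
y_{it}     &= 1 \cdot \eta_{0i} + \lambda_t \, \eta_{1i} + \varepsilon_{it},
             \qquad t = 1, \dots, 5, \\
\lambda_1  &= 0, \qquad \lambda_5 = 1, \qquad
             \lambda_2, \lambda_3, \lambda_4 \ \text{freely estimated}, \\
\eta_{0i}  &= \alpha_0 + \gamma_0 G_i + \zeta_{0i}, \\
\eta_{1i}  &= \alpha_1 + \gamma_1 G_i + \zeta_{1i},
             \qquad G_i \in \{0, 1\}.
\end{aligned}
```

Under this scaling the slope factor represents the total change between week 4 and week 48, and each free loading gives the proportion of that change reached by the corresponding week. In Model 1, $\gamma_0$ and $\gamma_1$ capture the group effects on the initial level and on the change, while the variances of $\zeta_{0i}$ and $\zeta_{1i}$ describe the interindividual variation. Models 2 and 3 would use the same measurement equation without the covariate, fitted separately to each group.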
The parameters of the LGM in the present study were estimated using Maximum Likelihood (ML) with 2000-replication bootstrapping to obtain stable and unbiased parameters. Model fit indices were the chi-square to degrees-of-freedom ratio (χ²/df; with values <3 and <5 indicating good and adequate fit, respectively), the Standardized Root Mean Square Residual (SRMR; with values below 0.08 indicating reasonable fit), and the Comparative Fit Index (CFI; with values ≥0.90 indicating an acceptable level of model fit). Model fit indices in all three models were all acceptable, with χ²/df = 4.001, SRMR = 0.032, and CFI = 0.966 in Model 1; χ²/df = 5.904, SRMR = 0.070, and CFI = 0.849 in Model 2; and χ²/df = 3.074, SRMR = 0.067, and CFI = 0.939 in Model 3.