In this multi-site, naturalistic and explorative study, data were collected from 13 outpatient units within the Norwegian Network of Personality Disorders (32) in the period 2010 to 2016. All units were outpatient services at the specialist mental health service level, providing treatment for a broad range of patients with significant personality problems and personality disorder (PD). The units combined psychoeducational, group and individual psychotherapy formats. Treatment approaches were mainly psychodynamic, but combinations also included body awareness, art and cognitive therapies. Specific PD approaches implemented within some units in the Network during the investigation period included mentalization-based therapy, dialectical behaviour therapy and schema-focused therapy. Treatment was usually time-limited, and most units had an upper time limit of between two and three years.
The treatment units collected patient data and the therapists’ self-report questionnaires (FWC-BV), which were registered in an anonymous central database administered by the Department for Personality Psychiatry, Oslo University Hospital.
In this study, all therapist data were anonymous and the number of participating therapists was unknown. However, some general information was available. The multidisciplinary therapist teams usually included psychiatrists, psychologists, psychiatric nurses and social workers. Most of the therapists are formally trained (for 3 to 5 years) in group analytic psychotherapy (32, 33). The Network regularly provides updated courses and conferences on PD assessment procedures and therapeutic principles (32, 34). Based on current information from 10 of the 13 units, the mean number of therapists at each unit was 10, of whom approximately 75% were female and 25% male. Mean age was 45 years. Therapists had a mean of 17 years of clinical experience, and 73% of the therapists had education in group psychotherapy. Group supervision of the therapists is traditionally an important element in treatment programs, and CT is part of the clinical discussions. The therapists filled out the FWC-BV at 6-month intervals for each patient they had in treatment (i.e., from 6 months up to 2.5 years), with a final assessment at the end of the patient’s treatment. A total of 4849 FWC-BV questionnaires were completed during the study period.
The sample consisted of 2425 adult patients. They were referred to treatment within the specialist mental health service on a regular basis, from a primary health service level. The mean age was 33 years (standard deviation [SD] = 10 years), and 76% of the patients were female. According to the guidelines given in DSM-IV (35), 71% of participants had one or more PD diagnoses and 94% had at least one symptom disorder, wherein 68% had mood disorders and 57% had anxiety disorders (see Table 1 for prevalence of PDs). The severity of PD is illustrated through different outcome measures: the Global Assessment of Functioning (GAF; APA, 1994) and the Work and Social Adjustment Scale (WSAS) (36) measure patient psychosocial functioning; the Global Severity Index (GSI) measures symptom distress and is the mean score of the Revised Symptom Checklist-90 (SCL-90-R) (37); and the Index of Interpersonal Problems (IIP) measures interpersonal problems and is the mean score of the Circumplex of Interpersonal Problems (CIP) (38). The CIP is a revised version of the Inventory of Interpersonal Problems – Circumplex (IIP-C) (39).
In the current study, the mean GAF score was 49.77 (SD = 6.06), which according to APA (35) is within the “Severe” range. The mean WSAS score was 22.60 (SD = 8.56), which according to Mataix-Cols and colleagues (40) and Pedersen and colleagues (41) is in the “Moderate” range. In addition, the GSI was 1.54 (SD = 0.66) and the IIP was 1.65 (SD = 0.52). This GSI score is in the “Moderate” to “Severe” range (37, 42, 43), and a score of 1.65 on the CIP is associated with severe interpersonal distress (42, 44). Thus, all these measures reflect a poorly functioning patient group with a high level of symptom and interpersonal distress. TA was measured using the revised short form of the Working Alliance Inventory (WAI-SR). The patients filled out the WAI-SR at the same intervals as the therapists filled out the FWC-BV (i.e., every 6 months from 6 months up to 2.5 years in their treatment period, with a final assessment at the end of the treatment).
The FWC is a self-report measure in which therapists rate their emotional responses toward a patient in a five-point response format (0–4), ranging from ‘No such feeling’ (0) to ‘Very much’ (4). The present study uses a brief version (FWC-BV) of the Feeling Word Checklist 58 (FWC-58) that includes 12 items. In the FWC-BV, the prompt, ‘During recent conversations with the patient I have felt...’ is followed by the 12 feeling words: Disliked, Important, Threatened, Exalted, Bored, Confident, Inadequate, Admired, On Guard, Calm, Invaded and Overview. Each of the words is rated from 0 to 4 by therapists, based on how strongly they experience each feeling.
The FWC-BV is new and was constructed for this study with the aim of creating a more applicable and less time-consuming questionnaire, reflecting some positive and some negative feelings. The aim in its creation was to make it possible to determine, in future studies, whether these positive and negative feelings are important cues for describing therapy processes and outcomes. The items were selected from the FWC-58. The selection was data-driven, based on previous factor analyses of the FWC-58 in a large heterogeneous material comprising data from different in- and outpatient clinics (12, 20). We selected the 12 items with the strongest loadings in the factor structure of the FWC-58: six items with positive feelings and six items with negative feelings. The 12 items were discussed and evaluated by an experienced researcher and clinician for clinical relevance in the present study.
All patients were diagnosed according to DSM-IV (35) using the Mini International Neuropsychiatric Interview (MINI) (45) for symptom disorders and the Structured Clinical Interview for DSM-IV – Axis II (SCID-II) for PD (46). Diagnostic reliability was not investigated. However, diagnostic assessments were performed in each unit by clinical staff who had received systematic training in diagnostic interviews and principles of the Longitudinal, Expert, All-Data (LEAD) procedure (47, 48). This means that diagnoses were based on all available information, including referral letters, self-reported history, complaints, overall clinical impression and the results of the two diagnostic interviews (i.e., the MINI and SCID-II). In DSM-IV, the classification of PD is polythetic—that is, the criteria within each disorder are neither necessary nor sufficient. The number of fulfilled PD criteria can thus be seen as a reflection of the dimensional strength or closeness to prototypic PD constructs.
The patients filled in the WAI-SR (49, 50) every six months during treatment and at discharge from treatment. The WAI-SR is a 12-item questionnaire representing three different aspects of the patient’s relationship to the therapist: bond, task and goal. Patients are asked to rate each item on a Likert scale from ‘Never’ (1) to ‘Always’ (7). The patients filled out two versions of the WAI-SR: one with reference to their group therapist (WAI-G) and one with reference to their individual therapist (WAI-I).
The data of the current study are based on ordinary routine assessments, but it is important to note that these routines are sometimes disrupted for one reason or another. Therapists may sometimes fail to fill in the FWC-BV according to the time schedule, and administrative routines may be hampered so that patients do not receive six-month questionnaires. As such, the dataset in the current study is unbalanced. See Table 2 for an assessment of the FWC-BV and WAI-SR. To check for possible patient-therapist bias, patients who were evaluated on the FWC-BV were compared to patients not evaluated on the FWC-BV at 12 months of therapy. No significant differences were found on GAF, WSAS, GSI, CIP, or the number of fulfilled PD criteria. Thus, we found no indication of systematic bias threatening the validity of the study results.
We decided to analyse the FWC-BV after 12 months in therapy, assuming that therapy is well underway by that point. There is usually also some delay from the initial assessment period to inclusion in the treatment programme, although most patients have individual clinical contact with the unit during this waiting time. As such, there is good reason to assume that the treatment process is stabilised one year after the initial assessment.
The total sample of 2425 patients was first randomly divided into two separate sub-samples. This was done to facilitate the exploratory factor analysis (EFA) and the confirmatory factor analysis (CFA). The first sub-sample (n = 1219) was used to conduct the EFA and the second (n = 1206) to cross-validate the suggested factor structure in a CFA. After 1 year of therapy, the number of available completed FWC-BV questionnaires was 869. With respect to the initial factor analysis, sub-sample 1 (for EFA) comprised 439 FWC-BV questionnaires and sub-sample 2 (for CFA) comprised 430 FWC-BV questionnaires. All other analyses are based on the total sample of 2425 patients.
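As an illustration of why an approximate 50% random selection yields sub-samples of slightly unequal size (here 1219 and 1206), the following minimal Python sketch assigns each patient independently to one of two sub-samples. It is not the SPSS procedure used in the study; the function name `split_sample`, the seed and the patient identifiers are hypothetical and introduced only for illustration.

```python
import random


def split_sample(patient_ids, p=0.5, seed=2017):
    """Assign each patient independently to sub-sample 1 with probability p."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    first, second = [], []
    for pid in patient_ids:
        if rng.random() < p:
            first.append(pid)
        else:
            second.append(pid)
    return first, second


# Hypothetical patient identifiers 1..2425 (the real database keys are not given).
efa_ids, cfa_ids = split_sample(range(1, 2426))
print(len(efa_ids), len(cfa_ids))  # two roughly, but not exactly, equal halves
```

Because every patient is assigned independently, the two halves are disjoint and together cover the whole sample, but their sizes vary from one random draw to the next.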
Randomisation of the total sample was done in IBM SPSS Statistics for Windows, Version 25 (2017), using the Select function (approximately 50%). Relationships between variables were estimated by the Pearson product-moment correlation, and scale reliability was estimated by McDonald’s omega (ωt) (51, 52). EFA and CFA were conducted using Mplus 7.11 (Muthén & Muthén, Los Angeles, CA, USA) (53), with estimation based on the maximum likelihood (ML) and the mean-adjusted maximum likelihood (MLM) estimators, respectively. The mean-adjusted chi-square test statistic, also referred to as the Satorra–Bentler chi-square (54), is robust to non-normality.
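To make the reliability coefficient concrete, the sketch below computes omega for the simplest case: a single-factor model with standardised loadings, where each item's unique variance is one minus its squared loading. This is an assumption made for illustration; the study's own omega estimates may rest on a different model, and the function name `omega_single_factor` and the example loadings are hypothetical.

```python
def omega_single_factor(loadings):
    """Omega for one factor with standardised loadings (illustrative only)."""
    if not loadings:
        raise ValueError("at least one loading is required")
    common = sum(loadings) ** 2                         # variance due to the factor
    unique = sum(1.0 - lam * lam for lam in loadings)   # summed unique variances
    return common / (common + unique)


# Hypothetical loadings for a four-item scale, not taken from the study.
print(round(omega_single_factor([0.7, 0.7, 0.7, 0.7]), 3))
```

Under these assumptions, omega is the share of total scale variance attributable to the common factor, so higher and more uniform loadings give values closer to 1.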
To evaluate the CFA models, goodness of fit was estimated by the root mean square error of approximation (RMSEA) (55), the non-normed fit index (NNFI) (56), also called the Tucker–Lewis Index (TLI) (57), the comparative fit index (CFI) (58) and the standardised root mean square residual (SRMR) (59).
An RMSEA of 0.05 or below indicates a good model fit, values between 0.05 and 0.08 indicate a reasonable fit, values between 0.08 and 0.10 indicate a mediocre fit and values above 0.10 indicate an unacceptable fit (60). However, a cut-off value close to 0.06 (59) or a stringent upper limit of 0.07 (61) seems to be the general consensus of what is considered acceptable. The TLI and CFI both measure model fit in comparison to the independence model. Both are derived from the chi-square statistic and are supposed to lie between 0 and 1. Values greater than 0.90 for these measures are normally required for good fit of a model, although Hu and Bentler (59) have suggested TLI ≥ 0.95 as the threshold. The SRMR is the mean absolute value of the covariance residuals, and it ranges from 0 to 1. Well-fitting models should obtain values less than 0.05 (62, 63), but values up to 0.08 are acceptable (59).
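The chi-square-based indices described above can be sketched with their commonly used textbook formulas. The Python code below is an illustration under stated assumptions, not the computation performed by Mplus: it uses the N − 1 convention for RMSEA (software packages differ on this), takes unscaled chi-square values as input, and all function names and example numbers are hypothetical.

```python
import math


def rmsea(chi2, df, n):
    """RMSEA from the model chi-square, its degrees of freedom and sample size."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def cfi(chi2_model, df_model, chi2_base, df_base):
    """Comparative fit index relative to the independence (baseline) model."""
    d_model = max(chi2_model - df_model, 0.0)
    d_base = max(chi2_base - df_base, d_model)
    return 1.0 if d_base == 0 else 1.0 - d_model / d_base


def tli(chi2_model, df_model, chi2_base, df_base):
    """Tucker-Lewis index; being non-normed, it can fall outside 0 to 1."""
    ratio_base = chi2_base / df_base
    ratio_model = chi2_model / df_model
    return (ratio_base - ratio_model) / (ratio_base - 1.0)


def describe_rmsea(value):
    """Verbal label following the RMSEA ranges given in the text (60)."""
    if value <= 0.05:
        return "good"
    if value <= 0.08:
        return "reasonable"
    if value <= 0.10:
        return "mediocre"
    return "unacceptable"


# Hypothetical values, chosen only to exercise the formulas.
value = rmsea(chi2=100.0, df=50, n=431)
print(round(value, 3), describe_rmsea(value))
print(round(cfi(100.0, 50, 1050.0, 66), 3), round(tli(100.0, 50, 1050.0, 66), 3))
```

With a robust estimator such as MLM, the scaled chi-square and its correction factor enter these formulas, so values reported by the software can differ from this simplified sketch.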