Incidence, mortality and survival in children and young people aged 11-25 in Wales with co-occurring mental disorders and problem, hazardous or harmful substance use: estimates derived from linked routine data

Background: Mental disorder (MD) and problem, hazardous or harmful substance use (SUD) are associated with poorer than average health and greater mortality. We analysed routine data to estimate incidence of co-occurring (CC) MD and SUD, and to estimate all-cause mortality and survival with CC, a single MD or SUD diagnosis or neither condition (NC), in young people aged 11-25 in Wales, UK.

Methods: A retrospective population-based electronic cohort study using data from the Secure Anonymised Information Linkage (SAIL) Databank. Participants were 958,603 individuals aged 11-25 between 2008 and 2017, with a subset for mortality and survival analysis of 465,242 individuals born between 1983 and 1997 and present in the data on 1st January 2008. Incidence was defined as date of first recorded occurrence of a CC code. Incidence and observed unadjusted mortality were reported as rates per 1,000 person-years at risk (PYAR). We plotted Kaplan-Meier survival curves and carried out Cox regression to estimate hazard ratios for risk of death by condition group (CC; MD or SUD only; NC).

Results: CC incidence in primary care significantly decreased, from 2.5/1,000 PYAR (95% CI 2.3-2.6) in 2008 to 2.1/1,000 (95% CI 2.0-2.2) in 2017 (incidence rate ratio (IRR) = 0.9, 95% CI 0.8-1.0, p=0.01), and in hospital admissions remained stable, from 2.3/1,000 (95% CI 2.1-2.4) in 2008 to 2.2/1,000 (95% CI 2.0-2.3) in 2017 (IRR = 1.0, 95% CI 0.9-1.1). Higher incidence was associated with male sex, older age and greater deprivation. Observed unadjusted mortality rates for CC (1.4/1,000 PYAR, 95% CI 1.2-1.5) and SUD only (1.1/1,000, 95% CI 0.9-1.4) were significantly higher than for MD only (0.4/1,000, 95% CI 0.3-0.4)


Background
Mental disorders (MD) and problem, hazardous or harmful drug or alcohol use (SUD) together account for 7.4% of the global burden of disease and are the leading causes of years lived with disability (YLD) (1). They frequently co-occur (2); 44% of community mental health team service users report SUD, with 75% of drug service users and 85% of alcohol service users reporting one or more MD (3). MD and SUD are strongly associated with poorer than average health and greater risk of premature death (4). The terms 'co-occurring mental health and alcohol or drug use conditions' ('co-occurring conditions' or CC) and 'use' of substances are used in preference to 'Dual Diagnosis' and 'misuse', as they are more inclusive, not limited to formal diagnosis from a healthcare professional and include a range of reasons and methods for using substances (4), (5).
During the 1990s, prevalence of CC recorded in routine primary care data significantly increased (6). Recent studies have identified a complex epidemiological picture for MD and SUD in young people; incidence of anxiety and depression diagnosis is declining, but incidence of associated symptoms, and the prescription of antidepressants and anxiolytics, is increasing (7), (8), (9), (10). Survey data show an increase in emotional disorders in young people, particularly older female adolescents (8). Young people are drinking less alcohol (11), abstinence is increasing (12) and alcohol-related emergency admissions have decreased (13). However, they are reporting greater levels of drug use (14), (15), and poisoning events associated with alcohol and opioids (including prescribed opioids) have increased, particularly among females using opioids (16). Young people are at significantly increased risk of death or further emergency admission in the 10-year period following a drug or alcohol-related hospital admission (17), and SUD is a significant risk factor for progression to suicidal behaviour in young people who self-harm or express suicidal thoughts (18).
The authors are not aware of any recent studies using routine health data in the UK to examine trends and outcomes for children and young people with CC, and NICE has identified a need for research in this area (19). The aims of this study were to use routine health data from primary care, inpatient episodes and death registrations to estimate first recorded incidence of CC in children and young people aged 11-25 in Wales, UK; to estimate all-cause mortality rate and 10-year survival with CC in this population; and to compare survival and mortality for individuals with codes for CC, with a record of MD or SUD only, or with no relevant codes recorded.

Methods
Design
A retrospective population-based electronic cohort study was conducted using linked routine primary care, hospital inpatient admissions and mortality data.

Data source
The data source for this study was the Secure Anonymised Information Linkage (SAIL) Databank, a secure repository established and managed by Swansea University Medical School. It houses anonymised health and related datasets, which can be linked for research purposes (20), (21). Datasets (Table 1) were prepared within the Adolescent Mental Health Data Platform (22).
Drawing on previous studies (11), (23), (24), we compiled a list of SUD-related Read v2 codes, including diagnoses, symptoms, observations, medications, behaviours (e.g. 'injecting drug user'), referrals and contacts with other services. We included codes for alcohol or illegal drugs but excluded tobacco, in keeping with similar studies (6), (25). We included codes designating MD due to substance use, which were classified as CC without requiring the presence of a second MD or SUD code (for example Read v2 codes in section Eu%, designating 'Mental and behavioural disorders due to psychoactive substance use'); this included codes for mental and behavioural disorders due to acute intoxication, as there is an association between contact with services for acute intoxication and subsequent suicide risk (26).
We included only those prescriptions relevant to treatment for substance use, and excluded those used primarily for pain management. We included disulfiram, naltrexone, lofexidine, acamprosate and methadone, as almost all recipients had a history of SUD. For buprenorphine we included only those Read v2 codes where 10% or fewer recipients had no history of SUD. We excluded alcohol Read v2 codes requiring an associated value of units, because we could not be confident that on their own these codes denote SUD.
International Classification of Diseases (ICD-10) codes: problem, harmful or hazardous substance use and co-occurring conditions
ICD-10 codes (27) were initially identified by cross-mapping with SUD Read v2 codes. We then searched the literature to identify any additional codes (23), (24), (28), (29), (30); these were cross-mapped and added to the Read code list, to ensure consistency. As with Read v2 codes, ICD-10 codes designating MD due to substance use were classified as CC.
Read v2 and ICD-10 codes: mental disorders
MD codes were sourced from the Adolescent Mental Health Data Platform (ADP) Concept Library (22). We included codes for depression, anxiety, severe mental illness (SMI; schizophrenia, schizotypal and delusional disorders, bipolar disorder, other mood-related disorders and other severe mental illness) (31),  (34) and developmental disorders (35). Codes included both diagnoses of conditions and associated symptoms, but not prescriptions associated with these conditions.
All code lists can be found in Additional File S1.

Factors and covariates
We obtained data on factors and covariates for age, sex, and Welsh Index of Multiple Deprivation (WIMD) 2011 quintile, an area-based measure of relative deprivation in Wales (36). We divided age into four groups: 11-14, 15-17, 18-21 and 22-25 years of age (collapsed into two groups, 11-17 and 18-25, where numbers were too low to report). Age was defined at the end of each reporting year for incidence and at the start of the study window for mortality and survival. Individuals with null or contradictory indicators for sex were excluded. WIMD was derived from the 2001 census Lower Super Output Area (LSOA) in which individuals were registered at the end of each year (or next nearest available record) for incidence, and at the start of the follow-up period (or nearest available record) for mortality and survival.
Analysis methods: incidence

Individuals included
Using the Welsh Demographic Service Dataset (WDSD) as the primary population, we identified individuals having their 11th-25th birthdays between 1st January 2008 and 31st December 2017 (7), (10). We included only periods during which individuals were registered with a SAIL-supplying GP practice. For analysis of Welsh Longitudinal General Practice (WLGP) data, we excluded the first six months of each GP registration period, to minimise the designation of prevalent cases as new incident cases due to re-recording of patient history when individuals move between GP practices (7), (10). We did not apply this exclusion to the inpatient data, as there is no retrospective coding in inpatient data. The data collection start date was therefore the latest of: SAIL GP registration start date (plus six months for WLGP data); first day of 11th birthday year; or 1st January 2008. The data collection end date was the earliest of: SAIL GP registration end date; last day of 25th birthday year; date of death; or 31st December 2017. An individual could contribute more than one period of data; for example, where they had moved between SAIL and non-SAIL GP practices or migrated out of Wales and subsequently returned. The denominator for incidence was person-years at risk (PYAR), to reflect individuals present in the data for only part of a year (6), (7), (10).
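The window rule described above (latest of the start candidates, earliest of the end candidates) can be sketched as follows. This is a minimal illustration, not the study's code: the function names, the `primary_care` flag and the use of 365.25 days per person-year are assumptions made for the example.

```python
import calendar
from datetime import date

STUDY_START = date(2008, 1, 1)
STUDY_END = date(2017, 12, 31)


def add_months(d, months):
    """Return the date `months` calendar months after d, clamping the day
    to the last valid day of the target month."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def collection_window(birth_year, reg_start, reg_end, death_date=None,
                      primary_care=True):
    """Return (start, end) of one data collection period, or None if empty.

    Start is the latest of: GP registration start (plus six months for
    primary care data), first day of the 11th birthday year, study start.
    End is the earliest of: GP registration end, last day of the 25th
    birthday year, date of death, study end.
    """
    start = max(
        add_months(reg_start, 6) if primary_care else reg_start,
        date(birth_year + 11, 1, 1),
        STUDY_START,
    )
    end_candidates = [reg_end, date(birth_year + 25, 12, 31), STUDY_END]
    if death_date is not None:
        end_candidates.append(death_date)
    end = min(end_candidates)
    return (start, end) if end >= start else None


def person_years(start, end):
    """Person-years contributed by one period (inclusive of both days).
    The 365.25-day year is an assumption of this sketch."""
    return ((end - start).days + 1) / 365.25
```

An individual with several registration periods would be passed through `collection_window` once per period, and the resulting person-years summed to form the denominator.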

MD and SUD indicators
Incident cases were identified separately in primary care data (WLGP) and inpatient data (PEDW) using Read v2 and ICD-10 code lists. We excluded codes designating a history of a particular condition, as they do not distinguish between ongoing and historical conditions.

Incidence measures
First recorded incidence was defined as the date of the first occurrence in the patient history of a CC code or, in the absence of such a code, the later of the first MD or the first SUD code (the first of which could appear at any time in the patient history). An incident event was recorded only once for each individual, regardless of how many periods of data they contributed to the study population.
We plotted annual first recorded incidence rates to describe trends over time. Poisson regression, with an offset allowing for comparison of rates, was undertaken to model counts of CC incidence by year, sex, age band and WIMD quintile. The degree of over-dispersion was estimated using the quasi-Poisson method (37) and, as the data were found to be over-dispersed, standard errors from the quasi-Poisson model were used to derive 95% confidence intervals. Rates were reported as annual incidence per 1,000 PYAR and incidence rate ratios (IRR).
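The incidence definition above can be illustrated with a short sketch. It follows the wording literally: a CC code, when present, takes precedence, and otherwise the incident date is the later of the first MD and first SUD codes. Whether a CC code should still take precedence when separate MD and SUD codes were both recorded earlier is an assumption here, and the event format and function name are hypothetical.

```python
from datetime import date


def first_cc_incidence(events):
    """Return the first recorded CC incidence date for one individual.

    `events` is an iterable of (date, kind) pairs, where kind is "CC",
    "MD" or "SUD" (a hypothetical representation of coded records).
    Returns None when the individual never meets the CC definition.
    """
    first = {}
    for when, kind in events:
        # Keep the earliest date seen for each kind of code.
        if kind not in first or when < first[kind]:
            first[kind] = when
    if "CC" in first:
        return first["CC"]
    if "MD" in first and "SUD" in first:
        # Both conditions recorded separately: CC begins when the second
        # of the two first appears.
        return max(first["MD"], first["SUD"])
    return None
```

For example, a record with a first MD code in March 2009 and a first SUD code in July 2011 would yield an incident date in July 2011, and only if that date fell inside a data collection period would it count towards the numerator.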
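As a rough illustration of the reported quantities, the sketch below computes a crude rate per 1,000 PYAR and an unadjusted incidence rate ratio with Wald intervals on the log scale. It is not the regression model used in the study: it compares two groups only, the `dispersion` argument is a hypothetical stand-in for an over-dispersion estimate (quasi-Poisson standard errors are the Poisson ones multiplied by the square root of the dispersion), and any counts passed in are the caller's own.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% interval


def rate_per_1000(events, pyar):
    """Crude rate per 1,000 person-years with a log-scale Wald 95% CI."""
    rate = events / pyar * 1000
    se_log = 1 / math.sqrt(events)
    return (rate,
            rate * math.exp(-Z_95 * se_log),
            rate * math.exp(Z_95 * se_log))


def incidence_rate_ratio(events_a, pyar_a, events_b, pyar_b, dispersion=1.0):
    """Unadjusted IRR of group A versus group B with a 95% CI.

    With dispersion=1.0 this is the plain Poisson interval; a value above 1
    widens the interval in the way a quasi-Poisson model would.
    """
    irr = (events_a / pyar_a) / (events_b / pyar_b)
    se_log = math.sqrt(dispersion * (1 / events_a + 1 / events_b))
    return (irr,
            irr * math.exp(-Z_95 * se_log),
            irr * math.exp(Z_95 * se_log))
```

In a regression setting the same ratio arises as the exponentiated coefficient for a group indicator, with log(PYAR) entering as the offset.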

Analysis methods: mortality
We extracted from the incidence cohort a subset of individuals born between 1983 and 1997 and registered with a SAIL-supplying GP practice on 1st January 2008. We followed these individuals for 10 years, from 1st January 2008 to 31st December 2017. Therefore, the oldest age cohort (those born in 1983) was followed up from the year of their 25th birthday to the year of their 34th birthday, and the youngest age cohort (those born in 1997) was followed up from the year of their 11th birthday to the year of their 20th birthday. In this cohort each individual provided only one period of data; the start date of follow-up was 1st January 2008 and the end date was the earliest of death, 31st December 2017 or last date of registration with a SAIL-supplying GP practice (date of loss to follow-up).
We searched the patient record to identify MD, SUD and CC codes occurring at any time between birth and end of follow-up, including codes designating a history of a particular condition. Using the ONS Annual District Deaths Extract (ADDE) we identified individuals who had died during the study window. We compared the proportion of deaths among those with a history of CC, either SUD or MD, and neither SUD nor MD (NC). We calculated observed unadjusted mortality rates per 1,000 PYAR for each condition group, by age, sex and WIMD quintile.
We included individuals with no prior history of SUD, who died following a single episode involving use of a substance, in either the SUD or CC groups (depending on the codes in their history). We carried out a sensitivity analysis examining the impact of designating these individuals as NC.
Analysis methods: survival
Using the cohort of individuals present in SAIL on 1st January 2008, we estimated survival from the start of follow-up (1st January 2008); the outcome variable was death. The exposure variable was condition group (NC; MD only; SUD only; CC). We right-censored follow-up time at the earlier of data collection end date or end of follow-up. We plotted Kaplan-Meier survival curves, with significance of difference assessed by log-rank tests. We performed Cox regression to derive hazard ratios comparing risk of all-cause death for individuals with CC in their history with those with SUD or MD only and those with NC, adjusted for sex, WIMD quintile and age band at start of follow-up. We tested the proportional hazards assumption by plotting Schoenfeld residuals. We then repeated the analysis with condition group as a time-dependent variable (as recording of codes could occur at any time), WIMD quintile as a two-level group (60% least deprived; 40% most deprived) and age at start of follow-up as a continuous instead of a categorical variable (38).
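For readers unfamiliar with the estimator, the following is a minimal pure-Python sketch of the Kaplan-Meier product-limit calculation with right censoring. It is illustrative only and makes no claim about the software used in the study; the function name and input format are assumptions, and it omits confidence intervals, log-rank testing and the Cox model.

```python
def kaplan_meier(durations, events):
    """Return the Kaplan-Meier curve as a list of (time, survival) pairs.

    `durations` holds follow-up times; `events` holds 1 (death observed)
    or 0 (right-censored) for the same individuals. One point is emitted
    per distinct time at which at least one death occurred.
    """
    data = sorted(zip(durations, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += int(data[i][1])
            leaving += 1
            i += 1
        if deaths:
            # Product-limit step: multiply by the conditional probability
            # of surviving time t among those still at risk.
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        # Deaths and censored individuals both leave the risk set after t.
        n_at_risk -= leaving
    return curve
```

Applying the function separately to each condition group gives one curve per group, which is the form in which survival is compared in the Results.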

Results
Study populations
The association between higher incidence and increasing age was stronger for primary care than for hospital admissions, with rates in WLGP lower than in PEDW in the youngest age band but higher in the oldest; this was evident in greater IRRs in WLGP between age bands.
Higher incidence was associated with greater deprivation; the lowest incidence rates were among the least deprived quintile (WLGP = 1.1/1,000, 95% CI 1. and a smaller, non-significant increase in the least deprived quintile; in 2008, incidence was 4.5/1,000 (95% CI 4.1-4.9) in the most deprived quintile and 1.0/1,000 (95% CI 0.8-1.2) in the least deprived quintile; by 2017 incidence was 3.0/1,000 (95% CI 2.7-3.4) in the most deprived quintile and 1.4/1,000 (95% CI 1.1-1.7) in the least deprived quintile. This was not observed to the same extent in PEDW.
Of 392 deaths among the SUD only and CC groups, we identified six who died in hospital with no prior history of SUD before their final admission. Reclassifying these as NC in the analysis made no significant difference.
Survival
Figures 5-11 show plots of Kaplan-Meier survival curves with p-values derived from log-rank tests, by condition group, sex, age band and WIMD quintile. Due to risk of statistical disclosure arising from small counts, the curves for SUD only were excluded from Figs. 6-11. To further prevent statistical disclosure, age band and WIMD quintile were collapsed to two levels (11-17 and 18-25; least deprived 60%, or quintiles 1-3, and most deprived 40%, or quintiles 4 and 5).
Survival was significantly different for individuals with CC, NC or MD only, for both males and females (p < 0.0001, Fig. 5). Figures 6-11 show that survival for males was significantly lower than for females in all condition groups and in both age bands at p < 0.0001, and for 11-17 year olds with CC at p < 0.05. The group who were 18-25 at follow-up start had significantly lower survival for all conditions (all significant at p < 0.01) except females with NC, where there was no significant difference by age. Results by WIMD group were mixed; survival for both males and females with NC was significantly lower for the more deprived group (females = p < 0.05; males = p < 0.001). Differences in survival between the least and most deprived females with MD only and CC, and between the least and most deprived males with CC, were not significant; differences between the least and most deprived males with MD only were significant at p < 0.05. Figure 12 summarises the results of a Cox regression with death from all causes as the outcome. Results showed that compared to the NC group, the risk of death during the study window was significantly higher for individuals with MD only (HR = 2.7, 95% CI 2.4-3.1, p < 0.001), with SUD only (HR = 4.5, 95% CI 3.4-5.9, p < 0.001) and with CC (HR = 8.7, 95% CI 7.5-10.0, p < 0.001).

Discussion
Main findings in the context of previous studies
In keeping with previous studies we found a high degree of overlap between cases of MD and SUD (2), (1), (3), particularly for SUD in secondary care where almost 80% with SUD also had an MD, as shown in Table 2. The overlap for MD, particularly in primary care, was lower, with around 8% of those with MD also having a record of an SUD; this may reflect the large proportion of patients with MD who are managed in primary care without ever being admitted to hospital.
Incidence of CC in young people aged 11-25 between 2008 and 2017 was stable in secondary care and decreased in primary care, particularly for females and among 11-17 year olds. Significantly higher incidence was associated with male sex, older age and greater deprivation, as shown in Fig. 2, Fig. 3 and Table 4. Similar trends have been identified in studies using routine data to separately estimate incidence or prevalence of MD (7), (9), (10), (31) and SUD (11), (13). The gap in primary care incidence rates between the most and least deprived quintiles has reduced, due to a reduction in incidence in the most deprived quintile.
Observed unadjusted mortality was significantly higher among individuals with a diagnosis of CC, and to a lesser extent among those with a diagnosis of SUD or MD only, than among individuals with NC, as shown in Fig. 4. Survival was significantly lower for individuals with CC, particularly for males and those in the older age band at follow-up start, as shown in Figs. 6-11. Compared to the NC group, the hazard of death was 8.7 times as high in the CC group, 4.5 times as high in the SUD only group and 2.7 times as high in the MD only group, as shown in Fig. 12. Alcohol and drug use have been shown to commonly precede suicide (39). Our findings are consistent with previous studies suggesting individuals with a history of alcohol use disorder are at significantly increased risk of death (26), even in the absence of a co-occurring MD (40). MD (particularly with comorbid SUD) is associated with all-cause mortality rates significantly higher than those for the general population: as well as the inherent risk of death directly attributable to substance use, there may be greater medical morbidity, which is not always well recognised by service providers (41). There is a well-established association between deprivation, male sex and increased risk of death (42). Higher mortality but lower contact with services among males may indicate greater unmet need in this group, although no association can be assumed without further analysis by specific cause of death.

Strengths and limitations
This was a large-scale population study using linked routine health data comprising the records of nearly one million participants in Wales, providing a sufficiently large number of outcomes (CC cases and deaths) to support our estimations. We used the ONS ADDE to ascertain date of death, which is a near-complete record and is considered the gold standard for death records (43). Although the SAIL Databank holds records for 77% of GP practices (and 79% of the current population) in Wales, the data in SAIL are broadly representative of the Welsh population in terms of sex, age and deprivation. Routine data may vary in quality between sources, and this may affect dataset linkage; to mitigate this we used only those records where there was a sufficient level of confidence in matching quality (21).
Alcohol use disorders, particularly hazardous and harmful drinking (as opposed to dependent drinking) are under-recorded by GPs, particularly for men and younger people (44). This is also likely to be the case for illicit drug use (45), (46). Rates of recording may vary over time or between GP practices, due to experience, training, practice protocols and government policies (47). The exclusion of codes relating to consumption levels may also mean that some individuals with problematic, hazardous or harmful alcohol consumption are not detected. This means that estimated rates of SUD derived from routine primary care data should be considered as a minimum. The analysis should be interpreted as examining coding behaviour as much as clinical indicators (33).
The identification of cases within this study is limited by the availability of full patient history in the WLGP and PEDW datasets. We did not include individuals attending Emergency Departments; inclusion of this dataset would very likely increase the incidence of CC as it would include individuals not admitted to hospital and those who are reluctant to seek help from their GP. Incident cases are defined as the first recorded occurrence of a code, but we cannot be certain that these events genuinely represent the onset of a condition (48). The rates presented are therefore a measure of contacts with services (49).
We estimated mortality and survival for death from all causes, and did not consider specific causes. SUD and MD are (both individually and in combination) associated with an increased risk of death from specific causes such as suicide, as well as deaths from natural causes (39), (26), (40), (41), (17), (23), (32).
We did not include personality disorders (PD) in our definition of MD, although PD commonly co-occurs with SUD (50); this is because SUD is considered a diagnostic criterion for borderline personality disorder (51). We grouped together use of alcohol and drugs, and did not consider the impact of specific substances, the severity of usage or the impact of using specific combinations of substances. We have included SUD codes indicating varying degrees of severity; for example we included as CC all episodes with codes for mental or behavioural disorders due to psychoactive substance use, which includes episodes of acute intoxication "resulting in disturbances in level of consciousness, cognition, perception, affect or behaviour" (27).

Policy, research and practice implications
Individuals who have had contact with primary care or inpatient services related to CC (as well as those with SUD or MD only) in their patient history are at significantly increased risk of death; these contacts may offer an opportunity to identify particularly vulnerable individuals in need of specialist intervention. CC incidence rates for younger age bands were lower in primary care than in hospital admissions, which was unexpected, given that GP practices should receive and record notification of any inpatient admissions and that primary care may be the first place individuals turn to for help with SUD (47). This finding supports existing evidence of under-recording of SUD in primary care (but in this instance may relate to the recording of SUD, MD or both). There are well documented sensitivities about discussing and recording SUD in primary care (47), which may be amplified for younger patients. Survival and mortality rates were significantly poorer for individuals with CC, but were also significantly worse for individuals with SUD only, suggesting that SUD (with or without co-occurring MD) is a key risk factor, particularly for males. Alternatively this may be due to undiagnosed MD among substance users.
This study did not consider subcategories of death; however it is likely that risks of natural and unnatural death (particularly suicide) are not equal, and are affected by the presence or absence of CC. This may also be the case for risk of non-lethal self-harm among individuals with CC, which was not considered in this study. Risk may vary according to the type and combination of substance used, particularly whether both alcohol and drugs are used.

Conclusion
CC significantly increases the risk of death in children and young people aged 11-25. Incidence of CC in children and young people in Wales between 2008 and 2017 decreased in primary care and remained stable in secondary care, with significantly higher incidence associated with male sex, increasing age and greater deprivation. Mortality was significantly higher among individuals with a diagnosis of CC, and to a lesser extent among those with a diagnosis of SUD or MD only, compared with individuals with NC. The higher mortality rate for individuals with SUD (with or without mental disorder) may indicate substance use as a key risk factor, or alternatively may be indicative of unrecorded mental disorder in substance using individuals.

Figures
Figure 4 Observed unadjusted mortality rate/1,000 PYAR for deaths (all cause) - overall, by sex, age at start of follow-up and WIMD quintile
Figure 5 Kaplan-Meier survival curve - by sex, stratified by condition group
Figure 6 Kaplan-Meier survival curve - by age and condition group, stratified by sex
Figure 7 Kaplan-Meier survival curve - by sex and condition group, stratified by age
Figure 8 Kaplan-Meier survival curve - by sex and age, stratified by condition group
Figure 9 Kaplan-Meier survival curve - by WIMD group and condition group, stratified by sex
Figure 10 Kaplan-Meier survival curve - by sex and condition group, stratified by WIMD group
Figure 11 Kaplan-Meier survival curve - by WIMD group and sex, stratified by condition group
Figure 12 Cox regression - by condition group, sex, age band at start of follow-up and WIMD group
